Protocol Documentation

easy_vision/python/protos/anchor_generator.proto

AnchorGenerator

Configuration proto for the anchor generator to use in the object detection
pipeline. See core/anchor_generator.py for details.
Field Type Label Description
grid_anchor_generator GridAnchorGenerator optional
 
ssd_anchor_generator SsdAnchorGenerator optional
 
multiscale_anchor_generator MultiscaleAnchorGenerator optional
 
temporal_grid_anchor_generator TemporalGridAnchorGenerator optional
 
yolo_anchor_generator YOLOAnchorGenerator optional
 

GridAnchorGenerator

Configuration proto for GridAnchorGenerator. See
anchor_generators/grid_anchor_generator.py for details.
Field Type Label Description
height int32 optional
Anchor height in pixels. Default: 256
width int32 optional
Anchor width in pixels. Default: 256
height_stride int32 optional
Anchor stride in height dimension in pixels. Default: 16
width_stride int32 optional
Anchor stride in width dimension in pixels. Default: 16
height_offset int32 optional
Anchor height offset in pixels. Default: 0
width_offset int32 optional
Anchor width offset in pixels. Default: 0
scales float repeated
List of scales for the anchors. 
aspect_ratios float repeated
List of aspect ratios for the anchors. 
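
The fields above can be combined into anchor boxes roughly as in the following sketch. It assumes the convention of the TensorFlow Object Detection API grid anchor generator (aspect ratio read as width / height, box height = scale / sqrt(ratio) * height); whether easy_vision follows exactly the same convention is an assumption, and `grid_anchors` with its `grid_h` / `grid_w` arguments is a hypothetical helper, not part of the library.

```python
import math


def grid_anchors(grid_h, grid_w, scales, aspect_ratios,
                 height=256, width=256,
                 height_stride=16, width_stride=16,
                 height_offset=0, width_offset=0):
    """Return anchors as (ymin, xmin, ymax, xmax) tuples in pixels.

    Hypothetical helper: keyword defaults mirror the GridAnchorGenerator
    defaults listed in the table above.
    """
    anchors = []
    for gy in range(grid_h):
        for gx in range(grid_w):
            # Anchor center for this grid cell.
            cy = gy * height_stride + height_offset
            cx = gx * width_stride + width_offset
            for scale in scales:
                for ratio in aspect_ratios:
                    # Assumption: aspect ratio is width / height.
                    h = scale / math.sqrt(ratio) * height
                    w = scale * math.sqrt(ratio) * width
                    anchors.append((cy - h / 2, cx - w / 2,
                                    cy + h / 2, cx + w / 2))
    return anchors


boxes = grid_anchors(2, 2, scales=[0.5, 1.0], aspect_ratios=[1.0])
print(len(boxes))  # one anchor per cell, scale and aspect ratio
print(boxes[0])
```

Each grid cell yields len(scales) * len(aspect_ratios) anchors, so the two repeated fields together control how many boxes are predicted per location.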

MultiscaleAnchorGenerator

Configuration proto for RetinaNet anchor generator described in
https://arxiv.org/abs/1708.02002. See
anchor_generators/multiscale_grid_anchor_generator.py for details.
Field Type Label Description
min_level int32 optional
minimum level in feature pyramid Default: 2
max_level int32 optional
maximum level in feature pyramid Default: 6
anchor_scale float optional
Scale of the anchor relative to the feature stride (for example, 8 * 4 = 32). Default: 8
aspect_ratios float repeated
Aspect ratios for anchors at each grid point. 
scales_per_octave int32 optional
Number of intermediate scales in each scale octave. Default: 1
normalize_coordinates bool optional
Whether to produce anchors in normalized coordinates. Default: true
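
As a rough illustration of how these fields interact, the sketch below computes the square base anchor size for every pyramid level. It assumes the RetinaNet convention that level l has a feature stride of 2^l and that intermediate scales are spaced as 2^(i / scales_per_octave); `multiscale_anchor_sizes` is a hypothetical helper name, and the actual easy_vision implementation may differ.

```python
def multiscale_anchor_sizes(min_level=2, max_level=6, anchor_scale=8.0,
                            scales_per_octave=1):
    """Map each pyramid level to its list of base anchor side lengths (pixels)."""
    sizes = {}
    for level in range(min_level, max_level + 1):
        stride = 2 ** level  # assumed feature stride of this level
        sizes[level] = [
            anchor_scale * stride * 2 ** (i / scales_per_octave)
            for i in range(scales_per_octave)
        ]
    return sizes


# With a stride of 8 (level 3) and anchor_scale 4, the base anchor is 32 pixels.
print(multiscale_anchor_sizes(min_level=3, max_level=5, anchor_scale=4.0))
```

Under these assumptions the level-3 result appears to correspond to the "8 * 4 = 32" note in the anchor_scale description.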

SsdAnchorGenerator

Configuration proto for SSD anchor generator described in
https://arxiv.org/abs/1512.02325. See
anchor_generators/multiple_grid_anchor_generator.py for details.
Field Type Label Description
num_layers int32 optional
Number of grid layers to create anchors for. Default: 6
min_scale float optional
Scale of anchors corresponding to finest resolution. Default: 0.2
max_scale float optional
Scale of anchors corresponding to coarsest resolution Default: 0.95
scales float repeated
Can be used to override min_scale->max_scale, with an explicitly defined
set of scales.  If empty, then min_scale->max_scale is used. 
aspect_ratios float repeated
Aspect ratios for anchors at each grid point. 
interpolated_scale_aspect_ratio float optional
When this aspect ratio is greater than 0, an additional anchor
with an interpolated scale is added with this aspect ratio. Default: 1
interpolate_in_all_layers bool optional
When set to true, interpolate scale=sqrt(max_size*min_size), aspect_ratios=1.0
in all layers; otherwise the lowest layer is ignored. Default: false
reduce_boxes_in_lowest_layer bool optional
Whether to use the following aspect ratio and scale combinations for the
layer with the finest resolution: (scale=0.1, aspect_ratio=1.0),
(scale=min_scale, aspect_ratio=2.0), (scale=min_scale, aspect_ratio=0.5). Default: true
reduce_boxes_in_larger_layers bool optional
 Default: false
base_anchor_height float optional
The base anchor size in height dimension. Default: 1
base_anchor_width float optional
The base anchor size in width dimension. Default: 1
height_stride int32 repeated
Anchor stride in height dimension in pixels for each layer. The length of
this field is expected to be equal to the value of num_layers. 
width_stride int32 repeated
Anchor stride in width dimension in pixels for each layer. The length of
this field is expected to be equal to the value of num_layers. 
height_offset int32 repeated
Anchor height offset in pixels for each layer. The length of this field is
expected to be equal to the value of num_layers. 
width_offset int32 repeated
Anchor width offset in pixels for each layer. The length of this field is
expected to be equal to the value of num_layers. 
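
The relationship between num_layers, min_scale, max_scale and scales can be sketched as follows. The linear spacing is the scheme described in the SSD paper; that easy_vision uses exactly this spacing is an assumption, and `ssd_layer_scales` is a hypothetical helper.

```python
def ssd_layer_scales(num_layers=6, min_scale=0.2, max_scale=0.95, scales=None):
    """Return one anchor scale per layer, from finest to coarsest resolution."""
    if scales:
        # An explicit list overrides the min_scale -> max_scale range.
        return list(scales)
    if num_layers == 1:
        return [min_scale]
    step = (max_scale - min_scale) / (num_layers - 1)
    # Linearly spaced scales between min_scale and max_scale.
    return [min_scale + step * i for i in range(num_layers)]


print(ssd_layer_scales())
print(ssd_layer_scales(scales=[0.1, 0.3, 0.6]))
```

The per-layer repeated fields (height_stride, width_stride, height_offset, width_offset) would then be indexed by the same layer position, which is why their length is expected to equal num_layers.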

TemporalGridAnchorGenerator

Configuration proto for TemporalGridAnchorGenerator. See
anchor_generators/temporal_grid_anchor_generator.py for details.
Field Type Label Description
length int32 optional
Anchor length in pixels. Default: 8
stride int32 optional
Anchor stride in height dimension in pixels. Default: 8
offset int32 optional
Anchor height offset in pixels. Default: 0
scales float repeated
List of scales for the anchors. 

YOLOAnchorGenerator

Configuration proto for YOLOAnchorGenerator. See
anchor_generators/yolo_anchor_generator.py for details.
Field Type Label Description
anchor_group YOLOAnchorGenerator.AnchorGroup repeated
List of Anchor groups, the number of groups must be equal to
the number of feature maps 

YOLOAnchorGenerator.AnchorGroup

List of Anchors in one level feature map
Field Type Label Description
anchor_size YOLOAnchorGenerator.AnchorSize repeated
 

YOLOAnchorGenerator.AnchorSize

Anchor width and height in pixels
Field Type Label Description
width int32 required
 
height int32 required
 

easy_vision/python/protos/argmax_matcher.proto

ArgMaxMatcher

Configuration proto for ArgMaxMatcher. See
matchers/argmax_matcher.py for details.
Field Type Label Description
matched_threshold float optional
Threshold for positive matches. Default: 0.5
unmatched_threshold float optional
Threshold for negative matches. Default: 0.5
ignore_thresholds bool optional
Whether to construct ArgMaxMatcher without thresholds. Default: false
negatives_lower_than_unmatched bool optional
If True, negative matches are the ones below the unmatched_threshold,
whereas ignored matches are in between the matched and unmatched
thresholds. If False, negative matches are in between the matched
and unmatched thresholds, and everything lower than unmatched is ignored. Default: true
force_match_for_each_row bool optional
Whether to ensure each row is matched to at least one column. Default: false
use_matmul_gather bool optional
Force constructed match objects to use matrix multiplication based gather
instead of standard tf.gather Default: false
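
The threshold semantics above can be illustrated with a small pure-Python sketch. The return convention (-1 for a negative column, -2 for an ignored column) follows the TensorFlow Object Detection API and is an assumption here; `argmax_match` is a hypothetical helper, and force_match_for_each_row is not modelled.

```python
def argmax_match(similarity, matched_threshold=0.5, unmatched_threshold=0.5,
                 negatives_lower_than_unmatched=True):
    """Match each column (anchor) to its best row (ground-truth box).

    Returns one entry per column: the matched row index, -1 for a negative
    column, or -2 for an ignored column (assumed convention).
    """
    if not similarity:
        return []
    matches = []
    for col in range(len(similarity[0])):
        column = [row[col] for row in similarity]
        best_row = max(range(len(column)), key=column.__getitem__)
        best = column[best_row]
        if best >= matched_threshold:
            matches.append(best_row)  # positive match
        elif best < unmatched_threshold:
            # Below both thresholds.
            matches.append(-1 if negatives_lower_than_unmatched else -2)
        else:
            # Between the unmatched and matched thresholds.
            matches.append(-2 if negatives_lower_than_unmatched else -1)
    return matches


iou = [[0.9, 0.30, 0.10],
       [0.2, 0.45, 0.05]]
print(argmax_match(iou, matched_threshold=0.5, unmatched_threshold=0.4))
```

With the default configuration both thresholds are 0.5, so no column falls between them and nothing is ignored.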

YOLOArgMaxMatcher

Configuration proto for YOLOArgMaxMatcher. See
matchers/argmax_matcher.py for details.
Field Type Label Description
matched_threshold float optional
Threshold for positive matches. Default: 1
use_matmul_gather bool optional
Force constructed match objects to use matrix multiplication based gather
instead of standard tf.gather Default: false

easy_vision/python/protos/aspp_block.proto

ASPPBlock

Field Type Label Description
image_level_features bool optional
 Default: true
batchnorm_trainable bool optional
 Default: true
weight_decay float optional
 Default: 0
feature_depth int32 required
 
atrous_rates int32 repeated
 
aspp_with_separable_conv bool optional
 Default: true
keep_prob float optional
Dropout config: keep probability for the ASPP output features. Default: 1

easy_vision/python/protos/auto_compression.proto

CompressConfig

Messages for configuring the strategy for auto compression
Field Type Label Description
compress_mode string optional
Compression mode: one of [`prune`, `quantize`]. Default: prune
is_finetune bool optional
Whether to fine-tune from a compressed model. Default: false
speedup_target float optional
The target speedup ratio. Default: 1
pretrain_model string required
Path to pretrained model. 
tune_mode string optional
Auto tuning mode.
For `prune`, one of [`RL`, `random`, `uniform`].
For `quantize`, one of [`KL`, `LSQ`]. Default: uniform
num_trials int32 optional
Number of trials for the automatic search of compression strategy. Default: 10
interval_steps int32 optional
Number of re-training steps for each compressed model. Default: 1000
metric_key string required
Metric key to evaluate the model performance. 
metric_mode string optional
Metric mode: which direction is better, one of [`bigger`, `smaller`]. Default: bigger
prune_params CompressConfig.PruneHparams optional
 
quant_params CompressConfig.QuantHparams optional
 

CompressConfig.PruneHparams

Configuration message for hyper-parameter of auto compression under `prune`
Field Type Label Description
include_scopes string repeated
Graph scopes to be included. 
exclude_scopes string repeated
Graph scopes to be excluded. 
nb_iters_recon int32 optional
Number of iterations for layer reconstruction. Default: 1000
lr_pgd_init float optional
Initial learning rate for layer selection. Default: 1e-10
lr_pgd_incr float optional
Learning rate increase ratio for layer selection. Default: 1.4
lr_pgd_decr float optional
Learning rate decrease ratio for layer selection. Default: 0.7
lr_adam float optional
Learning rate for layer reconstruction with Adam. Default: 0.0001
prunable_types string repeated
List of op types that can be pruned.
Subset of [`Conv2D`, `MatMul`]. 
channel_base_mod int32 optional
Base number for channels: the number of remaining channels is kept a multiple of it. Default: 4

CompressConfig.QuantHparams

Configuration message for hyper-parameter of auto compression under `quantize`
Field Type Label Description
include_scopes string repeated
Graph scopes to be included. 
exclude_scopes string repeated
Graph scopes to be excluded. 
bits int32 optional
Quantization bit width, such as 4, 8, 16 or 32. Default: 4
int8_layers string repeated
Layers to be kept in INT8 when using INT4 quantization. 
per_channel CompressConfig.QuantHparams.PerChannel optional
Whether to use per-channel quant for specific op types. 
calibs int32 optional
Number of data batches for calibration. 
calib_path string optional
Path to save calibration scale file. 

CompressConfig.QuantHparams.PerChannel

Configuration message to set whether to use
per-channel quant for specific op types.
Field Type Label Description
Conv2D bool optional
Whether to use per-channel quant for Conv2D. Default: true
MatMul bool optional
Whether to use per-channel quant for MatMul. Default: false
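
The constraints stated in the CompressConfig tables can be checked before launching a job. The sketch below works on a plain dict that mirrors the message fields; `check_compress_config` is a hypothetical helper, the example path and metric key are placeholders, and the real library may validate differently or not at all.

```python
# Allowed tune_mode values per compress_mode, as listed in the table above.
TUNE_MODES = {
    "prune": {"RL", "random", "uniform"},
    "quantize": {"KL", "LSQ"},
}


def check_compress_config(cfg):
    """Return a list of problems found in a dict mirroring CompressConfig."""
    problems = []
    mode = cfg.get("compress_mode", "prune")  # documented default
    if mode not in TUNE_MODES:
        problems.append("unknown compress_mode: %s" % mode)
    else:
        tune = cfg.get("tune_mode", "uniform")  # documented default
        if tune not in TUNE_MODES[mode]:
            problems.append("tune_mode %s is not listed for %s" % (tune, mode))
    for required in ("pretrain_model", "metric_key"):
        if not cfg.get(required):
            problems.append("missing required field: %s" % required)
    if cfg.get("metric_mode", "bigger") not in ("bigger", "smaller"):
        problems.append("metric_mode must be bigger or smaller")
    return problems


cfg = {
    "compress_mode": "quantize",
    "tune_mode": "KL",
    "pretrain_model": "path/to/model",  # placeholder path
    "metric_key": "accuracy",           # placeholder metric name
}
print(check_compress_config(cfg))
```

Note that the documented default tune_mode (`uniform`) is not among the values listed for `quantize`, so under this reading a quantization config would set tune_mode explicitly.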

easy_vision/python/protos/backbone.proto

Backbone

Field Type Label Description
class_name string required
backbone class name, such as resnet_v1_50 
weight_decay float required
weight decay factor Default: 0.0005
batchnorm_trainable bool optional
If set to False, batchnorm parameters and moving mean/std will not be updated. Default: true
output_stride int32 optional
Currently only ResNet backbones support this parameter.
If output_stride is set greater than 0, then once the product of the layer strides
equals output_stride, the stride of the subsequent conv layers is set to 1
and dilated convolution is used instead. Default: -1
global_pool bool optional
boolean flag to control the avgpooling before the
logits layer. If false or unset, pooling is done with a fixed window
that reduces default-sized inputs to 1x1, while larger inputs lead to
larger outputs. If true, any input size is pooled down to 1x1. Default: false
depth_multiplier float optional
depth_multiplier is used only for MobileNet, to adjust the network for different
computation costs; please refer to https://arxiv.org/abs/1704.04861. Default: 1
use_true_shape bool optional
When images are padded, use_true_shape needs to be set to true;
the network then uses the true shape for global pooling. Default: true
use_fc bool optional
Whether to use the fc layer. When the number of parameters in the fully connected layers is too large,
use this parameter to drop the fc layer in fine-tuning cases. Default: true
norm_type NormType optional
Normalization layer type. Default: BATCH
connect_survival_prob float optional
Block connect survival probability when training EfficientNet; the default is 0.8. 
dropout_keep_prob float optional
keep prob for dropout Default: 1
param UserDefinedParam repeated
user-defined args 
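
The output_stride rule described in the table can be sketched as a small planning function: strides are applied normally until their product reaches output_stride, after which layers keep stride 1 and use a growing dilation rate instead. This mirrors the approach of the TF-Slim ResNet utilities; that easy_vision does the same is an assumption, and `apply_output_stride` is a hypothetical helper.

```python
def apply_output_stride(layer_strides, output_stride=-1):
    """Return a (stride, dilation) pair for each layer.

    layer_strides lists the nominal stride of each layer in order.
    output_stride <= 0 disables the mechanism (documented default: -1).
    """
    plan = []
    current_stride = 1  # product of the strides applied so far
    dilation = 1
    for stride in layer_strides:
        if output_stride > 0 and current_stride == output_stride:
            # Target reached: keep resolution and dilate instead of striding.
            plan.append((1, dilation))
            dilation *= stride
        else:
            plan.append((stride, 1))
            current_stride *= stride
    return plan


# Nominal total stride is 32; request an output stride of 8.
print(apply_output_stride([4, 2, 2, 2], output_stride=8))
```

In the example the last two layers would no longer downsample, so the final feature map stays at 1/8 of the input resolution.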

Block

Field Type Label Description
resnet_block ResnetBlock optional
 
fc_block FCBlock optional
 
block_func string optional
 
batchnorm_trainable bool optional
 Default: false

FCBlock

Field Type Label Description
fc_hyperparams Hyperparams required
 
depth int32 required
 Default: 1024
num_layers int32 required
 Default: 2

ResnetBlock

Field Type Label Description
class_name string required
model class name 
block_name string required
model block name, e.g. block4 
depth int32 optional
deprecated 
depth_bottleneck int32 optional
deprecated 
stride int32 optional
stride of the block Default: 1
unit_num int32 optional
deprecated 
weight_decay float optional
weight decay factor Default: 0

NormType

Name Number Description
NONE 1
BATCH 2
GROUP 3

easy_vision/python/protos/bipartite_matcher.proto

BipartiteMatcher

Configuration proto for bipartite matcher. See
matchers/bipartite_matcher.py for details.
Field Type Label Description
use_matmul_gather bool optional
Force constructed match objects to use matrix multiplication based gather
instead of standard tf.gather Default: false

easy_vision/python/protos/box_coder.proto

BoxCoder

Configuration proto for the box coder to be used in the object detection
pipeline. See core/box_coder.py for details.
Field Type Label Description
faster_rcnn_box_coder FasterRcnnBoxCoder optional
 
mean_stddev_box_coder MeanStddevBoxCoder optional
 
square_box_coder SquareBoxCoder optional
 
keypoint_box_coder KeypointBoxCoder optional
 
yolo_box_coder YOLOBoxCoder optional
 

FasterRcnnBoxCoder

Configuration proto for FasterRCNNBoxCoder. See
box_coders/faster_rcnn_box_coder.py for details.
Field Type Label Description
y_scale float optional
Scale factor for anchor encoded box center. Default: 10
x_scale float optional
 Default: 10
height_scale float optional
Scale factor for anchor encoded box height. Default: 5
width_scale float optional
Scale factor for anchor encoded box width. Default: 5
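
To make the role of the four scale factors concrete, the following sketch encodes a box relative to an anchor and decodes it again, using the standard Faster R-CNN parameterization (center offsets divided by the anchor size, log of the size ratios). The function names are hypothetical and the box layout (ymin, xmin, ymax, xmax) is an assumption.

```python
import math


def _center_size(box):
    """Convert (ymin, xmin, ymax, xmax) to (cy, cx, h, w)."""
    ymin, xmin, ymax, xmax = box
    h, w = ymax - ymin, xmax - xmin
    return ymin + h / 2, xmin + w / 2, h, w


def encode_box(box, anchor, y_scale=10.0, x_scale=10.0,
               height_scale=5.0, width_scale=5.0):
    """Encode box relative to anchor as (ty, tx, th, tw)."""
    cy, cx, h, w = _center_size(box)
    cya, cxa, ha, wa = _center_size(anchor)
    return (y_scale * (cy - cya) / ha,
            x_scale * (cx - cxa) / wa,
            height_scale * math.log(h / ha),
            width_scale * math.log(w / wa))


def decode_box(code, anchor, y_scale=10.0, x_scale=10.0,
               height_scale=5.0, width_scale=5.0):
    """Invert encode_box and return (ymin, xmin, ymax, xmax)."""
    ty, tx, th, tw = code
    cya, cxa, ha, wa = _center_size(anchor)
    h = math.exp(th / height_scale) * ha
    w = math.exp(tw / width_scale) * wa
    cy = ty / y_scale * ha + cya
    cx = tx / x_scale * wa + cxa
    return (cy - h / 2, cx - w / 2, cy + h / 2, cx + w / 2)


anchor = (0.0, 0.0, 10.0, 10.0)
code = encode_box((5.0, 5.0, 15.0, 15.0), anchor)
print(code)
print(decode_box(code, anchor))
```

Larger scale factors amplify the regression targets, which changes the relative weight of center and size errors in the localization loss.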

KeypointBoxCoder

Configuration proto for KeypointBoxCoder. See
box_coders/keypoint_box_coder.py for details.
Field Type Label Description
num_keypoints int32 optional
 
y_scale float optional
Scale factor for anchor encoded box center and keypoints. Default: 10
x_scale float optional
 Default: 10
height_scale float optional
Scale factor for anchor encoded box height. Default: 5
width_scale float optional
Scale factor for anchor encoded box width. Default: 5

MeanStddevBoxCoder

Configuration proto for MeanStddevBoxCoder. See
box_coders/mean_stddev_box_coder.py for details.
Field Type Label Description
stddev float optional
The standard deviation used to encode and decode boxes. Default: 0.01

SquareBoxCoder

Configuration proto for SquareBoxCoder. See
box_coders/square_box_coder.py for details.
Field Type Label Description
y_scale float optional
Scale factor for anchor encoded box center. Default: 10
x_scale float optional
 Default: 10
length_scale float optional
Scale factor for anchor encoded box length. Default: 5

YOLOBoxCoder

Configuration proto for YOLOBoxCoder. See
box_coders/yolo_box_coder.py for details.
Field Type Label Description
y_scale float optional
Scale factor for anchor encoded box center. Default: 1
x_scale float optional
 Default: 1
height_scale float optional
Scale factor for anchor encoded box height. Default: 1
width_scale float optional
Scale factor for anchor encoded box width. Default: 1

easy_vision/python/protos/box_predictor.proto

BoxPredictor

Configuration proto for box predictor. See core/box_predictor.py for details.
Field Type Label Description
convolutional_box_predictor ConvolutionalBoxPredictor optional
 
mask_rcnn_box_predictor MaskRCNNBoxPredictor optional
 
rfcn_box_predictor RfcnBoxPredictor optional
 
weight_shared_convolutional_box_predictor WeightSharedConvolutionalBoxPredictor optional
 
convolutional_3d_box_predictor Convolutional3DBoxPredictor optional
 
mask_rcnn_3d_box_predictor MaskRCNN3DBoxPredictor optional
 
yolo_box_predictor YOLOBoxPredictor optional
 

Convolutional3DBoxPredictor

Configuration proto for Convolutional box predictor.
Field Type Label Description
conv_hyperparams Hyperparams optional
Hyperparameters for convolution ops used in the box predictor. 
min_depth int32 optional
Minimum feature map depth prior to predicting box encodings and class
predictions, used collaborately with Default: 0
max_depth int32 optional
Maximum feature depth map prior to predicting box encodings and class
predictions. If max_depth is set to 0, no additional feature map will be
inserted before location and class predictions. Default: 0
num_layers_before_predictor int32 optional
Number of the additional conv layers before the predictor. Default: 0
dropout_keep_probability float optional
Keep probability for dropout Default: 1
kernel_size int32 optional
Size of final convolution kernel. If the spatial resolution of the feature
map is smaller than the kernel size, then the kernel size is set to
min(feature_width, feature_height). Default: 1
box_code_size int32 optional
Size of the encoding for boxes. Default: 2
class_prediction_bias_init float optional
 Default: 0
use_depthwise bool optional
Whether to use depthwise separable convolution for box predictor layers. Default: false

ConvolutionalBoxPredictor

Configuration proto for Convolutional box predictor.
Field Type Label Description
conv_hyperparams Hyperparams optional
Hyperparameters for convolution ops used in the box predictor. 
min_depth int32 optional
Minimum feature map depth prior to predicting box encodings and class
predictions, used collaborately with Default: 0
max_depth int32 optional
Maximum feature depth map prior to predicting box encodings and class
predictions. If max_depth is set to 0, no additional feature map will be
inserted before location and class predictions. Default: 0
num_layers_before_predictor int32 optional
Number of the additional conv layers before the predictor. Default: 0
dropout_keep_probability float optional
Keep probability for dropout Default: 1
kernel_size int32 optional
Size of final convolution kernel. If the spatial resolution of the feature
map is smaller than the kernel size, then the kernel size is set to
min(feature_width, feature_height). Default: 1
box_code_size int32 optional
Size of the encoding for boxes. Default: 4
class_prediction_bias_init float optional
 Default: 0
use_depthwise bool optional
Whether to use depthwise separable convolution for box predictor layers. Default: false

MaskRCNN3DBoxPredictor

Field Type Label Description
fc_hyperparams Hyperparams optional
Hyperparameters for fully connected ops used in the box predictor. 
num_layers_before_predictor int32 optional
Number of the additional fc layers before the predictor. Default: 0
depth int32 optional
Output depth for the fc ops prior to predicting box encodings
and class predictions. Default: 0
dropout_keep_probability float optional
Keep probability for dropout. This is only used if use_dropout is true. Default: 1
box_code_size int32 optional
Size of the encoding for the boxes. Default: 2
agnostic bool optional
Whether to use one box for all classes rather than a different box for each
class. Default: true

MaskRCNNBoxPredictor

Field Type Label Description
fc_hyperparams Hyperparams optional
Hyperparameters for fully connected ops used in the box predictor. 
num_layers_before_predictor int32 optional
Number of the additional fc layers before the predictor. Default: 0
depth int32 optional
Output depth for the fc ops prior to predicting box encodings
and class predictions. Default: 0
dropout_keep_probability float optional
Keep probability for dropout. This is only used if use_dropout is true. Default: 1
box_code_size int32 optional
Size of the encoding for the boxes. Default: 4
agnostic bool optional
Whether to use one box for all classes rather than a different box for each
class. Default: true

RfcnBoxPredictor

Field Type Label Description
conv_hyperparams Hyperparams optional
Hyperparameters for convolution ops used in the box predictor. 
num_spatial_bins_height int32 optional
Bin sizes for RFCN crops. Default: 3
num_spatial_bins_width int32 optional
 Default: 3
depth int32 optional
Target depth to reduce the input image features to. Default: 1024
box_code_size int32 optional
Size of the encoding for the boxes. Default: 4
crop_height int32 optional
Size to resize the rfcn crops to. Default: 12
crop_width int32 optional
 Default: 12
agnostic bool optional
 Default: true

WeightSharedConvolutionalBoxPredictor

Configuration proto for weight shared convolutional box predictor.
Field Type Label Description
conv_hyperparams Hyperparams optional
Hyperparameters for convolution ops used in the box predictor. 
num_layers_before_predictor int32 optional
Number of the additional conv layers before the predictor. Default: 0
depth int32 optional
Output depth for the convolution ops prior to predicting box encodings
and class predictions. Default: 0
kernel_size int32 optional
Size of final convolution kernel. If the spatial resolution of the feature
map is smaller than the kernel size, then the kernel size is set to
min(feature_width, feature_height). Default: 3
box_code_size int32 optional
Size of the encoding for boxes. Default: 4
class_prediction_bias_init float optional
Bias initialization for class prediction. It has been shown to stabilize
training where there is a large number of negative boxes. See
https://arxiv.org/abs/1708.02002 for details. Default: 0
dropout_keep_probability float optional
Keep probability for dropout Default: 1
share_prediction_tower bool optional
Whether to share the multi-layer tower between box prediction and class
prediction heads. Default: true
use_depthwise bool optional
Whether to use depthwise separable convolution for box predictor layers. Default: false
box_encodings_clip_range WeightSharedConvolutionalBoxPredictor.BoxEncodingsClipRange optional
 
agnostic bool optional
 Default: true
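
For class_prediction_bias_init, the referenced paper (https://arxiv.org/abs/1708.02002) initializes the bias so that every anchor starts with a small foreground probability pi, using b = -log((1 - pi) / pi). The sketch below computes that value; `focal_prior_bias` is a hypothetical helper, and pi = 0.01 is the value used in the paper, not a default of this proto.

```python
import math


def focal_prior_bias(prior_prob=0.01):
    """Bias b such that sigmoid(b) equals prior_prob."""
    return -math.log((1.0 - prior_prob) / prior_prob)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


bias = focal_prior_bias(0.01)
print(round(bias, 3))           # candidate value for class_prediction_bias_init
print(round(sigmoid(bias), 4))  # initial foreground probability
```

With the documented default of 0, every class score would start at a probability of 0.5 instead.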

WeightSharedConvolutionalBoxPredictor.BoxEncodingsClipRange

If specified, apply clipping to box encodings.
Field Type Label Description
min float optional
 
max float optional
 

YOLOBoxPredictor

Configuration proto for YOLO box predictor.
Field Type Label Description
conv_hyperparams Hyperparams optional
Hyperparameters for convolution ops used in the box predictor. 
num_layers_before_predictor int32 optional
Number of the additional conv layers before the predictor. Default: 1

easy_vision/python/protos/classification.proto

ClassificationModel

Field Type Label Description
input_width int32 optional
Input width and height; if not set, the default input size is used instead. 
input_height int32 optional
 
backbone Backbone required
Backbone configuration 
num_classes int32 required
Number of classes 
loss ClassificationLoss required
Loss configuration for training 
add_summary bool optional
Whether to add summaries of training-related info. Default: true
label_id_offset int32 optional
Label id offset, subtracted from the groundtruth class
when calculating loss and during evaluation. Default: 0
hidden_size int32 optional
Hidden size of the last fc layer; if assigned, the original fc layer is replaced. Default: -1

LargeScaleClassificationModel

Field Type Label Description
backbone Backbone required
Backbone configuration 
input_layer string required
input layer name 
num_classes int32 required
Number of classes 
loss ClassificationLoss required
Loss configuration for training 
label_id_offset int32 optional
Label id offset, subtracted from the groundtruth class
when calculating loss and during evaluation. Default: 0
global_pool bool optional
global pooling after backbone Default: true
hidden_sizes int32 repeated
dense layer hidden size appended before logits 

easy_vision/python/protos/cv_model.proto

CVModel

Field Type Label Description
model_class string required
 
simple_rpn SimpleRPN optional
 
faster_rcnn FasterRcnn optional
 
ssd Ssd optional
 
classification ClassificationModel optional
 
deeplab DeepLab optional
 
text_recognition TextRecognition optional
 
text_end2end TextEnd2End optional
 
text_krcnn TextKRCNN optional
 
video_classification VideoClassificationModel optional
 
rc3d RC3D optional
 
multi_label_classification MultiLabelClassification optional
 
large_scale_classification LargeScaleClassificationModel optional
 
text_rectification TextRectification optional
 
yolo YOLO optional
 
custom_model CustomModel optional
 
num_trainable_levels int32 optional
Number of model variable levels to be trainable.
Set to 0 or -1 to make all variables trainable. Default: -1

CustomModel

Field Type Label Description
backbone Backbone optional
 
param UserDefinedParam repeated
 

easy_vision/python/protos/data_config.proto

DataConfig

Field Type Label Description
separator string optional
Separator between class name and description, for example:
in family_name#QING, # is the separator. 
default_class string optional
When a label has no match in class_map
and default_class is set, the label is mapped to
the default class. 
error_class string repeated
All images of error_class, or images that contain objects of error_class,
will be discarded. 
ignore_class string repeated
All objects of ignore_class will be ignored.
ignore_class is only used in detection tfrecord conversion. 
class_map DataConfig.ClassMap repeated
Specifies a map from label to class_name.
If class_name is not specified, class_name == label_name. 
max_image_size uint32 optional
Max height/width of all the images used in training.
If not specified, its default value is 0 and
images will not be resized to fit within max_image_size;
this may result in slow training speed or even
an OOM (out of memory) error. Default: 0
image_format string required
 Default: jpg
model_type DataConfig.ModelType required
 Default: CLASSIFICATION
converter_class string required
Converter class. Some converters are already implemented, such
as QinceConverter; users can also pass a self-defined converter
using a format like module.class_name. Default: QinceConverter
proc_num uint32 required
Number of generating processes. Default: 10
oss_config string optional
path to .osscredentials file 
param_key string repeated
custom key value parameters 
param_val string repeated
 
char_replace_map_path string optional
A CSV file containing two columns ["original", "replaced"] for replacing
special chars in text, such as traditional Chinese characters with
simplified Chinese characters, Chinese punctuation with English punctuation, etc. 
default_char_dict_path string optional
A text file containing the list of characters used in
model training, one character per line.
If default_char_dict_path is empty, output_char_dict
will be inferred from the input data. 
prefetch_thread_num uint32 optional
Number of parallel prefetch threads. Default: 10
write_thread_num uint32 optional
Number of parallel write threads. Default: 1
part_record_num int32 optional
The number of samples in each tfrecord part:
If Mod(total_num, part_record_num) < part_record_num / 2:
  the remaining samples are appended to the end of the existing tfrecords,
  even though the size then exceeds part_record_num;
If Mod(total_num, part_record_num) > part_record_num / 2:
  the remaining samples are placed into a new tfrecord. Default: -1
test_ratio float optional
train/test dataset split ratio of test dataset Default: 0
max_test_image_size int32 optional
max image size of test dataset images, if not set, it will use max_image_size 
decode_type int32 optional
Video decode parameters.
Decode type. Default: 4
sample_fps int32 optional
Sample rate; the default of -1 means full sampling. Default: -1
reshape_width int32 optional
Output size of decoded frames; -1 means no resize.
If a scalar is provided, height=width;
otherwise the output frame size is (height, width). Default: 112
reshape_height int32 optional
 Default: 112
decode_batch_size int32 optional
batch size of each decode phase. Default: 10
decode_keep_size int32 optional
left size of last decode phase. Default: 0
optical_flow string optional
Optical flow calculation algorithm:
'' means optical flow is not calculated,
opencv means optical flow is calculated using opencv,
tvnet means optical flow is calculated using tvnet. 
min_bbox_size int32 optional
Minimum bounding box size; bounding boxes smaller than this value will be filtered out. Default: 5
user_defined_converter_path string optional
file path for self defined converter 
user_defined_generator_path string optional
file path for self-defined generator 
generator_class string optional
class name for generator 
exif_rotate bool optional
If false, do not rotate the image according to EXIF's orientation flag. Default: false
ignore_recog_class string repeated
all objects of ignore recog class will not be recognized in TextEnd2End model 
input_queue_size int32 optional
Size of the input queue; one thread reads the .csv
file and feeds input data into the input queue. Default: 1048576
prefetch_queue_size int32 optional
Size of the prefetch queue; multiple threads prefetch
file data and feed it into the input queue. Default: 1024
output_queue_size int32 optional
Size of the output queue; subprocesses write serialized
tf examples into the output queue, and the main process
acquires the data from the output queue. Default: 1024
roi_padding_min_ratio_of_short_edge float optional
padding cropped roi image with multiple of roi short edge Default: 0.75
roi_padding_max_ratio_of_short_edge float optional
padding cropped roi image with multiple of roi short edge Default: 0.75
task_id string optional
label task id for PaiConverter 
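
The part_record_num rule in the table above can be sketched as follows. Several points are assumptions made for illustration: the remainder is appended to the last part only, a remainder of exactly half a part goes to a new file (the description leaves this case open), and a non-positive part_record_num (such as the default -1) produces a single part. `split_into_parts` is a hypothetical helper.

```python
def split_into_parts(total_num, part_record_num):
    """Return the number of samples placed in each tfrecord part."""
    if part_record_num <= 0:
        # Assumption: no splitting when the field is left at -1.
        return [total_num]
    full_parts, rest = divmod(total_num, part_record_num)
    sizes = [part_record_num] * full_parts
    if rest == 0:
        return sizes
    if sizes and rest < part_record_num / 2:
        # Small remainder: append to the last part, exceeding part_record_num.
        sizes[-1] += rest
    else:
        # Large remainder (or no full part yet): start a new part.
        sizes.append(rest)
    return sizes


print(split_into_parts(103, 10))  # remainder 3 is merged into the last part
print(split_into_parts(108, 10))  # remainder 8 gets its own part
```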

DataConfig.ClassMap

using this structure, we could map multiple marked
labels into one, for example: 'name'=>'text', 'address'=>'text'
thus enabling flexible collapse of labels
Field Type Label Description
label_name string required
marked class 
class_name string optional
the class used in tf record 
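
A minimal sketch of the label collapsing described above, using plain dicts in place of ClassMap messages. `build_label_lookup` and `resolve_class` are hypothetical helpers; returning None when there is no match and no default_class is an assumption, since the behaviour for that case is not documented here.

```python
def build_label_lookup(class_map):
    """class_map: list of dicts with 'label_name' and optional 'class_name'."""
    # If class_name is not specified, class_name == label_name.
    return {entry["label_name"]: entry.get("class_name") or entry["label_name"]
            for entry in class_map}


def resolve_class(label, lookup, default_class=None):
    """Map a marked label to the class written to the tf record."""
    if label in lookup:
        return lookup[label]
    # No match: fall back to default_class when it is set.
    return default_class


lookup = build_label_lookup([
    {"label_name": "name", "class_name": "text"},
    {"label_name": "address", "class_name": "text"},
    {"label_name": "logo"},
])
print(resolve_class("name", lookup))
print(resolve_class("logo", lookup))
print(resolve_class("stamp", lookup, default_class="other"))
```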

DataConfig.ModelType

task type classification, detection, segmentation
or even instance segmentation
Name Number Description
CLASSIFICATION 0
DETECTION 1
SEGMENTATION 2
INSTANCE_SEGMENTATION 3
TEXT_END2END 4
TEXT_RECOGNITION 5
TEXT_DETECTION 6
VIDEO_CLASSIFICATION 7
TEXT_RECTIFICATION 8
POLYGON_SEGMENTATION 9
SELF_DEFINED 100

easy_vision/python/protos/dataset.proto

ActionDetectionDataDecoder

Field Type Label Description
label_map_path string optional
label map path: specifying the mapping from class_name to class_ids 

ClassificationDataDecoder

Field Type Label Description
label_map_path string optional
label map path: specifying the mapping from class_name to class_ids 
is_multi_label bool optional
 Default: false

CustomDataDecoder

Field Type Label Description
input_class string required
 
param UserDefinedParam repeated
 

DatasetConfig

Field Type Label Description
input_path string repeated
Dataset input path; supports filename patterns
using tf.match_files(input_path). 
batch_size uint32 optional
Effective batch size to use for training. Default: 32
data_augmentation_options PreprocessingStep repeated
Data augmentation options. 
shuffle bool optional
whether to shuffle data Default: true
shuffle_buffer_size uint32 optional
Buffer size to be used when shuffling. Default: 2048
filenames_shuffle_buffer_size uint32 optional
Buffer size to be used when shuffling file names. Default: 100
num_epochs uint32 optional
The number of times a data source is read. If set to zero, the data source
will be reused indefinitely. Default: 0
num_readers uint32 optional
Number of reader instances to create. Default: 1
read_block_length uint32 optional
Number of records to read from each reader at once. Default: 32
prefetch_size uint32 optional
Number of decoded records to prefetch before batching. Default: 512
num_parallel_map_calls uint32 optional
Number of parallel decode ops to apply. Default: 64
use_diff bool optional
whether to use difficult samples Default: true
shard bool optional
shard dataset to 1/num_workers in distribute mode Default: false
drop_remainder bool optional
whether the last batch should be dropped in the case it has
fewer than batch_size elements Default: true
bucket_sizes uint32 repeated
bucketing size for height and width of images, default is empty, no bucketing

Settings specific to each dataset type, such as VOC;
more dataset types will be added in the future 
input_class string optional
Input class name, if you want to use one input class directly 
voc_decoder_config VocDataDecoder optional
 
classification_decoder_config ClassificationDataDecoder optional
 
seg_decoder_config SegmentationDataDecoder optional
 
text_recognition_decoder_config TextRecognitionDataDecoder optional
 
text_end2end_decoder_config TextEnd2EndDataDecoder optional
 
text_detection_decoder_config TextDetectionDataDecoder optional
 
text_rectification_decoder_config TextRectificationDataDecoder optional
 
video_classification_decoder_config VideoClassificationDataDecoder optional
 
action_detection_decoder_config ActionDetectionDataDecoder optional
 
custom_decoder_config CustomDataDecoder optional
 

SegmentationDataDecoder



        

        
      
        

TextDetectionDataDecoder



        
          
FieldTypeLabelDescription
num_keypoints int32 optional
Number of keypoints Default: 4
label_map_path string required
label map path: specifying the mapping from class_name to class_ids 

TextEnd2EndDataDecoder



        
          
FieldTypeLabelDescription
char_dict_path string required
dict_path: specifying the char dict 
upper_case bool optional
transform label to upper case Default: false
num_keypoints int32 optional
Number of keypoints Default: 4
label_map_path string required
label map path: specifying the mapping from class_name to class_ids 

TextRecognitionDataDecoder



        
          
FieldTypeLabelDescription
char_dict_path string required
dict_path: specifying the char dict 
max_input_ratio float optional
specify the maximal width/height of all the training images 
min_input_ratio float optional
specify the minimal width/height of all the training images 
num_buckets int32 optional
put data into similar-length buckets 
upper_case bool optional
transform label to upper case Default: false
filter_long_image bool optional
filter image with aspect ratio > max_input_ratio Default: true
max_text_length float optional
specify the maximal text length of all the training images 

TextRectificationDataDecoder



        

        
      
        

VideoClassificationDataDecoder



        
          
FieldTypeLabelDescription
label_map_path string optional
label map path: specifying the mapping from class_name to class_ids 
input_modal string optional
load optical flow or rgb frame
'rgb', 'flow', 'rgb+flow' Default: rgb
is_multi_label bool optional
load multilabel data or not Default: false

VocDataDecoder



        
          
FieldTypeLabelDescription
label_map_path string required
label map path: specifying the mapping from class_name to class_ids 
load_instance_masks bool optional
Whether to load groundtruth instance masks. Default: false
mask_format MaskFormat optional
Type of instance mask. Default: NUMERICAL_MASK_FORMAT
num_keypoints uint32 optional
Number of groundtruth keypoints per object. Default: 0

MaskFormat


        
NameNumberDescription
NUMERICAL_MASK_FORMAT 1
[num_masks, H, W] float32 binary masks.
PNG_MASK_FORMAT 2
Encoded PNG masks.

easy_vision/python/protos/decoder.proto


FullyConnectedCTCDecoder



        

        
      
        

RNNDecoderWithAttention



        
          
FieldTypeLabelDescription
embedding_size int32 optional
embedding size Default: 256
num_layers int32 optional
decoder depth Default: 2
basic_lstm BasicLSTM optional
 
gru GRU optional
 
layer_norm_basic_lstm LayerNormBasicLSTM optional
 
nas NAS optional
 
residual bool optional
whether to add residual connections Default: true
beam_width int32 optional
beam width when using beam search decoder. If 0 (default), use standard decoder with greedy helper Default: 0
length_penalty_weight float optional
length penalty for beam search Default: 0
train_sampling_probability float optional
the probability of sampling from the outputs instead of reading directly from the inputs when training Default: 0
attention_mechanism string optional
attention mechanisms luong | scaled_luong | bahdanau | normed_bahdanau Default: normed_bahdanau
num_attention_heads int32 optional
number of attention heads Default: 1
output_attention bool optional
whether to use attention as the cell output at each timestep Default: true
visualize_type string optional
Type of attention visualization; leave unset to disable visualization. Choices: line | spatial 
pass_hidden_state bool optional
whether to pass encoder's rnn state to decoder Default: true
attention_type string optional
attention type line | spatial Default: line

TransformerDecoder



        
          
FieldTypeLabelDescription
num_layers int32 required
number of decoder layers 
hidden_size int32 required
hidden units size 
num_heads int32 required
number of attention heads 
filter_size int32 required
hidden size of FeedForwardLayer 
layer_postprocess_dropout float optional
postprocess layer dropout Default: 0.1
attention_dropout float optional
attention layer dropout Default: 0.1
relu_dropout float optional
relu layer dropout Default: 0.1
beam_width int32 optional
beam search width Default: 1
length_penalty_weight float optional
length penalty for beam search Default: 0

easy_vision/python/protos/deeplab.proto


DeepLab



        
          
FieldTypeLabelDescription
backbone Backbone required
 
aspp_input_layer string required
 
aspp_block ASPPBlock required
 
seg_decoder_head SegDecoderHead required
 

easy_vision/python/protos/eval.proto


EvalConfig

Message for configuring DetectionModel evaluation jobs (eval.py).
FieldTypeLabelDescription
num_visualizations uint32 optional
Number of visualization images to generate. Default: 10
num_examples uint32 optional
Number of examples to process for evaluation. Default: 0
eval_interval_secs uint32 optional
How often to run evaluation. Default: 300
max_evals uint32 optional
Maximum number of times to run evaluation. If set to 0, will run forever. Default: 0
save_graph bool optional
Whether the TensorFlow graph used for evaluation should be saved to disk. Default: false
visualization_export_dir string optional
Path to directory to store visualizations in. If empty, visualization
images are not exported (only shown on Tensorboard). 
eval_master string optional
BNS name of the TensorFlow master. 
metrics_set string repeated
Type of metrics to use for evaluation.
possible values: 
  pascal_voc_detection_metrics
  pascal_voc07_detection_metrics
  coco_detection_metrics 
export_path string optional
Path to export detections to COCO compatible JSON format. 
ignore_groundtruth bool optional
Option to not read groundtruth labels and only export detections to
COCO-compatible JSON file. Default: false
use_moving_averages bool optional
Use exponential moving averages of variables for evaluation.
TODO(rathodv): When this is false make sure the model is constructed
without moving averages in restore_fn. Default: false
eval_instance_masks bool optional
Whether to evaluate instance masks.
Note that since there is no evaluation code currently for instance
segmentation this option is unused. Default: false
min_score_threshold float optional
Minimum score threshold for a detected object box to be visualized Default: 0.5
max_num_boxes_to_visualize int32 optional
Maximum number of detections to visualize Default: 20
skip_scores bool optional
When drawing a single detection, each label is by default visualized as
<label name> : <label score>. One can skip the name or/and score using the
following fields: Default: false
skip_labels bool optional
 Default: false
visualize_groundtruth_boxes bool optional
Whether to show groundtruth boxes in addition to detected boxes in
visualizations. Default: false
groundtruth_box_visualization_color string optional
Box color for visualizing groundtruth boxes. Default: black
keep_image_id_for_visualization_export bool optional
Whether to keep image identifier in filename when exported to
visualization_export_dir. Default: false
retain_original_images bool optional
Whether to retain original images (i.e. not pre-processed) in the tensor
dictionary, so that they can be displayed in Tensorboard. Default: true
include_metrics_per_category bool optional
If True, additionally include per-category metrics. Default: false
coco_analyze bool optional
If True, enables the COCO analyze function Default: false
matching_iou_threshold float optional
iou threshold used for evaluation Default: 0.5
include_metrics_per_dataset bool optional
If True, additionally include per-dataset metrics. Default: false
dataset_names string repeated
When include_metrics_per_dataset is true, evaluate the datasets listed in dataset_names 

easy_vision/python/protos/export.proto


ExportConfig

Message for configuring exporting models.
FieldTypeLabelDescription
batch_size int32 optional
batch size used for exported model, -1 indicates batch_size is None
which is only supported by classification model right now, while 
other models support static batch_size Default: -1
exporter_type string optional
type of exporter [final | latest | none] when train_and_evaluation
final: performs a single export at the end of training
latest: regularly exports the serving graph and checkpoints
none: do not perform export Default: final
color_format string optional
type of color format [bgr | rgb] Default: rgb
export_video_preprocess bool optional
whether to export the preprocessing graph Default: false
param UserDefinedParam repeated
custom defined parameters 

easy_vision/python/protos/faster_rcnn.proto


FasterRcnn

Configuration for RegionProposal models, only objectness is predicted
multiclass is not supported
FieldTypeLabelDescription
backbone Backbone required
backbone config 
fpn FPN optional
 
rpn_head RPNHead required
rpn head config 
region_feature_extractor Block optional
Block that reuses part of the backbone to extract box features in the second stage 
rcnn_head RCNNHead required
rcnn head config 
mrcnn_head MRCNNHead optional
mask rcnn head config 

easy_vision/python/protos/fpn.proto


FPN



        
          
FieldTypeLabelDescription
input string repeated
 
fea_dim int32 optional
 Default: 256
extra_conv_layers int32 optional
extra_conv_layers extends the feature maps beyond the backbone,
 so that larger anchors (larger than 256) can be placed on coarser
 features (stride >= 64).
When retina_net is set to true, strided convolution (s=2) is used.
For FPN, extra_conv_layers = 1, which means that the FPN feature maps
 will be P2(C2) P3(C3) P4(C4) P5(C5) P6(P5 pooled).
C2, C3, C4, C5 are backbone feature maps of level 2, 3, 4, 5, such as
 resnet/block1, resnet/block2, resnet/block3, resnet/block4.
The anchors placed on P2 to P6 will be of size 32, 64, 128, 256, 512. Default: 0
retina_net bool optional
 Default: false
resize_method ResizeMethod.Enum optional
 Default: BILINEAR
roi_min_level int32 optional
level refers to feature map indices, level is associated with feature
map strides = 2 ^ level, usually:
  feature_map                level       stride
    conv1                      1            2
    conv2(resnet/block1)       2            4
    conv3(resnet/block2)       3            8
    conv4(resnet/block3)       4           16
    conv5(resnet/block4)       5           32
roi_min_level: refers to the roi level of lowest fpn feature map
  example: resnet-50/block1 => 2 Default: 2
roi_max_level int32 optional
roi_max_level: refers to the roi level of highest fpn feature map
  example: resnet-50/block4 => 5 Default: 5
roi_canonical_level int32 optional
roi_canonical_level and roi_canonical_scale specify the parameters
used to distribute proposals to feature maps:
    k = floor(k0 + log2(sqrt(wh)/224))
here, roi_canonical_level = k0, roi_canonical_scale = 224
see (https://arxiv.org/abs/1612.03144) for details. Default: 4
roi_canonical_scale int32 optional
 Default: 224
conv_hyperparams Hyperparams optional
Hyperparameters for convolution ops used in fpn. 
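
The level/stride relation and the proposal-to-level formula above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names assign_roi_level and level_stride are hypothetical, and clamping the result to [roi_min_level, roi_max_level] is assumed here rather than taken from the easy_vision source. Defaults mirror the table.

```python
import math


def assign_roi_level(w, h, canonical_level=4, canonical_scale=224,
                     min_level=2, max_level=5):
    """Map a box of size (w, h) in pixels to an FPN level (hypothetical helper)."""
    # k = floor(k0 + log2(sqrt(w * h) / canonical_scale))
    k = math.floor(canonical_level
                   + math.log2(math.sqrt(w * h) / canonical_scale))
    # Clamping to the available levels is an assumption of this sketch.
    return int(min(max(k, min_level), max_level))


def level_stride(level):
    """Stride of the feature map at a given level: 2 ** level."""
    return 2 ** level


for size in (32, 112, 224, 448, 1024):
    level = assign_roi_level(size, size)
    print(size, level, level_stride(level))
```

Under these assumptions a 224 x 224 proposal lands on level 4 (stride 16), while smaller boxes go to finer levels and larger boxes to coarser ones.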

easy_vision/python/protos/graph_rewriter.proto


GraphRewriter

Message to configure graph rewriter for the tf graph.
FieldTypeLabelDescription
quantization Quantization optional
 

Quantization

Message for quantization options. See
tensorflow/contrib/quantize/python/quantize.py for details.
FieldTypeLabelDescription
delay int32 optional
Number of steps to delay before quantization takes effect during training. Default: 500000
weight_bits int32 optional
Number of bits to use for quantizing weights.
Only 8 bit is supported for now. Default: 8
activation_bits int32 optional
Number of bits to use for quantizing activations.
Only 8 bit is supported for now. Default: 8

easy_vision/python/protos/hyperparams.proto


BatchNorm

Configuration proto for batch norm to apply after convolution op. See
https://www.tensorflow.org/api_docs/python/tf/contrib/layers/batch_norm
FieldTypeLabelDescription
decay float optional
 Default: 0.999
center bool optional
 Default: true
scale bool optional
 Default: false
epsilon float optional
 Default: 0.001
train bool optional
Whether to train the batch norm variables. If this is set to false during
training, the current value of the batch_norm variables are used for
forward pass but they are never updated. Default: true

Hyperparams

Configuration proto for the convolution op hyperparameters to use in the
object detection pipeline.
FieldTypeLabelDescription
op Hyperparams.Op optional
 Default: CONV
regularizer Regularizer optional
Regularizer for the weights of the convolution op. 
initializer Initializer optional
Initializer for the weights of the convolution op. 
activation Hyperparams.Activation optional
 Default: RELU
batch_norm BatchNorm optional
BatchNorm hyperparameters. If this parameter is NOT set then BatchNorm is
not applied! 
regularize_depthwise bool optional
Whether depthwise convolutions should be regularized. If this parameter is
NOT set then the conv hyperparams will default to the parent scope. Default: false

Initializer

Proto with one-of field for initializers.
FieldTypeLabelDescription
truncated_normal_initializer TruncatedNormalInitializer optional
 
variance_scaling_initializer VarianceScalingInitializer optional
 
random_normal_initializer RandomNormalInitializer optional
 
xavier_initializer XavierInitializer optional
 

L1Regularizer

Configuration proto for L1 Regularizer.
See https://www.tensorflow.org/api_docs/python/tf/contrib/layers/l1_regularizer
FieldTypeLabelDescription
weight float optional
 Default: 1

L2Regularizer

Configuration proto for L2 Regularizer.
See https://www.tensorflow.org/api_docs/python/tf/contrib/layers/l2_regularizer
FieldTypeLabelDescription
weight float optional
 Default: 1

RandomNormalInitializer

Configuration proto for random normal initializer. See
https://www.tensorflow.org/api_docs/python/tf/random_normal_initializer
FieldTypeLabelDescription
mean float optional
 Default: 0
stddev float optional
 Default: 1

Regularizer

Proto with one-of field for regularizers.
FieldTypeLabelDescription
l1_regularizer L1Regularizer optional
 
l2_regularizer L2Regularizer optional
 

TruncatedNormalInitializer

Configuration proto for truncated normal initializer. See
https://www.tensorflow.org/api_docs/python/tf/truncated_normal_initializer
FieldTypeLabelDescription
mean float optional
 Default: 0
stddev float optional
 Default: 1

VarianceScalingInitializer

Configuration proto for variance scaling initializer. See
https://www.tensorflow.org/api_docs/python/tf/contrib/layers/variance_scaling_initializer
FieldTypeLabelDescription
factor float optional
 Default: 2
uniform bool optional
 Default: false
mode VarianceScalingInitializer.Mode optional
 Default: FAN_IN

XavierInitializer



        
          
FieldTypeLabelDescription
uniform bool optional
 Default: true

Hyperparams.Activation

Type of activation to apply after convolution.
NameNumberDescription
NONE 0
Use None (no activation)
RELU 1
Use tf.nn.relu
RELU_6 2
Use tf.nn.relu6
LEAKY_RELU 3
Use leaky relu
MISH 4
Use mish

Hyperparams.Op

Operations affected by hyperparameters.
NameNumberDescription
CONV 1
Convolution, Separable Convolution, Convolution transpose.
FC 2
Fully connected

VarianceScalingInitializer.Mode


        
NameNumberDescription
FAN_IN 0
FAN_OUT 1
FAN_AVG 2

easy_vision/python/protos/keypoint_predictor.proto


KeypointPredictor



        
          
FieldTypeLabelDescription
text_resnet_keypoint_predictor TextResnetKeypointPredictor optional
 

TextResnetKeypointPredictor



        
          
FieldTypeLabelDescription
conv_hyperparams Hyperparams optional
Hyperparameters for convolution ops used in the keypoint predictor. 
fc_hyperparams Hyperparams optional
Hyperparameters for fc ops used in the keypoint predictor. 
num_blocks_before_predictor int32 optional
Number of resnet blocks before the keypoint predictor; downsampling is applied between blocks Default: 1
num_units_per_block int32 optional
Number of resnet units per resnet block Default: 1
base_depth_before_predictor int32 optional
The depth of first resnet block Default: 256
se_rate int32 optional
The rate of squeeze_and_excitation; values less than or equal to zero disable it Default: 0
keypoint_prediction_num_fc_layers int32 optional
The number of fc layers before predictor Default: 2
keypoint_prediction_fc_depth int32 optional
The depth of fc layers Default: 1024

easy_vision/python/protos/losses.proto


BootstrappedSigmoidClassificationLoss

Classification loss using a sigmoid function over the class prediction with
the highest prediction score.
FieldTypeLabelDescription
alpha float optional
Interpolation weight between 0 and 1. 
hard_bootstrap bool optional
Whether hard bootstrapping should be used or not. If true, will only use
one class favored by model. Otherwise, will use all predicted class
probabilities. Default: false
anchorwise_output bool optional
DEPRECATED, do not use.
Output loss per anchor. Default: false

ClassificationLoss

Configuration for class prediction loss function.
FieldTypeLabelDescription
weighted_sigmoid WeightedSigmoidClassificationLoss optional
 
weighted_softmax WeightedSoftmaxClassificationLoss optional
 
weighted_logits_softmax WeightedSoftmaxClassificationAgainstLogitsLoss optional
 
bootstrapped_sigmoid BootstrappedSigmoidClassificationLoss optional
 
weighted_sigmoid_focal SigmoidFocalClassificationLoss optional
 

HardExampleMiner

Configuration for hard example miner.
FieldTypeLabelDescription
num_hard_examples int32 optional
Maximum number of hard examples to be selected per image (prior to
enforcing max negative to positive ratio constraint).  If set to 0,
all examples obtained after NMS are considered. Default: 64
iou_threshold float optional
Minimum intersection over union for an example to be discarded during NMS. Default: 0.7
loss_type HardExampleMiner.LossType optional
 Default: BOTH
max_negatives_per_positive int32 optional
Maximum number of negatives to retain for each positive anchor. If
max_negatives_per_positive is 0, no prespecified negative:positive ratio is
enforced. Default: 0
min_negatives_per_image int32 optional
Minimum number of negative anchors to sample for a given image. Setting
this to a positive number samples negatives in an image without any
positive anchors and thus does not bias the model towards having at least one
detection per image. Default: 0

LocalizationLoss

Configuration for bounding box localization loss function.
FieldTypeLabelDescription
weighted_l2 WeightedL2LocalizationLoss optional
 
weighted_smooth_l1 WeightedSmoothL1LocalizationLoss optional
 
weighted_iou WeightedIOULocalizationLoss optional
 

Loss

Message for configuring the localization loss, classification loss and hard
example miner used for training object detection models. See core/losses.py
for details
FieldTypeLabelDescription
localization_loss LocalizationLoss optional
Localization loss to use. 
classification_loss ClassificationLoss optional
Classification loss to use. 
hard_example_miner HardExampleMiner optional
If not left to default, applies hard example mining. 
classification_weight float optional
Classification loss weight. Default: 1
localization_weight float optional
Localization loss weight. Default: 1
random_example_sampler RandomExampleSampler optional
If not left to default, applies random example sampling. 

RandomExampleSampler

Configuration for random example sampler.
FieldTypeLabelDescription
positive_sample_fraction float optional
The desired fraction of positive samples in batch when applying random
example sampling. Default: 0.01

SigmoidFocalClassificationLoss

Sigmoid Focal cross entropy loss as described in
https://arxiv.org/abs/1708.02002
FieldTypeLabelDescription
anchorwise_output bool optional
DEPRECATED, do not use. Default: false
gamma float optional
modulating factor for the loss. Default: 2
alpha float optional
alpha weighting factor for the loss. 
label_smoothing float optional
use label smoothing in loss
please refer to label_smoothing explanation in tf.losses.sigmoid_cross_entropy Default: 0
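
As a minimal sketch of how gamma and alpha interact, the function below follows the focal loss definition from the paper linked above, FL = -alpha_t * (1 - p_t)^gamma * log(p_t), for a single logit and binary target. The function name is hypothetical, label smoothing is omitted, and this is not the easy_vision implementation.

```python
import math


def sigmoid_focal_loss(logit, target, gamma=2.0, alpha=None):
    """Focal loss for one logit and a binary target (0.0 or 1.0)."""
    p = 1.0 / (1.0 + math.exp(-logit))
    # p_t is the probability the model assigns to the true label.
    p_t = p if target == 1.0 else 1.0 - p
    cross_entropy = -math.log(p_t)
    # The modulating factor shrinks the loss of well-classified examples.
    modulating = (1.0 - p_t) ** gamma
    # alpha_t weighting is applied only when alpha is set.
    alpha_t = 1.0
    if alpha is not None:
        alpha_t = alpha if target == 1.0 else 1.0 - alpha
    return alpha_t * modulating * cross_entropy


easy = sigmoid_focal_loss(4.0, 1.0)    # confident and correct
hard = sigmoid_focal_loss(-4.0, 1.0)   # confident and wrong
print(easy < hard)
```

With gamma = 0 and alpha unset the expression reduces to plain sigmoid cross entropy, which is a quick way to sanity-check a configuration.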

WeightedIOULocalizationLoss

Intersection over union location loss: 1 - IOU
FieldTypeLabelDescription
mode string optional
iou type [iou/giou/diou/ciou] Default: iou
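
For the plain iou mode, the loss 1 - IOU can be illustrated as follows. The [ymin, xmin, ymax, xmax] box layout and the function names are assumptions of this sketch; the giou, diou and ciou variants add further terms that are not shown here.

```python
def iou(box_a, box_b):
    """Intersection over union of two [ymin, xmin, ymax, xmax] boxes."""
    ymin = max(box_a[0], box_b[0])
    xmin = max(box_a[1], box_b[1])
    ymax = min(box_a[2], box_b[2])
    xmax = min(box_a[3], box_b[3])
    # Width and height of the overlap are zero when the boxes are disjoint.
    intersection = max(0.0, ymax - ymin) * max(0.0, xmax - xmin)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def iou_loss(pred_box, target_box):
    """IOU localization loss: 1 - IOU."""
    return 1.0 - iou(pred_box, target_box)


print(iou_loss([0.0, 0.0, 2.0, 2.0], [0.0, 1.0, 2.0, 3.0]))
```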

WeightedL2LocalizationLoss

L2 location loss: 0.5 * ||weight * (a - b)|| ^ 2
FieldTypeLabelDescription
anchorwise_output bool optional
DEPRECATED, do not use.
Output loss per anchor. Default: false

WeightedSigmoidClassificationLoss

Classification loss using a sigmoid function over class predictions.
FieldTypeLabelDescription
anchorwise_output bool optional
DEPRECATED, do not use.
Output loss per anchor. Default: false
label_smoothing float optional
use label smoothing in loss
please refer to label_smoothing explanation in tf.losses.sigmoid_cross_entropy Default: 0

WeightedSmoothL1LocalizationLoss

SmoothL1 (Huber) location loss.
The smooth L1_loss is defined elementwise as 0.5 x^2 if |x| <= delta and
delta * (|x| - 0.5 * delta) otherwise, where x is the difference between
predictions and target.
FieldTypeLabelDescription
anchorwise_output bool optional
DEPRECATED, do not use.
Output loss per anchor. Default: false
delta float optional
Delta value for huber loss. Default: 1
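
The effect of delta can be seen in a small sketch of the standard Huber definition: quadratic for |x| <= delta and linear beyond it, with the two pieces meeting at |x| = delta. The function name is hypothetical and the per-anchor weighting is left out.

```python
def smooth_l1(x, delta=1.0):
    """Elementwise Huber (smooth L1) loss for a single difference x."""
    abs_x = abs(x)
    if abs_x <= delta:
        # Quadratic region near zero.
        return 0.5 * x * x
    # Linear region; equals 0.5 * delta**2 at abs_x == delta.
    return delta * (abs_x - 0.5 * delta)


for diff in (0.5, 1.0, 3.0):
    print(diff, smooth_l1(diff))
```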

WeightedSoftmaxClassificationAgainstLogitsLoss

Classification loss using a softmax function over class predictions and
a softmax function over the groundtruth labels (assumed to be logits).
FieldTypeLabelDescription
anchorwise_output bool optional
DEPRECATED, do not use. Default: false
logit_scale float optional
Scale and softmax groundtruth logits before calculating softmax
classification loss. Typically used for softmax distillation with teacher
annotations stored as logits. Default: 1

WeightedSoftmaxClassificationLoss

Classification loss using a softmax function over class predictions.
FieldTypeLabelDescription
anchorwise_output bool optional
DEPRECATED, do not use.
Output loss per anchor. Default: false
logit_scale float optional
Scale logit (input) value before calculating softmax classification loss.
Typically used for softmax distillation. Default: 1
label_smoothing float optional
use label smoothing in loss
please refer to label_smoothing explanation in tf.losses.sigmoid_cross_entropy Default: 0

HardExampleMiner.LossType

Whether to use classification losses (CLASSIFICATION), localization losses
(LOCALIZATION) or both losses (BOTH, default). In the case of BOTH, cls_loss_weight and
loc_loss_weight are used to compute weighted sum of the two losses.
NameNumberDescription
BOTH 0
CLASSIFICATION 1
LOCALIZATION 2

easy_vision/python/protos/mask_predictor.proto


MaskPredictor



        
          
FieldTypeLabelDescription
mask_rcnn_mask_predictor MaskRCNNMaskPredictor optional
 

MaskRCNNMaskPredictor



        
          
FieldTypeLabelDescription
conv_hyperparams Hyperparams optional
Hyperparameters for convolution ops used in the box predictor. 
mask_prediction_conv_depth int32 optional
The depth for the first conv2d_transpose op applied to the
image_features in the mask prediction branch. If set to 0, the value
will be set automatically based on the number of channels in the image
features and the number of classes. Default: 256
mask_height int32 optional
The height and the width of the predicted mask. Default: 15
mask_width int32 optional
 Default: 15
mask_prediction_num_conv_layers int32 optional
The number of convolutions applied to image_features in the mask prediction
branch. Default: 2
masks_are_class_agnostic bool optional
 Default: false
convolve_then_upsample_masks bool optional
Whether to apply convolutions on mask features before upsampling using
nearest neighbor resizing.
By default, mask features are resized to [`mask_height`, `mask_width`]
before applying convolutions and predicting masks. Default: false

easy_vision/python/protos/matcher.proto


Matcher

Configuration proto for the matcher to be used in the object detection
pipeline. See core/matcher.py for details.
FieldTypeLabelDescription
argmax_matcher ArgMaxMatcher optional
 
bipartite_matcher BipartiteMatcher optional
 
yolo_argmax_matcher YOLOArgMaxMatcher optional
 

easy_vision/python/protos/multi_label_classification.proto


MultiLabelClassification



        
          
FieldTypeLabelDescription
backbone Backbone required
Backbone configuration 
multi_label_classification_head MultiLabelClassificationHead required
multi-label classification head 
include_metrics_per_category bool optional
whether to display class-specific evaluation metrics Default: false

MultiLabelClassificationHead



        
          
FieldTypeLabelDescription
input_layer string repeated
input layer 
num_classes int32 required
Number of classes 
multi_label_loss_weight float optional
loss weight Default: 1
global_pooling_type string optional
global pooling type, max for max_pooling, avg for average pooling Default: max
hidden_sizes int32 repeated
extra conv layer hidden size 
loss ClassificationLoss optional
classification loss 

easy_vision/python/protos/optimizer.proto


AdamOptimizer

Configuration message for the AdamOptimizer
See: https://www.tensorflow.org/api_docs/python/tf/train/AdamOptimizer
FieldTypeLabelDescription
learning_rate LearningRate optional
 
beta1 float optional
 Default: 0.9
beta2 float optional
 Default: 0.999

ConstantLearningRate

Configuration message for a constant learning rate.
FieldTypeLabelDescription
learning_rate float optional
 Default: 0.002

CosineDecayLearningRate

Configuration message for a cosine decaying learning rate as defined in
utils/learning_schedules.py
FieldTypeLabelDescription
learning_rate_base float optional
 Default: 0.002
total_steps uint32 optional
 Default: 4000000
warmup_learning_rate float optional
 Default: 0.0002
warmup_steps uint32 optional
 Default: 10000
hold_base_rate_steps uint32 optional
 Default: 0
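
The fields above suggest a linear warmup followed by an optional hold and a cosine decay. The sketch below shows one common form of such a schedule; it is an assumption about the shape of the curve, not a copy of utils/learning_schedules.py, and the function name is hypothetical.

```python
import math


def cosine_decay_with_warmup(step, learning_rate_base=0.002,
                             total_steps=4000000,
                             warmup_learning_rate=0.0002,
                             warmup_steps=10000, hold_base_rate_steps=0):
    """Learning rate at a given step (sketch under stated assumptions)."""
    if step < warmup_steps:
        # Linear ramp from warmup_learning_rate to learning_rate_base.
        slope = (learning_rate_base - warmup_learning_rate) / warmup_steps
        return warmup_learning_rate + slope * step
    if step < warmup_steps + hold_base_rate_steps:
        # Hold the base rate before decay starts.
        return learning_rate_base
    if step >= total_steps:
        return 0.0
    decay_steps = total_steps - warmup_steps - hold_base_rate_steps
    progress = (step - warmup_steps - hold_base_rate_steps) / decay_steps
    return 0.5 * learning_rate_base * (1.0 + math.cos(math.pi * progress))


for s in (0, 5000, 10000, 2000000, 4000000):
    print(s, cosine_decay_with_warmup(s))
```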

ExponentialDecayLearningRate

Configuration message for an exponentially decaying learning rate.
See https://www.tensorflow.org/versions/master/api_docs/python/train/decaying_the_learning_rate#exponential_decay
FieldTypeLabelDescription
initial_learning_rate float optional
 Default: 0.002
decay_steps uint32 optional
 Default: 4000000
decay_factor float optional
 Default: 0.95
staircase bool optional
 Default: true
burnin_learning_rate float optional
 Default: 0
burnin_steps uint32 optional
 Default: 0
min_learning_rate float optional
 Default: 0
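
A rough sketch of how these fields might combine is given below. The decay term follows the usual exponential decay rule, initial_learning_rate * decay_factor ^ (step / decay_steps), with integer division when staircase is true. The handling of burnin_learning_rate (constant during burn-in, falling back to the initial rate when 0) and of min_learning_rate (a lower bound) is an assumption, and the function name is hypothetical.

```python
def exponential_decay(step, initial_learning_rate=0.002, decay_steps=4000000,
                      decay_factor=0.95, staircase=True,
                      burnin_learning_rate=0.0, burnin_steps=0,
                      min_learning_rate=0.0):
    """Learning rate at a given step (sketch under stated assumptions)."""
    if step < burnin_steps:
        # Assumption: a burn-in rate of 0 falls back to the initial rate.
        return burnin_learning_rate or initial_learning_rate
    # Staircase decays in discrete jumps every decay_steps steps.
    exponent = step // decay_steps if staircase else step / decay_steps
    rate = initial_learning_rate * decay_factor ** exponent
    # Assumption: min_learning_rate acts as a floor.
    return max(rate, min_learning_rate)


for s in (0, 4000000, 8000000):
    print(s, exponential_decay(s))
```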

LearningRate

Configuration message for optimizer learning rate.
FieldTypeLabelDescription
constant_learning_rate ConstantLearningRate optional
 
exponential_decay_learning_rate ExponentialDecayLearningRate optional
 
manual_step_learning_rate ManualStepLearningRate optional
 
cosine_decay_learning_rate CosineDecayLearningRate optional
 
poly_decay_learning_rate PolyDecayLearningRate optional
 
transformer_learning_rate TransformerLearningRate optional
 

ManualStepLearningRate

Configuration message for a manually defined learning rate schedule.
FieldTypeLabelDescription
initial_learning_rate float optional
 Default: 0.002
schedule ManualStepLearningRate.LearningRateSchedule repeated
 
warmup bool optional
Whether to linearly interpolate learning rates for steps in
[0, schedule[0].step]. Default: false

ManualStepLearningRate.LearningRateSchedule



        
          
FieldTypeLabelDescription
step uint32 optional
 
learning_rate float optional
 Default: 0.002

MomentumOptimizer

Configuration message for the MomentumOptimizer
See: https://www.tensorflow.org/api_docs/python/tf/train/MomentumOptimizer
FieldTypeLabelDescription
learning_rate LearningRate optional
 
momentum_optimizer_value float optional
 Default: 0.9

Optimizer

Top level optimizer message.
FieldTypeLabelDescription
rms_prop_optimizer RMSPropOptimizer optional
 
momentum_optimizer MomentumOptimizer optional
 
adam_optimizer AdamOptimizer optional
 
use_moving_average bool optional
 Default: false
moving_average_decay float optional
 Default: 0.9999

PolyDecayLearningRate

Configuration message for a poly decaying learning rate.
See https://www.tensorflow.org/api_docs/python/tf/train/polynomial_decay.
FieldTypeLabelDescription
learning_rate_base float required
 
total_steps int64 required
 
power float required
 
end_learning_rate float optional
 Default: 0

RMSPropOptimizer

Configuration message for the RMSPropOptimizer
See: https://www.tensorflow.org/api_docs/python/tf/train/RMSPropOptimizer
FieldTypeLabelDescription
learning_rate LearningRate optional
 
momentum_optimizer_value float optional
 Default: 0.9
decay float optional
 Default: 0.9
epsilon float optional
 Default: 1

TransformerLearningRate



        
          
FieldTypeLabelDescription
learning_rate_base float required
 
hidden_size int32 required
 
warmup_steps int32 required
 
step_scaling_rate float optional
 Default: 1

easy_vision/python/protos/param_space.proto


ParamSpace



        
          
FieldTypeLabelDescription
task_type ParamSpace.TaskType required
task type, such as CLASSIFICATION | DETECTION etc. Default: CLASSIFICATION
data_prefixs string repeated
dataset data directory example
+ data/
   + tfrecord/
       - name_train_1.tfrecord
       - name_train_1_info.json
       - name_test.tfrecord
       - name_label_map.pbtxt
       - name_char_dict
train data path prefix, e.g. data/tfrecord/name 
preference_type ParamSpace.PreferenceType optional
param space preference Default: ACCURATE
pretrained_model_dir string optional
pretrained model directory for incremental training 
space_size int32 optional
param space size Default: 1

ParamSpace.PreferenceType


        
NameNumberDescription
ACCURATE 0
FAST 1

ParamSpace.TaskType


        
NameNumberDescription
CLASSIFICATION 0
DETECTION 1
SEGMENTATION 2
INSTANCE_SEGMENTATION 3
TEXT_END2END 4
TEXT_RECOGNITION 5
TEXT_DETECTION 6

easy_vision/python/protos/pipeline.proto


CVEstimator

CVEstimator config: including train and test parameters
FieldTypeLabelDescription
train_config TrainConfig optional
train config, including optimizer, weight decay, num_steps and so on 
eval_config EvalConfig optional
 
export_config ExportConfig optional
 
train_data DatasetConfig optional
 
eval_data DatasetConfig optional
 
model_config CVModel required
cv model config 
user_resource_path string optional
in local mode user_resource_path should be
set to the directory containing customized code 
ac_config CompressConfig optional
auto-compression options. 

easy_vision/python/protos/post_processing.proto


BatchNonMaxSuppression

Configuration proto for non-max-suppression operation on a batch of
detections.
FieldTypeLabelDescription
score_threshold float optional
Scalar threshold for score used in evaluation (low scoring boxes are removed). Default: 0
predict_score_threshold float optional
Score threshold applied at prediction time. A workaround that avoids
revising the config file before exporting models. Default: 0.5
iou_threshold float optional
Scalar threshold for IOU (boxes that have high IOU overlap
with previously selected boxes are removed). Default: 0.6
max_detections_per_class int32 optional
Maximum number of detections to retain per class. Default: 100
max_total_detections int32 optional
Maximum number of detections to retain across all classes. Default: 100
class_agnostic bool optional
Whether to run NMS in class-agnostic mode. Default: false
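
As a rough illustration of how score_threshold, iou_threshold and max_detections_per_class interact, here is a minimal single-class sketch of greedy non-max suppression in plain Python. The function names, the (ymin, xmin, ymax, xmax) box layout and the strict `>` comparison against score_threshold are assumptions made for the example, not the EasyVision implementation; max_total_detections and class_agnostic are not modelled.

```python
def iou(a, b):
    """IoU of two boxes given as (ymin, xmin, ymax, xmax)."""
    ih = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iw = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ih * iw
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, score_threshold=0.0, iou_threshold=0.6, max_detections=100):
    """Greedy NMS for one class; returns kept indices, highest score first."""
    # Assumption: boxes scoring at or below score_threshold are dropped first.
    order = sorted(
        (i for i, s in enumerate(scores) if s > score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        if len(keep) >= max_detections:
            break
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

With the default iou_threshold of 0.6, two boxes that overlap with IoU 0.9 collapse to the higher-scoring one, while a disjoint box is kept.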

PostProcessing

Configuration proto for post-processing predicted boxes and
scores.
FieldTypeLabelDescription
batch_non_max_suppression BatchNonMaxSuppression optional
Non max suppression parameters. 
score_converter PostProcessing.ScoreConverter optional
Score converter to use. Default: IDENTITY
logit_scale float optional
Scale logit (input) value before conversion in post-processing step.
Typically used for softmax distillation, though can be used to scale for
other reasons. Default: 1

PostProcessing.ScoreConverter

Enum to specify how to convert the detection scores.
NameNumberDescription
IDENTITY 0
Output scores equal input scores.
SIGMOID 1
Applies a sigmoid on input scores.
SOFTMAX 2
Applies a softmax on input scores.
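
The three converters, together with logit_scale, can be sketched in a few lines of Python. The helper name `convert_scores` is hypothetical, and dividing the logits by logit_scale is an assumption (it mirrors the TensorFlow Object Detection API convention); the proto description only says the logits are scaled before conversion.

```python
import math


def convert_scores(logits, converter="IDENTITY", logit_scale=1.0):
    """Convert raw detection logits to scores for one box."""
    # Assumption: "scale" means division by logit_scale.
    scaled = [x / logit_scale for x in logits]
    if converter == "IDENTITY":
        return scaled
    if converter == "SIGMOID":
        return [1.0 / (1.0 + math.exp(-x)) for x in scaled]
    if converter == "SOFTMAX":
        m = max(scaled)  # subtract the max for numerical stability
        exps = [math.exp(x - m) for x in scaled]
        total = sum(exps)
        return [e / total for e in exps]
    raise ValueError("unknown score converter: %s" % converter)
```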

easy_vision/python/protos/predictor_eval.proto


PredictorEval

PredictorEval config: including train and test parameters
FieldTypeLabelDescription
predictor_name string required
name of predictor or predictor class path 
model_path string optional
predictor model path 
eval_config EvalConfig required
evaluator config 
eval_data DatasetConfig required
evaluation data config 

easy_vision/python/protos/preprocessor.proto


ActionDetectionPreprocessing



        
          
FieldTypeLabelDescription
length int32 optional
video length Default: 768
crop_size int32 optional
video input size Default: 112
frame_height int32 optional
resize height and width Default: 128
frame_width int32 optional
 Default: 171
means float repeated
means 
norm_values float repeated
 
is_flip bool optional
flip and random crop indicator Default: true
is_random_crop bool optional
 Default: true

CifarNetPreprocessing



        
          
FieldTypeLabelDescription
output_width int32 optional
 Default: 32
output_height int32 optional
 Default: 32
is_training bool optional
 Default: false
add_image_summaries bool optional
 Default: false

ClassificationAutoAugment

Distort classification image with Auto Augment

ClassificationCentralCrop

Central crop image
FieldTypeLabelDescription
central_crop_fraction float optional
central crop fraction Default: 0.875
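
To make central_crop_fraction concrete, the sketch below computes the crop box for a given image size. It assumes the same rounding as tf.image.central_crop (offsets truncated to integers); the function name `central_crop_box` is hypothetical and the actual EasyVision behaviour may differ.

```python
def central_crop_box(height, width, central_crop_fraction=0.875):
    """Return (offset_y, offset_x, crop_height, crop_width) of the central crop."""
    # Assumption: offsets are truncated, as in tf.image.central_crop.
    offset_y = int((height - height * central_crop_fraction) / 2)
    offset_x = int((width - width * central_crop_fraction) / 2)
    return offset_y, offset_x, height - 2 * offset_y, width - 2 * offset_x
```

Under this assumption, a 256x256 image with the default fraction of 0.875 yields a 224x224 crop starting at offset (16, 16).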

ClassificationRandomAugment

Distort classification image with Random Augment
FieldTypeLabelDescription
num_layers int32 optional
the number of augmentation transformations to apply
sequentially to an image Default: 2
magnitude int32 optional
shared magnitude across all augmentation operations Default: 10

ClassificationRandomCrop

Randomly crops the image
FieldTypeLabelDescription
min_aspect_ratio float optional
Aspect ratio bounds of cropped image. Default: 0.75
max_aspect_ratio float optional
 Default: 1.33
min_area float optional
Allowed area ratio of cropped image to original image. Default: 0.1
max_area float optional
 Default: 1

DeepLabRandomCrop



        
          
FieldTypeLabelDescription
crop_size int32 optional
 Default: 513

DeepLabRandomHorizontalFlip



        

        
      
        

DeepLabResizeImage



        
          
FieldTypeLabelDescription
new_height int32 required
 
new_width int32 required
 

EfficientNetPreprocessing



        
          
FieldTypeLabelDescription
model_name string optional
if model_name is set, output_width and output_height will use the defaults for
this model, e.g., efficientnet-b0: 224, efficientnet-b1: 240 
output_width int32 optional
 Default: 224
output_height int32 optional
 Default: 224
is_training bool optional
 Default: false
augment_name string optional
the name of the augmentation method to apply to the image.
`autoaugment` if AutoAugment is to be used
`randaugment` if RandAugment is to be used Default: randaugment
randaug_num_layers int32 optional
the number of augmentation transformations to apply
sequentially to an image in randaugment Default: 2
randaug_magnitude int32 optional
shared magnitude across all augmentation operations in randaugment Default: 10

InceptionPreprocessing



        
          
FieldTypeLabelDescription
output_width int32 optional
 Default: 224
output_height int32 optional
 Default: 224
is_training bool optional
 Default: false
add_image_summaries bool optional
 Default: false
central_crop_fraction float optional
 Default: 0.875

KineticsPreprocessing

kinetics preprocess
FieldTypeLabelDescription
sample_duration int32 optional
 Default: 16
input_c int32 optional
 Default: 3
initial_scale float optional
scale parameters Default: 1
n_scales int32 optional
 Default: 5
scale_step float optional
 Default: 0.840896428
train_crop string optional
spatial crop type Default: corner
sample_size int32 optional
sample size Default: 112
n_samples_for_each_video int32 optional
 Default: 1
is_spatial_transform bool optional
 Default: true
is_training bool optional
 Default: true

LeNetPreprocessing



        
          
FieldTypeLabelDescription
output_width int32 optional
 Default: 28
output_height int32 optional
 Default: 28
is_training bool optional
 Default: false

LetterBoxImage

Pads the short edge of the image to fit the target aspect_ratio.
FieldTypeLabelDescription
aspect_ratio float optional
target image aspect ratio 
pad_value float optional
constant value to pad 
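
The padded size implied by a target aspect ratio can be worked out as below. This is only a sketch: it assumes aspect_ratio means width / height, and the helper name `letter_box_size` is hypothetical. Where the padding is placed (centered or on one side) is not specified here and is left out.

```python
def letter_box_size(height, width, aspect_ratio):
    """Return (new_height, new_width) after padding the short edge.

    Assumption: aspect_ratio is width / height.
    """
    if width / height < aspect_ratio:
        # Image is too narrow: pad the width.
        return height, int(round(height * aspect_ratio))
    # Image is too wide (or already matches): pad the height.
    return int(round(width / aspect_ratio)), width
```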

NormalizeImage

Normalizes pixel values in an image.
For every channel in the image, moves the pixel values from the range
[original_minval, original_maxval] to [target_minval, target_maxval].
FieldTypeLabelDescription
original_minval float optional
 
original_maxval float optional
 
target_minval float optional
 Default: 0
target_maxval float optional
 Default: 1
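
The range mapping described above is a linear rescale. A minimal per-pixel sketch follows; the function name `normalize_pixel` is hypothetical.

```python
def normalize_pixel(value, original_minval, original_maxval,
                    target_minval=0.0, target_maxval=1.0):
    """Map value from [original_minval, original_maxval] to [target_minval, target_maxval]."""
    scale = (target_maxval - target_minval) / (original_maxval - original_minval)
    return (value - original_minval) * scale + target_minval
```

For example, mapping 8-bit pixel values to [-1, 1] uses original_minval=0, original_maxval=255, target_minval=-1 and target_maxval=1.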

PreprocessingStep

Message for defining a preprocessing operation on input data.
See: //third_party/tensorflow_models/core/preprocessor.py
FieldTypeLabelDescription
normalize_image NormalizeImage optional
 
random_horizontal_flip RandomHorizontalFlip optional
 
random_pixel_value_scale RandomPixelValueScale optional
 
random_image_scale RandomImageScale optional
 
random_rgb_to_gray RandomRGBtoGray optional
 
random_adjust_brightness RandomAdjustBrightness optional
 
random_adjust_contrast RandomAdjustContrast optional
 
random_adjust_hue RandomAdjustHue optional
 
random_adjust_saturation RandomAdjustSaturation optional
 
random_distort_color RandomDistortColor optional
 
random_jitter_boxes RandomJitterBoxes optional
 
random_crop_image RandomCropImage optional
 
random_pad_image RandomPadImage optional
 
random_crop_pad_image RandomCropPadImage optional
 
random_crop_to_aspect_ratio RandomCropToAspectRatio optional
 
random_black_patches RandomBlackPatches optional
 
random_resize_method RandomResizeMethod optional
 
scale_boxes_to_pixel_coordinates ScaleBoxesToPixelCoordinates optional
 
resize_image ResizeImage optional
 
resize_to_range ResizeToRange optional
 
random_resize_to_range RandomResizeToRange optional
 
subtract_channel_mean SubtractChannelMean optional
 
ssd_random_crop SSDRandomCrop optional
 
ssd_random_crop_pad SSDRandomCropPad optional
 
ssd_random_crop_fixed_aspect_ratio SSDRandomCropFixedAspectRatio optional
 
ssd_random_crop_pad_fixed_aspect_ratio SSDRandomCropPadFixedAspectRatio optional
 
random_vertical_flip RandomVerticalFlip optional
 
random_rotation90 RandomRotation90 optional
 
rgb_to_gray RGBtoGray optional
 
letter_box_image LetterBoxImage optional
 
random_resize_image RandomResizeImage optional
 
vgg_preprocessing VggPreprocessing optional
 
inception_preprocessing InceptionPreprocessing optional
 
lenet_preprocessing LeNetPreprocessing optional
 
cifarnet_preprocessing CifarNetPreprocessing optional
 
efficientnet_preprocessing EfficientNetPreprocessing optional
 
deeplab_random_crop DeepLabRandomCrop optional
 
deeplab_random_horizontal_flip DeepLabRandomHorizontalFlip optional
 
deeplab_resize_image DeepLabResizeImage optional
 
classification_random_crop ClassificationRandomCrop optional
 
classification_central_crop ClassificationCentralCrop optional
 
classification_auto_augment ClassificationAutoAugment optional
 
classification_random_augment ClassificationRandomAugment optional
 
resize_image_with_fixed_height ResizeImageWithFixedHeight optional
 
random_rotation RandomRotation optional
 
random_jitter_aspect_ratio RandomJitterAspectRatio optional
 
random_crop_text_image RandomCropTextImage optional
 
random_crop_text_region RandomCropTextRegion optional
 
random_rotate_text_region RandomRotateTextRegion optional
 
temporal_random_crop TemporalRandomCrop optional
 
temporal_center_crop TemporalCenterCrop optional
 
kinetics_preprocessing KineticsPreprocessing optional
 
action_detection_preprocessing ActionDetectionPreprocessing optional
 
video_spatial_random_crop VideoSpatialRandomCrop optional
 
video_spatial_center_crop VideoSpatialCenterCrop optional
 

RGBtoGray

Converts the RGB image to a grayscale image. This also converts the image
depth from 3 to 1, unlike RandomRGBtoGray which does not change the image
depth.

RandomAdjustBrightness

Randomly changes image brightness by up to max_delta. Image outputs will be
saturated between 0 and 1.
FieldTypeLabelDescription
max_delta float optional
 Default: 0.2

RandomAdjustContrast

Randomly scales contrast by a value between [min_delta, max_delta].
FieldTypeLabelDescription
min_delta float optional
 Default: 0.8
max_delta float optional
 Default: 1.25

RandomAdjustHue

Randomly alters hue by a value of up to max_delta.
FieldTypeLabelDescription
max_delta float optional
 Default: 0.02

RandomAdjustSaturation

Randomly changes saturation by a value between [min_delta, max_delta].
FieldTypeLabelDescription
min_delta float optional
 Default: 0.8
max_delta float optional
 Default: 1.25

RandomBlackPatches

Randomly adds black square patches to an image.
FieldTypeLabelDescription
max_black_patches int32 optional
The maximum number of black patches to add. Default: 10
probability float optional
The probability of a black patch being added to an image. Default: 0.5
size_to_image_ratio float optional
Ratio between the dimension of the black patch and the minimum dimension of
the image (patch_width = patch_height = size_to_image_ratio * min(image_height, image_width)). Default: 0.1

RandomCropImage

Randomly crops the image and bounding boxes.
FieldTypeLabelDescription
min_object_covered float optional
Cropped image must cover at least one box by this fraction. Default: 1
min_aspect_ratio float optional
Aspect ratio bounds of cropped image. Default: 0.75
max_aspect_ratio float optional
 Default: 1.33
min_area float optional
Allowed area ratio of cropped image to original image. Default: 0.1
max_area float optional
 Default: 1
overlap_thresh float optional
Minimum overlap threshold of cropped boxes to keep in new image. If the
ratio between a cropped bounding box and the original is less than this
value, it is removed from the new image. Default: 0.3
random_coef float optional
Probability of keeping the original image. Default: 0
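
The overlap_thresh rule can be illustrated with a small box filter: each box is clipped to the crop window, and it is kept only if the clipped area is at least overlap_thresh times the original area. The helper name `keep_after_crop` and the (ymin, xmin, ymax, xmax) layout are assumptions made for this sketch.

```python
def keep_after_crop(boxes, window, overlap_thresh=0.3):
    """Clip boxes to the crop window and drop those that lose too much area."""
    kept = []
    for box in boxes:
        ih = max(0.0, min(box[2], window[2]) - max(box[0], window[0]))
        iw = max(0.0, min(box[3], window[3]) - max(box[1], window[1]))
        area = (box[2] - box[0]) * (box[3] - box[1])
        # Remove the box if the surviving fraction is below overlap_thresh.
        if area > 0 and (ih * iw) / area >= overlap_thresh:
            kept.append((max(box[0], window[0]), max(box[1], window[1]),
                         min(box[2], window[2]), min(box[3], window[3])))
    return kept
```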

RandomCropPadImage

Randomly crops an image followed by a random pad.
FieldTypeLabelDescription
min_object_covered float optional
Cropping operation must cover at least one box by this fraction. Default: 1
min_aspect_ratio float optional
Aspect ratio bounds of image after cropping operation. Default: 0.75
max_aspect_ratio float optional
 Default: 1.33
min_area float optional
Allowed area ratio of image after cropping operation. Default: 0.1
max_area float optional
 Default: 1
overlap_thresh float optional
Minimum overlap threshold of cropped boxes to keep in new image. If the
ratio between a cropped bounding box and the original is less than this
value, it is removed from the new image. Default: 0.3
random_coef float optional
Probability of keeping the original image during the crop operation. Default: 0
min_padded_size_ratio float repeated
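Bounds on the padded image size, as ratios of the input image dimensions:
min_padded_size_ratio is the minimum and max_padded_size_ratio the maximum.
If the maximum is unset, double the original image dimension is used as the
upper bound. Both fields should be of length 2. 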
max_padded_size_ratio float repeated
 
pad_color float repeated
Color of the padding. If unset, will pad using average color of the input
image. This field should be of length 3. 

RandomCropTextImage

Randomly crops the text image and bounding boxes.
FieldTypeLabelDescription
min_object_covered float optional
Cropped image must cover at least one box by this fraction. Default: 1
min_aspect_ratio float optional
Aspect ratio bounds of cropped image. Default: 0.2
max_aspect_ratio float optional
 Default: 5
min_area float optional
Allowed area ratio of cropped image to original image. Default: 0.1
max_area float optional
 Default: 1
random_coef float optional
Probability of keeping the original image. Default: 0.1

RandomCropTextRegion

Randomly crops the text region.
Used for text recognition and text rectification.

RandomCropToAspectRatio

Randomly crops an image to a given aspect ratio.
FieldTypeLabelDescription
aspect_ratio float optional
Aspect ratio. Default: 1
overlap_thresh float optional
Minimum overlap threshold of cropped boxes to keep in new image. If the
ratio between a cropped bounding box and the original is less than this
value, it is removed from the new image. Default: 0.3

RandomDistortColor

Performs a random color distortion. color_ordering should be either 0 or 1.
FieldTypeLabelDescription
color_ordering int32 optional
0 means first adjust brightness then adjust saturation, 1 otherwise Default: 0
fast_mode bool optional
in fast_mode, only adjust brightness and saturation
otherwise, adjust brightness, saturation, hue, contrast Default: false

RandomHorizontalFlip

Randomly horizontally flips the image and detections 50% of the time.
FieldTypeLabelDescription
keypoint_flip_permutation int32 repeated
Specifies a mapping from the original keypoint indices to horizontally
flipped indices. This is used in the event that keypoints are specified,
in which case when the image is horizontally flipped the keypoints will
need to be permuted. E.g. for keypoints representing left_eye, right_eye,
nose_tip, mouth, left_ear, right_ear (in that order), one might specify
the keypoint_flip_permutation below:
keypoint_flip_permutation: 1
keypoint_flip_permutation: 0
keypoint_flip_permutation: 2
keypoint_flip_permutation: 3
keypoint_flip_permutation: 5
keypoint_flip_permutation: 4 
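
The effect of the permutation can be sketched as follows, assuming keypoints are stored as (y, x) pairs in normalized coordinates; the function name is hypothetical. Each x is mirrored, then output slot i takes the flipped keypoint at index keypoint_flip_permutation[i], so that "left" and "right" labels stay correct after the flip.

```python
def flip_keypoints_horizontal(keypoints, flip_permutation):
    """Mirror keypoints across the vertical axis and re-order their labels."""
    # Mirror the x coordinate (normalized coordinates assumed).
    flipped = [(y, 1.0 - x) for (y, x) in keypoints]
    # Slot i receives the flipped keypoint that now plays role i.
    return [flipped[src] for src in flip_permutation]


# left_eye, right_eye, nose_tip, mouth, left_ear, right_ear
face = [(0.25, 0.25), (0.25, 0.75), (0.5, 0.5),
        (0.75, 0.5), (0.375, 0.125), (0.375, 0.875)]
permutation = [1, 0, 2, 3, 5, 4]
```

For a perfectly symmetric face such as `face`, flipping with this permutation returns the same list, which is a quick sanity check that the permutation is right.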

RandomImageScale

Randomly enlarges or shrinks image (keeping aspect ratio).
FieldTypeLabelDescription
min_scale_ratio float optional
 Default: 0.5
max_scale_ratio float optional
 Default: 2

RandomJitterAspectRatio

Randomly changes the image aspect ratio.
FieldTypeLabelDescription
min_jitter_coef float optional
 Default: 0.8
max_jitter_coef float optional
 Default: 1.2
method RandomJitterAspectRatio.Method optional
 Default: BILINEAR

RandomJitterBoxes

Randomly jitters corners of boxes in the image determined by ratio.
E.g., if a box is [100, 200] and ratio is 0.02, the corners can move by up to [2, 4].
FieldTypeLabelDescription
ratio float optional
 Default: 0.05

RandomPadImage

Randomly adds padding to the image.
FieldTypeLabelDescription
min_height_ratio float optional
Minimum dimensions for padded image. If unset, will use original image
dimension as a lower bound. 
min_width_ratio float optional
 
max_height_ratio float optional
Maximum dimensions for padded image. If unset, will use double the original
image dimension as an upper bound. 
max_width_ratio float optional
 
pad_color float repeated
Color of the padding. If unset, will pad using average color of the input
image. 

RandomPixelValueScale

Randomly scales the values of all pixels in the image by some constant value
between [minval, maxval], then clip the value to a range between [0, 1.0].
FieldTypeLabelDescription
minval float optional
 Default: 0.9
maxval float optional
 Default: 1.1

RandomRGBtoGray

Randomly converts the entire image to grayscale.
FieldTypeLabelDescription
probability float optional
 Default: 0.1

RandomResizeImage

Randomly resizes images.
FieldTypeLabelDescription
new_heights int32 repeated
 
new_widths int32 repeated
 
method ResizeMethod.Enum optional
 Default: BILINEAR

RandomResizeMethod

Randomly resizes the image up to [target_height, target_width].
FieldTypeLabelDescription
target_height float optional
 
target_width float optional
 

RandomResizeToRange



        
          
FieldTypeLabelDescription
min_sizes int32 repeated
 
max_sizes int32 repeated
 
method ResizeMethod.Enum optional
 Default: BILINEAR

RandomRotateTextRegion

Randomly rotates the text region image counter-clockwise.
FieldTypeLabelDescription
min_angle float optional
 Default: -10
max_angle float optional
 Default: 10
rot90 bool optional
whether to randomly rotate the image by 90 degrees Default: true

RandomRotation

Randomly rotates the image and detections counter-clockwise by an angle between min_angle and max_angle degrees.
FieldTypeLabelDescription
min_angle float optional
 Default: -10
max_angle float optional
 Default: 10
use_keypoints_calc_boxes bool optional
whether to use keypoints to compute the new bounding box Default: false

RandomRotation90

Randomly rotates the image and detections by 90 degrees counter-clockwise
50% of the time.

RandomVerticalFlip

Randomly vertically flips the image and detections 50% of the time.
FieldTypeLabelDescription
keypoint_flip_permutation int32 repeated
Specifies a mapping from the original keypoint indices to vertically
flipped indices. This is used in the event that keypoints are specified,
in which case when the image is vertically flipped the keypoints will
need to be permuted. E.g. for keypoints representing left_eye, right_eye,
nose_tip, mouth, left_ear, right_ear (in that order), one might specify
the keypoint_flip_permutation below:
keypoint_flip_permutation: 1
keypoint_flip_permutation: 0
keypoint_flip_permutation: 2
keypoint_flip_permutation: 3
keypoint_flip_permutation: 5
keypoint_flip_permutation: 4 

ResizeImage

Resizes images to [new_height, new_width].
FieldTypeLabelDescription
new_height int32 optional
 
new_width int32 optional
 
method ResizeMethod.Enum optional
 Default: BILINEAR

ResizeImageWithFixedHeight

Resizes images to a fixed new_height, keeping the aspect ratio.
FieldTypeLabelDescription
new_height int32 optional
 
method ResizeImageWithFixedHeight.Method optional
 Default: BILINEAR

ResizeToRange



        
          
FieldTypeLabelDescription
min_size int32 required
 
max_size int32 required
 
method ResizeMethod.Enum optional
 Default: BILINEAR

SSDRandomCrop

Randomly crops an image according to:
Liu et al., SSD: Single shot multibox detector.
This preprocessing step defines multiple SSDRandomCropOperations. Only one
operation (chosen at random) is actually performed on an image.
FieldTypeLabelDescription
operations SSDRandomCropOperation repeated
 

SSDRandomCropFixedAspectRatio

Randomly crops an image to a fixed aspect ratio according to:
Liu et al., SSD: Single shot multibox detector.
Multiple SSDRandomCropFixedAspectRatioOperations are defined by this
preprocessing step. Only one operation (chosen at random) is actually
performed on an image.
FieldTypeLabelDescription
operations SSDRandomCropFixedAspectRatioOperation repeated
 
aspect_ratio float optional
Aspect ratio to crop to. This value is used for all crop operations. Default: 1

SSDRandomCropFixedAspectRatioOperation



        
          
FieldTypeLabelDescription
min_object_covered float optional
Cropped image must cover at least this fraction of one original bounding
box. 
min_area float optional
The area of the cropped image must be within the range of
[min_area, max_area]. 
max_area float optional
 
overlap_thresh float optional
Cropped box area ratio must be above this threshold to be kept. 
random_coef float optional
Probability a crop operation is skipped. 

SSDRandomCropOperation



        
          
FieldTypeLabelDescription
min_object_covered float optional
Cropped image must cover at least this fraction of one original bounding
box. 
min_aspect_ratio float optional
The aspect ratio of the cropped image must be within the range of
[min_aspect_ratio, max_aspect_ratio]. 
max_aspect_ratio float optional
 
min_area float optional
The area of the cropped image must be within the range of
[min_area, max_area]. 
max_area float optional
 
overlap_thresh float optional
Cropped box area ratio must be above this threshold to be kept. 
random_coef float optional
Probability a crop operation is skipped. 

SSDRandomCropPad

Randomly crops and pads an image according to:
Liu et al., SSD: Single shot multibox detector.
This preprocessing step defines multiple SSDRandomCropPadOperations. Only one
operation (chosen at random) is actually performed on an image.
FieldTypeLabelDescription
operations SSDRandomCropPadOperation repeated
 

SSDRandomCropPadFixedAspectRatio

Randomly crops and pads an image to a fixed aspect ratio according to:
Liu et al., SSD: Single shot multibox detector.
Multiple SSDRandomCropPadFixedAspectRatioOperations are defined by this
preprocessing step. Only one operation (chosen at random) is actually
performed on an image.
FieldTypeLabelDescription
operations SSDRandomCropPadFixedAspectRatioOperation repeated
 
aspect_ratio float optional
Aspect ratio to pad to. This value is used for all crop and pad operations. Default: 1
min_padded_size_ratio float repeated
Min ratio of padded image height and width to the input image's height and
width. Two entries per operation. 
max_padded_size_ratio float repeated
Max ratio of padded image height and width to the input image's height and
width. Two entries per operation. 

SSDRandomCropPadFixedAspectRatioOperation



        
          
FieldTypeLabelDescription
min_object_covered float optional
Cropped image must cover at least this fraction of one original bounding
box. 
min_aspect_ratio float optional
The aspect ratio of the cropped image must be within the range of
[min_aspect_ratio, max_aspect_ratio]. 
max_aspect_ratio float optional
 
min_area float optional
The area of the cropped image must be within the range of
[min_area, max_area]. 
max_area float optional
 
overlap_thresh float optional
Cropped box area ratio must be above this threshold to be kept. 
random_coef float optional
Probability a crop operation is skipped. 

SSDRandomCropPadOperation



        
          
FieldTypeLabelDescription
min_object_covered float optional
Cropped image must cover at least this fraction of one original bounding
box. 
min_aspect_ratio float optional
The aspect ratio of the cropped image must be within the range of
[min_aspect_ratio, max_aspect_ratio]. 
max_aspect_ratio float optional
 
min_area float optional
The area of the cropped image must be within the range of
[min_area, max_area]. 
max_area float optional
 
overlap_thresh float optional
Cropped box area ratio must be above this threshold to be kept. 
random_coef float optional
Probability a crop operation is skipped. 
min_padded_size_ratio float repeated
Min ratio of padded image height and width to the input image's height and
width. Two entries per operation. 
max_padded_size_ratio float repeated
Max ratio of padded image height and width to the input image's height and
width. Two entries per operation. 
pad_color_r float optional
Padding color. 
pad_color_g float optional
 
pad_color_b float optional
 

ScaleBoxesToPixelCoordinates

Scales boxes from normalized coordinates to pixel coordinates.

SubtractChannelMean

Normalizes an image by subtracting a mean from each channel.
FieldTypeLabelDescription
means float repeated
The mean to subtract from each channel. Should have one entry per channel
of the input image. 

TemporalCenterCrop

temporal center crop
FieldTypeLabelDescription
sample_duration int32 optional
crop length Default: 16
sample_stride int32 optional
downsampling stride after temporal crop
output length is sample_duration/sample_stride Default: 1
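
The relation between sample_duration, sample_stride and the output length can be sketched on frame indices. The function name `temporal_center_crop` is hypothetical, and how clips shorter than sample_duration are handled (padding, looping) is not covered; this sketch simply returns the frames that exist.

```python
def temporal_center_crop(num_frames, sample_duration=16, sample_stride=1):
    """Return the frame indices kept by a centered temporal crop."""
    # Take a window of sample_duration frames around the middle of the clip.
    start = max(0, (num_frames - sample_duration) // 2)
    end = min(num_frames, start + sample_duration)
    # Downsample the window; length is sample_duration / sample_stride.
    return list(range(start, end, sample_stride))
```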

TemporalRandomCrop

temporal random crop
FieldTypeLabelDescription
sample_duration int32 optional
crop length Default: 16
sample_stride int32 optional
downsampling stride after temporal crop
output length is sample_duration/sample_stride Default: 1

VggPreprocessing

For classification models, a fixed preprocessing pipeline is provided for each
backbone. To add more preprocessing steps, use the ones above by adding them
to data_augmentation_options.
FieldTypeLabelDescription
output_width int32 optional
 Default: 224
output_height int32 optional
 Default: 224
is_training bool optional
 Default: false
resize_side_min int32 optional
 Default: 256
resize_side_max int32 optional
 Default: 512

VideoSpatialCenterCrop



        
          
FieldTypeLabelDescription
crop_size int32 repeated
crop size 

VideoSpatialRandomCrop



        
          
FieldTypeLabelDescription
crop_size int32 repeated
crop size 

RandomJitterAspectRatio.Method


        
NameNumberDescription
AREA 1
BICUBIC 2
BILINEAR 3
NEAREST_NEIGHBOR 4

ResizeImageWithFixedHeight.Method


        
NameNumberDescription
AREA 1
BICUBIC 2
BILINEAR 3
NEAREST_NEIGHBOR 4

easy_vision/python/protos/rc3d.proto


RC3D

Configuration for RegionProposal models. Only objectness is predicted;
multiclass is not supported.
FieldTypeLabelDescription
backbone Backbone required
backbone config 
trpn_head TRPNHead required
rpn head config 
region_feature_extractor Block optional
block reuse part of backbone to extract box feature in second stage 
trcnn_head TRCNNHead required
rcnn head config 

easy_vision/python/protos/rcnn_head.proto


MRCNNHead



        
          
FieldTypeLabelDescription
input_layer string repeated
 
num_classes int32 required
 
initial_crop_size int32 optional
Output size (width and height are set to be the same) of the initial
bilinear interpolation based cropping during ROI pooling. 
maxpool_kernel_size int32 optional
Kernel size of the max pool op on the cropped feature map during
ROI pooling. 
maxpool_stride int32 optional
Stride of the max pool op on the cropped feature map during ROI pooling. 
third_stage_mask_predictor MaskPredictor optional
Hyperparameters for the third stage mask predictor. 
second_stage_mask_loss_weight float optional
Second stage instance mask loss weight. Default: 1

RCNNHead



        
          
FieldTypeLabelDescription
input_layer string repeated
 
num_classes int32 required
 
initial_crop_size int32 optional
Output size (width and height are set to be the same) of the initial
bilinear interpolation based cropping during ROI pooling. 
maxpool_kernel_size int32 optional
Kernel size of the max pool op on the cropped feature map during
ROI pooling. 
maxpool_stride int32 optional
Stride of the max pool op on the cropped feature map during ROI pooling. 
second_stage_box_predictor BoxPredictor optional
Hyperparameters for the second stage box predictor. If box predictor type
is set to rfcn_box_predictor, an R-FCN model is constructed, otherwise a
Faster R-CNN model is constructed. 
nms_config BatchNonMaxSuppression required
 
second_stage_batch_size int32 optional
The batch size per image used for computing the classification and refined
location loss of the box classifier.
Note that this field is ignored if `hard_example_miner` is configured. Default: 128
second_stage_balance_fraction float optional
Fraction of positive examples to use per image for the box classifier. Default: 0.25
hard_example_miner HardExampleMiner optional
 
second_stage_localization_loss_weight float optional
Second stage RCNN localization loss weight Default: 1
second_stage_classification_loss_weight float optional
Second stage RCNN classification loss weight Default: 1
output_roi_features bool optional
Whether to output detection ROI features. Default: false
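
How second_stage_batch_size and second_stage_balance_fraction combine can be sketched as a balanced sampler: positives are capped at batch_size * fraction, and the remainder of the batch is filled with negatives. The function name and the fill-with-negatives behaviour are assumptions (the latter follows the usual Faster R-CNN practice), not a description of the EasyVision code.

```python
import random


def sample_minibatch(labels, batch_size=128, positive_fraction=0.25, rng=None):
    """Pick proposal indices for the box-classifier loss of one image.

    labels: sequence of booleans, True for positive proposals.
    Returns (positive_indices, negative_indices).
    """
    rng = rng or random.Random()
    positives = [i for i, is_pos in enumerate(labels) if is_pos]
    negatives = [i for i, is_pos in enumerate(labels) if not is_pos]
    # Cap positives at the requested fraction of the batch.
    num_pos = min(len(positives), int(batch_size * positive_fraction))
    # Assumption: the rest of the batch is filled with negatives.
    num_neg = min(len(negatives), batch_size - num_pos)
    return rng.sample(positives, num_pos), rng.sample(negatives, num_neg)
```

With the defaults, an image with plenty of positives contributes 32 positives and 96 negatives; with only 10 positives it contributes 10 positives and 118 negatives.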

easy_vision/python/protos/region_similarity_calculator.proto


IoaSimilarity

Configuration for intersection-over-area (IOA) similarity calculator.

IouSimilarity

Configuration for intersection-over-union (IOU) similarity calculator.

NegSqDistSimilarity

Configuration for negative squared distance similarity calculator.
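
The three similarity measures can be written out for a single pair of boxes as below. The (ymin, xmin, ymax, xmax) layout, the function names, the choice of the second box's area as the IOA denominator, and computing the squared distance over the four box coordinates are all assumptions for this sketch; see core/region_similarity_calculator.py for the actual definitions.

```python
def area(box):
    return (box[2] - box[0]) * (box[3] - box[1])


def intersection(a, b):
    ih = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iw = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    return ih * iw


def iou(a, b):
    """Intersection over union."""
    inter = intersection(a, b)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def ioa(a, b):
    """Intersection over area; assumed here to be the area of the second box."""
    return intersection(a, b) / area(b) if area(b) > 0 else 0.0


def neg_sq_dist(a, b):
    """Negative squared Euclidean distance between coordinate vectors."""
    return -sum((p - q) ** 2 for p, q in zip(a, b))
```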

RegionSimilarityCalculator

Configuration proto for region similarity calculators. See
core/region_similarity_calculator.py for details.
FieldTypeLabelDescription
neg_sq_dist_similarity NegSqDistSimilarity optional
 
iou_similarity IouSimilarity optional
 
ioa_similarity IoaSimilarity optional
 

easy_vision/python/protos/resize_method.proto


ResizeMethod

Enumeration type for image resizing methods provided in TensorFlow.

ResizeMethod.Enum


        
NameNumberDescription
AREA 1
Corresponds to tf.image.ResizeMethod.AREA
BICUBIC 2
Corresponds to tf.image.ResizeMethod.BICUBIC
BILINEAR 3
Corresponds to tf.image.ResizeMethod.BILINEAR
NEAREST_NEIGHBOR 4
Corresponds to tf.image.ResizeMethod.NEAREST_NEIGHBOR

easy_vision/python/protos/rnn.proto


BasicLSTM



        
          
FieldTypeLabelDescription
num_units int32 optional
Hidden unit size. Default: 256
forget_bias float optional
Forget bias for BasicLSTMCell. Default: 1
dropout float optional
Dropout rate (not keep_prob) Default: 0.2

GRU



        
          
FieldTypeLabelDescription
num_units int32 optional
Hidden unit size. Default: 256
dropout float optional
Dropout rate (not keep_prob) Default: 0.2

LayerNormBasicLSTM



        
          
FieldTypeLabelDescription
num_units int32 optional
Hidden unit size. Default: 256
forget_bias float optional
Forget bias for BasicLSTMCell. Default: 1
dropout float optional
Dropout rate (not keep_prob) Default: 0.2

NAS



        
          
FieldTypeLabelDescription
num_units int32 optional
Hidden unit size. Default: 256
dropout float optional
Dropout rate (not keep_prob) Default: 0.2

easy_vision/python/protos/rpn_head.proto


RPNHead



        
          
FieldTypeLabelDescription
input_layer string repeated
 
box_predictor BoxPredictor required
 
first_stage_minibatch_size int32 optional
The batch size to use for computing the first stage objectness and
location losses. Default: 256
first_stage_positive_balance_fraction float optional
Fraction of positive examples per image for the RPN. Default: 0.5
first_stage_nms_score_threshold float optional
Non max suppression score threshold applied to first stage RPN proposals. Default: 0
first_stage_nms_iou_threshold float optional
Non max suppression IOU threshold applied to first stage RPN proposals. Default: 0.7
first_stage_max_proposals int32 optional
Maximum number of RPN proposals retained after first stage postprocessing. Default: 300
first_stage_anchor_generator AnchorGenerator optional
Anchor generator to compute RPN anchors. 
first_stage_localization_loss_weight float optional
First stage RPN localization loss weight. Default: 1
first_stage_objectness_loss_weight float optional
First stage RPN objectness loss weight. Default: 1
rpn_min_size int32 optional
At the postprocessing stage, filters the RPN output boxes and drops every box whose width or height is less than rpn_min_size. Default: 0
pre_nms_topn int32 optional
Number of top-scoring proposals kept before NMS; only takes effect when larger than 0.
It should be set large enough, otherwise
it may hurt performance. Default: -1
boundary_threshold int32 optional
remove rpn anchors that go outside the image by boundary_threshold pixels
set to -1 or a large value, e.g. 100000, to disable pruning anchors Default: 0
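
The rpn_min_size and pre_nms_topn descriptions above can be illustrated with a small sketch. This is not the easy_vision implementation: the function name filter_proposals, the (ymin, xmin, ymax, xmax) box layout, and the order of the two steps (size filter first, then top-N by score) are assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical box layout: (ymin, xmin, ymax, xmax) in pixels.
Box = Tuple[float, float, float, float]


def filter_proposals(
    boxes: List[Box],
    scores: List[float],
    rpn_min_size: int = 0,
    pre_nms_topn: int = -1,
) -> List[Tuple[Box, float]]:
    """Drop small boxes, then optionally keep only the best-scoring ones."""
    kept = []
    for box, score in zip(boxes, scores):
        height = box[2] - box[0]
        width = box[3] - box[1]
        # A box is dropped when its width or its height is below rpn_min_size.
        if height >= rpn_min_size and width >= rpn_min_size:
            kept.append((box, score))
    # Sort by score, highest first, so truncation keeps the best proposals.
    kept.sort(key=lambda pair: pair[1], reverse=True)
    # pre_nms_topn only takes effect when it is greater than 0.
    if pre_nms_topn > 0:
        kept = kept[:pre_nms_topn]
    return kept
```

With the defaults (rpn_min_size=0, pre_nms_topn=-1) the sketch keeps every proposal and only sorts them, which mirrors the documented defaults of "no size filter" and "no pre-NMS truncation".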

easy_vision/python/protos/seg_decode_head.proto

SegDecoderHead

Field Type Label Description
weight_decay float optional
 Default: 0
batchnorm_trainable bool optional
 Default: true
input_layer string repeated
 
use_separable_conv bool optional
 Default: true
decoder_depth int32 required
 
output_stride int32 required
 Default: 4
num_classes int32 required
 Default: 5
resize_to_original bool optional
Whether to convert the predictions back to the original shape
during postprocessing. Default: true

easy_vision/python/protos/simple_rpn.proto

SimpleRPN

Configuration for region proposal models. Only objectness is predicted;
multiclass prediction is not supported.
Field Type Label Description
backbone Backbone required
backbone config 
rpn_head RPNHead required
rpn head config 
first_stage_localization_loss_weight float optional
First stage RPN localization loss weight. Default: 1
first_stage_objectness_loss_weight float optional
First stage RPN objectness loss weight. Default: 1

easy_vision/python/protos/ssd.proto

FPNFeaturemapLayout

Field Type Label Description
from_layer string repeated
Layers from which to construct the FPN feature map.
layer_depth int32 optional
Layer depth for all the FPN features. Default: 256
extra_conv_layers int32 optional
Number of layers appended after the pyramid features. Default: 0

PPNFeaturemapLayout

Field Type Label Description
from_layer string repeated
Layers from which to construct the PPN feature map.
num_layers int32 optional
 Default: 6
layer_depth int32 optional
Layer depth for all the PPN features. Default: 1024

Ssd

Configuration for Single Shot Detection (SSD) models.
Field Type Label Description
normalize_method Ssd.NormalizeMethod optional
Method used to normalize the resized image before it is fed into the backbone. Default: SUBMEAN
backbone Backbone required
Backbone configuration 
ssd_head SsdHead required
SSD head configuration 
freeze_batchnorm bool optional
Whether to update batch norm parameters during training or not.
When training with a relatively small batch size (e.g. 1), it is
desirable to disable batch norm update and use pretrained batch norm
params.

Note: Some feature extractors are used with canned arg_scopes
(e.g. resnet arg scopes). In these cases training behavior of batch norm
variables may depend on both values of `batch_norm_trainable` and
`is_training`.

When canned arg_scopes are used with feature extractors `conv_hyperparams`
will apply only to the additional layers that are added and are outside the
canned arg_scope. Default: false
inplace_batchnorm_update bool optional
Whether to update batch_norm inplace during training. This is required
for batch norm to work correctly on TPUs. When this is false, user must add
a control dependency on tf.GraphKeys.UPDATE_OPS for train/loss op in order
to update the batch norm moving average parameters. Default: false

SsdFeaturemapLayout

Field Type Label Description
from_layer string repeated
Layers from which to construct the multi-scale feature map;
the number of entries must equal the number of entries in layer_depth.
layer_depth int32 repeated
Depth of each feature map layer.

SsdHead

Field Type Label Description
num_classes int32 required
Number of classes to predict. 
ssd_featuremap_layout SsdFeaturemapLayout optional
Multi-scale feature map used in the original SSD paper (https://arxiv.org/abs/1512.02325).
fpn_featuremap_layout FPNFeaturemapLayout optional
Use a feature pyramid network (https://arxiv.org/abs/1612.03144)
to extract multi-scale features.
ppn_featuremap_layout PPNFeaturemapLayout optional
Use a pooling pyramid network (https://arxiv.org/abs/1807.03284)
to extract multi-scale features.
depth_multiplier float optional
The factor to alter the depth of the channels in the multi-scale feature extraction. Default: 1
min_depth int32 optional
Minimum number of the channels in the multi-scale feature extraction. Default: 16
conv_hyperparams Hyperparams optional
Hyperparameters that affect the layers of feature extractor added on top
of the base feature extractor. 
box_coder BoxCoder optional
Box coder to encode the boxes. 
matcher Matcher optional
Matcher to match groundtruth with anchors. 
similarity_calculator RegionSimilarityCalculator optional
Region similarity calculator to compute similarity of boxes. 
anchor_generator AnchorGenerator optional
Anchor generator to compute anchors. 
box_predictor BoxPredictor optional
Box predictor to attach to the features. 
post_processing PostProcessing optional
Post processing to apply on the predictions. 
negative_class_weight float optional
Classification weight associated with negative
anchors. The weight must be in [0., 1.]. Default: 1
normalize_loss_by_num_matches bool optional
Whether to normalize the loss by number of groundtruth boxes that match to
the anchors. Default: true
normalize_loc_loss_by_codesize bool optional
Whether to normalize the localization loss by the code size of the box
encodings. This is applied along with other normalization factors. Default: false
loss Loss optional
Loss configuration for training. 
add_summary bool optional
Whether to add summaries of training-related information. Default: true

Ssd.NormalizeMethod

Name Number Description
SUBMEAN 0
DIVIDE_255 1
DIVIDE_255_MULTIPLY_2_MINUS_1 2
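
The enum carries no descriptions, so the sketch below only spells out the arithmetic that the three names suggest. It is an assumption, not the documented behavior: the function normalize_pixel and the mean argument are hypothetical, and the actual mean values used for SUBMEAN are not given in this document.

```python
def normalize_pixel(value: float, method: str, mean: float = 0.0) -> float:
    """Apply one of the three normalization schemes to a single pixel value.

    `mean` is a hypothetical per-channel mean; the source does not state
    which mean values SUBMEAN subtracts.
    """
    if method == "SUBMEAN":
        # Subtract a mean, leaving values in the original 0..255 scale.
        return value - mean
    if method == "DIVIDE_255":
        # Scale 0..255 to 0..1.
        return value / 255.0
    if method == "DIVIDE_255_MULTIPLY_2_MINUS_1":
        # Scale 0..255 to -1..1.
        return value / 255.0 * 2.0 - 1.0
    raise ValueError("unknown normalize method: %s" % method)
```

Under this reading, DIVIDE_255 maps pixel values into [0, 1] and DIVIDE_255_MULTIPLY_2_MINUS_1 maps them into [-1, 1].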

easy_vision/python/protos/string_int_label_map.proto

Message to store the mapping from class label strings to class id. Datasets
use string labels to represent classes while the object detection framework
works with class ids. This message maps them so they can be converted back
and forth as needed.

StringIntLabelMap

Field Type Label Description
item StringIntLabelMapItem repeated
 

StringIntLabelMapItem

Field Type Label Description
name string optional
String name. The most common practice is to set this to a MID or synset
id. 
id int32 optional
Integer id that maps to the string name above. Label ids should start from
1. 
display_name string optional
Human readable string label. 
ignore_recog bool optional
Whether recognition is ignored for this label. Default: false
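
As a rough illustration of how a label map is used, the sketch below mirrors StringIntLabelMapItem entries as plain Python dicts and builds the two lookup tables needed to convert between string labels and class ids. The helper name build_label_maps, the dict representation, and the fallback from display_name to name are assumptions; the real message would be parsed with the generated protobuf classes.

```python
from typing import Dict, List, Tuple


def build_label_maps(items: List[dict]) -> Tuple[Dict[str, int], Dict[int, str]]:
    """Build name -> id and id -> display name tables from label map items."""
    name_to_id: Dict[str, int] = {}
    id_to_display: Dict[int, str] = {}
    for item in items:
        label_id = item["id"]
        # The documentation says label ids should start from 1.
        if label_id < 1:
            raise ValueError("label ids should start from 1, got %d" % label_id)
        name_to_id[item["name"]] = label_id
        # Assumption: fall back to the string name when display_name is unset.
        id_to_display[label_id] = item.get("display_name", item["name"])
    return name_to_id, id_to_display


# Hypothetical items, shaped like StringIntLabelMapItem messages.
example_items = [
    {"name": "n02084071", "id": 1, "display_name": "dog"},
    {"name": "n02121808", "id": 2},
]
name_to_id, id_to_display = build_label_maps(example_items)
```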

easy_vision/python/protos/text_encoder.proto

CNNLineEncoder

Field Type Label Description
cnn_name string optional
CNN class name; if not specified, the encoder degenerates to a LineEncoder.
input_layer string optional
Name of the CNN output feature; defaults to the last layer of the CNN.
norm_type NormType optional
normalization layer type Default: BATCH
batchnorm_trainable bool optional
batchnorm trainable or not Default: true
weight_decay float optional
weight_decay for l2 regularization Default: 0.0001

CNNSpatialEncoder

Field Type Label Description
cnn_name string optional
CNN class name; if not specified, the encoder degenerates to a SpatialEncoder.
input_layer string optional
Name of the CNN output feature; defaults to the last layer of the CNN.
norm_type NormType optional
normalization layer type Default: BATCH
batchnorm_trainable bool optional
batchnorm trainable or not Default: true
weight_decay float optional
weight_decay for l2 regularization Default: 0.0001

CRNNEncoder

Field Type Label Description
cnn_name string optional
CNN class name; if not specified, the encoder degenerates to an RNNEncoder.
input_layer string optional
Name of the CNN output feature; defaults to the last layer of the CNN.
norm_type NormType optional
normalization layer type Default: BATCH
batchnorm_trainable bool optional
batchnorm trainable or not Default: true
weight_decay float optional
weight_decay for l2 regularization Default: 0.0001
num_layers int32 optional
rnn encoder depth Default: 2
basic_lstm BasicLSTM optional
 
gru GRU optional
 
layer_norm_basic_lstm LayerNormBasicLSTM optional
 
nas NAS optional
 
encoder_type CRNNEncoder.RnnEncoderType optional
Encoder type, UNI or BI. For BI, num_encoder_layers/2 bi-directional layers are built. Default: UNI
residual bool optional
whether to add residual connections Default: true

TransformerEncoder

Field Type Label Description
num_layers int32 required
number of encoder layers 
hidden_size int32 required
hidden units size 
num_heads int32 required
number of attention heads 
filter_size int32 required
hidden size of FeedForwardLayer 
pooling_rate int32 optional
pooling rate of input's width Default: 4
layer_postprocess_dropout float optional
postprocess layer dropout Default: 0.1
attention_dropout float optional
attention layer dropout Default: 0.1
relu_dropout float optional
relu layer dropout Default: 0.1

CRNNEncoder.RnnEncoderType

Name Number Description
UNI 1
BI 2

easy_vision/python/protos/text_end2end.proto

FixedHeightFeatureGather

Field Type Label Description
input_layer string required
 
height int32 required
feature output with fixed height Default: 8
max_width int32 required
feature filtered with max width Default: 300
visualize_height int32 required
roi visualize images with fixed height Default: 32
visualize_width int32 required
roi visualize images with fixed width Default: 100
num_buckets int32 optional
number of buckets Default: 1
subsample_batch_size int32 optional
batch size of sampled text line when training 

FixedHeightPyramidFeatureGather

Field Type Label Description
input_layer string repeated
Pyramid input features.
height int32 repeated
Pyramid ROI feature heights. The length of height must equal the length
of input_layer; the last value of height is the output height.
max_width int32 required
feature filtered with max width Default: 300
visualize_height int32 required
roi visualize images with fixed height Default: 32
visualize_width int32 required
roi visualize images with fixed width Default: 100
num_buckets int32 optional
number of buckets Default: 1
subsample_batch_size int32 optional
batch size of sampled text line when training 
norm_type NormType optional
normalization layer type Default: BATCH
batchnorm_trainable bool optional
batchnorm trainable or not Default: true
weight_decay float optional
weight_decay for l2 regularization Default: 0.0001
use_se bool optional
use squeeze and excitation layer or not Default: true

FixedSizeFeatureGather

Field Type Label Description
input_layer string required
 
height int32 optional
feature output with fixed height Default: 8
width int32 optional
feature output with fixed width Default: 25
visualize_height int32 optional
roi visualize images with fixed height Default: 32
visualize_width int32 optional
roi visualize images with fixed width Default: 100
subsample_batch_size int32 optional
batch size of sampled text line when training 

TextEnd2End

Field Type Label Description
backbone Backbone required
backbone config 
fpn FPN optional
FPN 
rpn_head RPNHead optional
rpn head config 
rcnn_head RCNNHead optional
rcnn head config 
fcn_head RCNNHead optional
fcn head config 
fixed_size_feature_gather FixedSizeFeatureGather optional
 
fixed_height_feature_gather FixedHeightFeatureGather optional
 
fixed_height_pyramid_feature_gather FixedHeightPyramidFeatureGather optional
 
keypoint_head TextKeypointHead optional
stn / feature alignment 
attention_head TextAttentionHead optional
attention head config 
ctc_head TextCTCHead optional
ctc head config 
max_inference_num int32 optional
 Default: -1

easy_vision/python/protos/text_head.proto

TextAttentionHead

Field Type Label Description
input_layer string optional
input layer 
crnn_encoder CRNNEncoder optional
 
cnn_line_encoder CNNLineEncoder optional
 
cnn_spatial_encoder CNNSpatialEncoder optional
 
attention_decoder RNNDecoderWithAttention required
rnn attention decoder 
time_major bool optional
Whether to use time-major mode.
If time-major, features must be laid out as [time, batch, ...]. Default: true

TextCTCHead

Field Type Label Description
input_layer string optional
input layer 
crnn_encoder CRNNEncoder optional
 
cnn_line_encoder CNNLineEncoder optional
 
cnn_spatial_encoder CNNSpatialEncoder optional
 
ctc_decoder FullyConnectedCTCDecoder required
ctc decoder 
time_major bool optional
Whether to use time-major mode.
If time-major, features must be laid out as [time, batch, ...]. Default: true

TextKeypointHead

Field Type Label Description
input_layer string repeated
input layer 
keypoint_predictor KeypointPredictor required
keypoints predictor name 
initial_crop_size int32 required
Output size (width and height are set to be the same) of the initial
bilinear interpolation based cropping during ROI pooling. 
maxpool_kernel_size int32 required
Kernel size of the max pool op on the cropped feature map during
ROI pooling. 
maxpool_stride int32 required
Stride of the max pool op on the cropped feature map during ROI pooling. 
num_keypoints int32 optional
number of key points Default: 4
predict_direction bool optional
predict text direction or not Default: false
direction_trainable bool optional
train text direction predictor or not Default: false
unified_direction bool optional
Unify the direction of all texts during inference or evaluation. Default: false
smart_unified_direction bool optional
Unify the direction of almost all texts (except those with height > 2 * width)
during inference or evaluation. Default: false
third_stage_batch_size int32 optional
The batch size per image used for computing the classification and refined
location loss of the box classifier. Default: 128

TextRectificationHead

Field Type Label Description
input_layer string optional
input layer 
keypoint_predictor KeypointPredictor required
keypoints predictor name 
num_keypoints int32 optional
number of key points Default: 4
predict_direction bool optional
predict text direction or not Default: true
direction_trainable bool optional
train text direction predictor or not Default: true

TextTransformerHead

Field Type Label Description
input_layer string optional
input layer 
transformer_encoder TransformerEncoder required
sequence encoder 
transformer_decoder TransformerDecoder required
transformer decoder 

easy_vision/python/protos/text_krcnn.proto

TextKRCNN

Field Type Label Description
backbone Backbone required
backbone config 
fpn FPN optional
FPN 
rpn_head RPNHead optional
rpn head config 
rcnn_head RCNNHead optional
rcnn head config 
fcn_head RCNNHead optional
fcn head config 
keypoint_head TextKeypointHead optional
keypoint head config 

easy_vision/python/protos/text_recognition.proto

TextRecognition

Field Type Label Description
backbone Backbone required
backbone config 
attention_head TextAttentionHead optional
attention head config 
ctc_head TextCTCHead optional
ctc head config 
transformer_head TextTransformerHead optional
transformer head config 

easy_vision/python/protos/text_rectification.proto

TextRectification

Field Type Label Description
backbone Backbone required
backbone config 
rectification_head TextRectificationHead optional
text rectification head config 

easy_vision/python/protos/train.proto

TrainConfig

Message for configuring DetectionModel training jobs (train.py).
Next id: 25
optimizer options
Field Type Label Description
optimizer Optimizer optional
Optimizer used to train the DetectionModel. 
gradient_clipping_by_norm float optional
If greater than 0, clips gradients by this value. Default: 0
bias_grad_multiplier float optional
If greater than 0, multiplies the gradient of bias variables by this
amount. Default: 0
regularization_loss float optional
Weight of the regularization loss added to `total_loss`, also called weight_decay. Default: 0.0001
num_steps uint32 optional
Number of steps to train the CVModel for. If 0, will train the model
indefinitely. Default: 0
fine_tune_checkpoint string optional
Checkpoint to restore variables from. Typically used to load feature
extractor variables trained outside of object detection. 
fine_tune_checkpoint_type string optional
Type of checkpoint to restore variables from, e.g. 'classification' or
'detection'. Provides extensibility to from_detection_checkpoint. 
fine_tune_ckpt_var_map string optional
 
sync_replicas bool optional
Whether to synchronize replicas during training.
If so, a SyncReplicateOptimizer is built. Default: false
startup_delay_steps float optional
Number of training steps between replica startup.
This flag must be set to 0 if sync_replicas is set to true. Default: 15
replicas_to_aggregate int32 optional
Number of replicas to aggregate before making parameter updates. Default: 1
num_worker_replicas int32 optional
Number of worker replicas Default: 1
model_dir string required
train model save dir 
save_checkpoints_steps uint32 optional
Step interval for saving checkpoint Default: 5000
save_summary_steps uint32 optional
Save summaries every this many steps. Default: 100
log_step_count_steps uint32 optional
Frequency, in steps, at which global step/sec and the loss are logged during training. Default: 100
summary_model_vars bool optional
Whether to add summaries of model variables. Default: false
train_distribute string optional
DistributionStrategy to use. Available values are 'mirrored', 'collective', and 'ess':
- mirrored: MirroredStrategy, single machine and multiple devices;
- collective: CollectiveAllReduceStrategy, multiple machines and multiple devices. 
num_gpus_per_worker int32 optional
Number of gpus per machine Default: 1
write_graph bool optional
Whether to write the meta graph into graph.pbtxt, the summary, and the checkpoint. Default: true
is_profiling bool optional
Whether to enable profiling. Default: false
force_restore_shape_compatible bool optional
If a variable's shape is incompatible, clip or pad the variables restored from the checkpoint. Default: false
summary_outputs bool optional
Whether to add summaries of output tensors. Default: false
use_unified_memory bool optional
If true, uses CUDA unified memory for memory allocations. Default: false
sub_learning_rate float optional
Sub learning rate: a coefficient that scales the learning rate of a subset of the parameters. Default: 0
iter_size_per_step int32 optional
Number of gradient accumulation iterations per step. Default: 1
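
Two of the rules stated in the TrainConfig table (startup_delay_steps must be 0 when sync_replicas is true, and num_steps = 0 means training runs indefinitely) can be sketched as plain checks. The helper names check_train_config and should_stop are hypothetical, and the config is represented as a Python dict rather than the generated protobuf message; the defaults used are the ones listed in the table.

```python
from typing import List


def check_train_config(cfg: dict) -> List[str]:
    """Return a list of problems found in a TrainConfig-like dict."""
    problems = []
    # model_dir is the only field labeled `required` in the table.
    if not cfg.get("model_dir"):
        problems.append("model_dir is required")
    # Table defaults: sync_replicas = false, startup_delay_steps = 15.
    sync_replicas = cfg.get("sync_replicas", False)
    startup_delay_steps = cfg.get("startup_delay_steps", 15)
    if sync_replicas and startup_delay_steps != 0:
        problems.append("startup_delay_steps must be 0 when sync_replicas is true")
    return problems


def should_stop(global_step: int, num_steps: int = 0) -> bool:
    """num_steps == 0 means train indefinitely, so never stop on step count."""
    return num_steps > 0 and global_step >= num_steps
```

Note that, because startup_delay_steps defaults to 15, enabling sync_replicas without also setting startup_delay_steps to 0 is flagged by this check.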

easy_vision/python/protos/trcnn_head.proto

TRCNNHead

Field Type Label Description
input_layer string repeated
 
num_classes int32 required
 
initial_crop_size int32 optional
Output size (width and height are set to be the same) of the initial
bilinear interpolation based cropping during ROI pooling. 
maxpool_kernel_size int32 optional
Kernel size of the max pool op on the cropped feature map during
ROI pooling. 
maxpool_stride int32 optional
Stride of the max pool op on the cropped feature map during ROI pooling. 
second_stage_box_predictor BoxPredictor optional
Hyperparameters for the second stage box predictor. If box predictor type
is set to rfcn_box_predictor, an R-FCN model is constructed, otherwise a
Faster R-CNN model is constructed. 
nms_config BatchNonMaxSuppression required
 
second_stage_batch_size int32 optional
The batch size per image used for computing the classification and refined
location loss of the box classifier.
Note that this field is ignored if `hard_example_miner` is configured. Default: 128
second_stage_balance_fraction float optional
Fraction of positive examples to use per image for the box classifier. Default: 0.25
hard_example_miner HardExampleMiner optional
 
second_stage_localization_loss_weight float optional
Second stage RCNN localization loss weight Default: 1
second_stage_classification_loss_weight float optional
Second stage RCNN classification loss weight Default: 1

easy_vision/python/protos/trpn_head.proto

TRPNHead

Field Type Label Description
input_layer string repeated
 
box_predictor BoxPredictor required
 
first_stage_minibatch_size int32 optional
The batch size to use for computing the first stage objectness and
location losses. Default: 256
first_stage_positive_balance_fraction float optional
Fraction of positive examples per image for the RPN. Default: 0.5
first_stage_nms_score_threshold float optional
Non max suppression score threshold applied to first stage RPN proposals. Default: 0
first_stage_nms_iou_threshold float optional
Non max suppression IOU threshold applied to first stage RPN proposals. Default: 0.7
first_stage_max_proposals int32 optional
Maximum number of RPN proposals retained after first stage postprocessing. Default: 300
first_stage_anchor_generator AnchorGenerator optional
Anchor generator to compute RPN anchors. 
first_stage_localization_loss_weight float optional
First stage RPN localization loss weight. Default: 1
first_stage_objectness_loss_weight float optional
First stage RPN objectness loss weight. Default: 1
rpn_min_size int32 optional
At the postprocessing stage, drop every RPN output box whose width or height is smaller than rpn_min_size. Default: 0
pre_nms_topn int32 optional
Number of top-scoring proposals kept before NMS; only takes effect when greater than 0.
Set it large enough, otherwise
it may hurt performance. Default: -1

easy_vision/python/protos/user_defined_param.proto

UserDefinedParam

Field Type Label Description
name string required
 
int64_value int64 optional
 
int32_value int32 optional
 
uint64_value uint64 optional
 
uint32_value uint32 optional
 
float_value float optional
 
bool_value bool optional
 
string_value string optional
 

UserDefinedParams

Field Type Label Description
param UserDefinedParam repeated
parameter with type float, int64, uint64, bool, string 
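
One plausible way to consume a UserDefinedParams list is to flatten it into a name-to-value dictionary. The sketch below works on plain dicts shaped like UserDefinedParam; the helper name params_to_dict and the rule that exactly one *_value field is set per parameter are assumptions, since the document does not state how multiple values on one parameter are handled.

```python
from typing import Any, Dict, List

# The typed value fields listed in the UserDefinedParam table.
VALUE_FIELDS = (
    "int64_value",
    "int32_value",
    "uint64_value",
    "uint32_value",
    "float_value",
    "bool_value",
    "string_value",
)


def params_to_dict(params: List[dict]) -> Dict[str, Any]:
    """Map each parameter name to its single typed value."""
    result: Dict[str, Any] = {}
    for param in params:
        set_fields = [field for field in VALUE_FIELDS if field in param]
        # Assumption: a parameter carries exactly one typed value.
        if len(set_fields) != 1:
            raise ValueError(
                "parameter %r must set exactly one value field" % param.get("name")
            )
        result[param["name"]] = param[set_fields[0]]
    return result
```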

easy_vision/python/protos/video_classification.proto

VideoClassificationModel

Field Type Label Description
input_width int32 optional
Input width and height; if not set, the default input size is used. 
input_height int32 optional
 
backbone Backbone required
Backbone configuration 
num_classes int32 required
Number of classes 
loss ClassificationLoss required
Loss configuration for training 
preprocessing_method string optional
Preprocessing method name, if not set, use the corresponding method for 
the backbone 
add_summary bool optional
Whether to summary training related info Default: true
label_id_offset int32 optional
Label id offset, subtracted from the groundtruth class
when calculating the loss and during evaluation. Default: 0
class_specific_evaluation bool optional
Whether to add class-specific evaluation. Default: false
modal string optional
Model input modality: 'rgb', 'flow', or 'rgb+flow'. Default: rgb

easy_vision/python/protos/yolo.proto

YOLO

Configuration for YOLO models.
Field Type Label Description
backbone Backbone required
Backbone configuration 
yolo_head YOLOHead required
YOLO head configuration 

YOLOFeaturemapLayout

Field Type Label Description
from_layer string repeated
Layers from which to construct the multi-scale feature map. 
use_pan bool optional
use path aggregation network structure or not Default: false
use_spp bool optional
use spatial pyramid pooling structure or not Default: false
use_sam bool optional
use convolutional spatial attention module or not Default: false
fpn_shrink_channel_before_fusion bool optional
in top_down branch, shrink feature channels to
half of original channels before feature fusion. Default: false
fixed_features_output_dim int32 optional
Makes the YOLO head FPN transform all output feature maps to the same number of channels; 0 means no transformation. Default: 0

YOLOHead

Field Type Label Description
num_classes int32 required
Number of classes to predict. 
yolo_featuremap_layout YOLOFeaturemapLayout required
YOLO featuremap definition 
conv_hyperparams Hyperparams optional
Hyperparameters that affect the layers of feature extractor added on top
of the base feature extractor. 
box_coder BoxCoder repeated
Box coder to encode the boxes, if the number of box_coder > 1,
the number of box_coder must be equal to the number of feature_maps 
matcher Matcher required
Matcher to match groundtruth with anchors. 
anchor_generator AnchorGenerator optional
Anchor generator to compute anchors. 
box_predictor BoxPredictor required
Box predictor to attach to the features. 
post_processing PostProcessing required
Post processing to apply on the predictions. 
loss Loss optional
Loss configuration for training. 
ignore_threshold float optional
Ignore threshold: a predicted box whose IoU is larger than this threshold
but that does not match a groundtruth box is not treated as a negative sample. Default: 0.5
output_roi_features bool optional
Output detection roi features or not Default: false
roi_feature_depth int32 optional
number of output channels of roi features Default: 512
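
The ignore_threshold rule in the table above can be sketched as follows. This is an illustration under stated assumptions, not the easy_vision code: the function names iou and is_negative_sample, the (ymin, xmin, ymax, xmax) box layout, and the use of the best IoU over all groundtruth boxes are choices made for the example.

```python
from typing import List, Tuple

# Hypothetical box layout: (ymin, xmin, ymax, xmax).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_negative_sample(
    pred: Box,
    matched: bool,
    groundtruth: List[Box],
    ignore_threshold: float = 0.5,
) -> bool:
    """A prediction is negative only if unmatched and below the threshold."""
    if matched:
        return False
    best_iou = max((iou(pred, gt) for gt in groundtruth), default=0.0)
    # Unmatched predictions with IoU above the threshold are ignored,
    # i.e. they are neither positive nor negative samples.
    return best_iou <= ignore_threshold
```

For example, an unmatched prediction that overlaps a groundtruth box with IoU 0.8 is ignored rather than penalized as background when ignore_threshold is 0.5.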

Scalar Value Types

.proto Type | Notes | C++ Type | Java Type | Python Type
double |  | double | double | float
float |  | float | float | float
int32 | Uses variable-length encoding. Inefficient for encoding negative numbers; if your field is likely to have negative values, use sint32 instead. | int32 | int | int
int64 | Uses variable-length encoding. Inefficient for encoding negative numbers; if your field is likely to have negative values, use sint64 instead. | int64 | long | int/long
uint32 | Uses variable-length encoding. | uint32 | int | int/long
uint64 | Uses variable-length encoding. | uint64 | long | int/long
sint32 | Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int32s. | int32 | int | int
sint64 | Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int64s. | int64 | long | int/long
fixed32 | Always four bytes. More efficient than uint32 if values are often greater than 2^28. | uint32 | int | int
fixed64 | Always eight bytes. More efficient than uint64 if values are often greater than 2^56. | uint64 | long | int/long
sfixed32 | Always four bytes. | int32 | int | int
sfixed64 | Always eight bytes. | int64 | long | int/long
bool |  | bool | boolean | bool
string | A string must always contain UTF-8 encoded or 7-bit ASCII text. | string | String | str/unicode
bytes | May contain any arbitrary sequence of bytes. | string | ByteString | str
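
The notes on int32 versus sint32 follow from how the protobuf wire format encodes integers. The sketch below re-implements base-128 varint and ZigZag encoding in plain Python to show the size difference; the function names are illustrative and are not part of the protobuf library API.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            # More bytes follow: set the continuation bit.
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_int32(value: int) -> bytes:
    """int32: negative values are sign-extended to 64 bits before encoding."""
    return encode_varint(value & 0xFFFFFFFFFFFFFFFF)


def zigzag32(value: int) -> int:
    """ZigZag maps 0, -1, 1, -2, ... to 0, 1, 2, 3, ..."""
    return ((value << 1) ^ (value >> 31)) & 0xFFFFFFFF


def encode_sint32(value: int) -> bytes:
    """sint32: ZigZag first, then varint, so small negatives stay small."""
    return encode_varint(zigzag32(value))


# -1 takes ten bytes as int32 but a single byte as sint32.
int32_size = len(encode_int32(-1))
sint32_size = len(encode_sint32(-1))
```

The same arithmetic explains the fixed32 note: a varint carries 7 payload bits per byte, so values of 2^28 and above need five bytes, one more than the constant four bytes of fixed32.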