Generate dataset¶
Mygym provides this useful tool to help you generate dataset for custom training of vision models YOLACT and VAE. You can configure the dataset by json file, where you first specify the general dataset parameters as type, number of images to generate etc. Next, specify the environment, choose cameras to use for rendering images and set their resolution. Then choose a robot and objects to appear in the scene. You can further control the objects quantity, appearance and location.
A very useful feature that eventualy helps you to achieve better performance with trained vision network is myGym’s randomizer. Randomizer works as a wrapper to your standart myGym environment that enables more advanced setting of the scene. Thanks to randomizer, you can change textures of static and dynamic objects and/or their colors. You can change light conditions such as light intensity, direction or color. Camera randomizer slightly changes camera properties. Joint randomizer enables robots and dynamic objects to change their configuration.
Note
To be able to use texture randomizer, download the texture dataset first:
cd myGym
sh download_textures.sh
Dataset json file¶
You can see an example json file with all parameters here:
{
# directory
"output_folder" : "../myGym/yolact_vision/data/yolact/datasets/my_dataset",
# dataset parameters
"dataset_type" : "coco", #"coco" (for yolact)/ "dope" / "vae"
"make_dataset" : "display", #mode of writing files, "new" (override), "resume" (append),"display" (don't store results)
"num_episodes" : 10000000, #total number of episodes
"num_steps" : 1, #may need more steps per episode, because the arms are moving and the objects are first falling down from above the table
"make_shot_every_frame" : 1, #used as if episode % make_shot_every_frame : 0, so for 60 % 30 it's 3 shots (0, 30, and 60)
"num_episodes_hard_reset" : 40, #hard reset every x episode ensures objects rendering when GUI is on
"autosafe_episode" : 100, #if episode % auto_safe_episode, write json files to directory (prevent data loss when process crashes)
"random_arm_movement" : false,
"active_cameras" : [1,0,1,1,1], #set 1 at a position(=camera number) to save images from this camera
"camera_resolution" : [640,480],
"min_obj_area" : 49, #each object will have at least this pixels visible, to be reasonably recognizable. If not, skip. (49 ~ 7x7pix img)
"train_test_split_pct" : 0.1, #data split, 0.0 = only train, 1.0 = only test
"visualize" : false, #binary masks for each labeled object
# env parameters
"env_name" : "Gym-v0", #name of environment
"workspace" : "table", #name of workspace
"visgym" : true, #whether visualize gym background
"robot" : "jaco", #which robot to show in the scene
"gui_on" : true, #whether the GUI of the simulation should be used or not
"show_bounding_boxes_gui" : false,
"changing_light_gui" : false,
"shadows_on" : true,
"color_dict" : null, #use to make (fixed) object colors - textures and color_randomizer will be suppressed, pass null to ignore
"object_sampling_area" : [0.1, 0.8, 0.4, 1.0, 1.25, 1.35], # xxyyzz
"num_objects_range" : [3,5], #range for random count of sampled objects in each scene (>=0)
# randomization parameters
"seed": 42,
"texture_randomizer": {
"enabled": true,
"exclude": [], #objects that will not be texturized, e.g. "table" or "floor" or "objects"
"seamless": true,
"textures_path": "./envs/dtdseamless/",
"seamless_textures_path": "./envs/dtdseamless/"
},
"light_randomizer": {
"enabled": true,
"randomized_dimensions": {
"light_color": true, "light_direction": true,
"light_distance": true, "light_ambient": true,
"light_diffuse": true, "light_specular": true
}},
"camera_randomizer": {
"enabled": true,
"randomized_dimensions": {"target_position": true},
"shift": [0.1, 0.1, 0.1]},
"color_randomizer": {
"enabled": true,
"exclude": [], #objects that will not be texturized, e.g. "table" or "floor" or "objects"
"randomized_dimensions": {"rgb_color": true, "specular_color": true}},
# objects parameters
#Select here which classes you want to use in the simulator and annotate. Format: [quantity, class_name, class_id]
#If you want to make some classes to be classified as the same, assign them the same value, i.e. screw_round:1, peg:1
#If you want some class to appear statistically more often, increase the quantity
"used_class_names_quantity" : [[1,"jaco",1], [1,"jaco_gripper",2], [1,"car_roof",3], [1,"cube_holes",4], [1,"ex_bucket",5], [1,"hammer",6], [1,"nut",7], [1,"peg_screw",8], [1,"pliers",9], [1,"screw_round",10], [1,"screwdriver",11], [1,"sphere_holes",12],[1,"wafer",13], [1,"wheel",14], [1,"wrench",15]],
#fix colors of objects, when color_dict = object_colors
"object_colors" : {"car_roof": ["yellow"], "cube_holes": ["light_green"], "ex_bucket": ["black"], "hammer": ["red"],
"nut": ["light_blue"], "peg_screw": ["white"], "pliers": ["sepia"], "screw_round": ["light_blue"],
"screwdriver": ["purple"], "sphere_holes": ["gold"], "wafer":["dark_purple"], "wheel":["redwine"], "wrench": ["moccasin"]}
}
-
myGym.generate_dataset.
color_names_to_rgb
()[source]¶ Assign RGB colors to objects by name as specified in the training config file
-
myGym.generate_dataset.
create_coco_json
()[source]¶ Create COCO json data structure
- Returns:
- return data_train
(dict) Data structure for training data
- return data_test
(dist) Data structure for testing data
-
class
myGym.generate_dataset.
GeneratorCoco
[source]¶ Generator class for COCO image dataset for YOLACT vision model training
-
get_env
()[source]¶ Create environment for COCO dataset generation according to dataset config file
- Returns:
- return env
(object) Environment for dataset generation
-
init_data
()[source]¶ Initialize data structures for COCO dataset annotations
- Returns:
- return data_train
(dict) Data structure for training data
- return data_test
(dist) Data structure for testing data
-
resume
()[source]¶ Resume COCO dataset generation
- Returns:
- return data_train
(dict) Training data from preceding dataset generation in COCO data structure
- return data_test
(dist) Testing data from preceding dataset generation in COCO data structure
- return image_id
(int) ID of last generated image in preceding dataset generation
-