Train-validation tagging

How to split a training dataset into train and validation subsets using tags

Input:

  • Source Project
  • Train-validation split ratio

Output:

  • New Project with images randomly tagged as train or val, based on the split ratio
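
As the tagging loop later in this notebook shows, each image is tagged independently: it receives val with probability equal to the split ratio and train otherwise. The realized split therefore only approximates the ratio, especially for small projects. A minimal sketch of that arithmetic, using only the Python standard library; the helper name val_count_distribution is hypothetical and not part of the Supervisely SDK, and the example values (5 images, ratio 0.4) mirror the tutorial project used below:

```python
from math import comb

def val_count_distribution(n_images, validation_portion):
    """Probability of getting exactly k 'val' images, for k = 0..n_images,
    when every image is tagged 'val' independently with the given probability."""
    p = validation_portion
    return [comb(n_images, k) * p**k * (1 - p)**(n_images - k)
            for k in range(n_images + 1)]

# Example values: 5 images and a 0.4 split ratio
dist = val_count_distribution(5, 0.4)
expected_val = sum(k * prob for k, prob in enumerate(dist))
print("Expected number of val images:", round(expected_val, 6))
print("Chance that no image is tagged val:", round(dist[0], 5))
```

With so few images, a run that produces no val images at all is not a rare event, which is worth keeping in mind when reading the results.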

Configuration

Edit the following settings to match your own setup.

In [1]:
import supervisely_lib as sly
from tqdm import tqdm
import random
import os
In [2]:
team_name = "jupyter_tutorials"
workspace_name = "cookbook"
project_name = "tutorial_project"

dst_project_name = "tutorial_project_tagged"

validation_portion = 0.4

tag_meta_train = sly.TagMeta('train', sly.TagValueType.NONE)
tag_meta_val = sly.TagMeta('val', sly.TagValueType.NONE)

# Obtain server address and your api_token from environment variables
# Edit those values if you run this notebook on your own PC
address = os.environ['SERVER_ADDRESS']
token = os.environ['API_TOKEN']
In [3]:
# Initialize API object
api = sly.Api(address, token)
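
The configuration cell raises a bare KeyError when SERVER_ADDRESS or API_TOKEN is missing, and nothing checks that validation_portion is a valid probability. It may help to fail early with a readable message. A minimal sketch using only the standard library; require_env and check_portion are hypothetical helper names, not part of supervisely_lib, and the demonstration uses a stand-in mapping rather than the real environment:

```python
import os

def require_env(name, env=None):
    """Return the value of an environment variable or raise a readable error."""
    env = os.environ if env is None else env
    value = env.get(name)
    if not value:
        raise RuntimeError("Environment variable {!r} is not set".format(name))
    return value

def check_portion(portion):
    """Ensure the validation share is a number between 0 and 1 inclusive."""
    if not 0.0 <= portion <= 1.0:
        raise ValueError("validation_portion must be in [0, 1], got {!r}".format(portion))
    return portion

# Demonstration with a stand-in mapping instead of the real environment
fake_env = {"SERVER_ADDRESS": "https://example.invalid", "API_TOKEN": "placeholder"}
address_demo = require_env("SERVER_ADDRESS", fake_env)
portion_demo = check_portion(0.4)
```

In the notebook itself, the same helpers could wrap the existing lookups, for example address = require_env('SERVER_ADDRESS').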

Verify input values

Check that the context (team, workspace, project) exists.

In [4]:
# Get IDs of team, workspace and project by names

team = api.team.get_info_by_name(team_name)
if team is None:
    raise RuntimeError("Team {!r} not found".format(team_name))

workspace = api.workspace.get_info_by_name(team.id, workspace_name)
if workspace is None:
    raise RuntimeError("Workspace {!r} not found".format(workspace_name))
    
project = api.project.get_info_by_name(workspace.id, project_name)
if project is None:
    raise RuntimeError("Project {!r} not found".format(project_name))
    
print("Team: id={}, name={}".format(team.id, team.name))
print("Workspace: id={}, name={}".format(workspace.id, workspace.name))
print("Project: id={}, name={}".format(project.id, project.name))
Out [4]:
Team: id=30, name=jupyter_tutorials
Workspace: id=76, name=cookbook
Project: id=898, name=tutorial_project
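
The three lookups above repeat the same fetch, check, raise pattern. One way to factor it out is sketched here in plain Python so that it runs without a server; require_found is a hypothetical helper, and the namedtuple only stands in for the info objects that the API returns:

```python
from collections import namedtuple

def require_found(info, kind, name):
    """Return info unchanged, or raise if the lookup returned None."""
    if info is None:
        raise RuntimeError("{} {!r} not found".format(kind, name))
    return info

# Stand-in for an API result; real info objects come from calls
# such as api.team.get_info_by_name(team_name)
Info = namedtuple("Info", ["id", "name"])
team_demo = require_found(Info(30, "jupyter_tutorials"), "Team", "jupyter_tutorials")
print("Team: id={}, name={}".format(team_demo.id, team_demo.name))
```

In the notebook this would read team = require_found(api.team.get_info_by_name(team_name), "Team", team_name), and likewise for the workspace and the project.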

Get Source ProjectMeta

In [5]:
# `project` was already fetched in the previous cell; reuse it to read the source meta
meta_json = api.project.get_meta(project.id)
meta = sly.ProjectMeta.from_json(meta_json)
print("Source ProjectMeta: \n", meta)
Out [5]:
Source ProjectMeta: 
 ProjectMeta:
Object Classes
+--------+-----------+----------------+
|  Name  |   Shape   |     Color      |
+--------+-----------+----------------+
|  bike  | Rectangle | [246, 255, 0]  |
|  car   |  Polygon  | [190, 85, 206] |
|  dog   |  Polygon  |  [253, 0, 0]   |
| person |   Bitmap  |  [0, 255, 18]  |
+--------+-----------+----------------+
Image Tags
+-------------+--------------+-----------------------+
|     Name    |  Value type  |    Possible values    |
+-------------+--------------+-----------------------+
|   situated  | oneof_string | ['inside', 'outside'] |
|     like    |     none     |          None         |
| cars_number |  any_number  |          None         |
+-------------+--------------+-----------------------+
Object Tags
+---------------+--------------+-----------------------+
|      Name     |  Value type  |    Possible values    |
+---------------+--------------+-----------------------+
| person_gender | oneof_string |   ['male', 'female']  |
|  vehicle_age  | oneof_string | ['modern', 'vintage'] |
|   car_color   |  any_string  |          None         |
+---------------+--------------+-----------------------+

Construct Destination ProjectMeta

In [6]:
def process_meta(input_meta):
    output_meta = input_meta.clone()    
    output_meta = output_meta.add_img_tag_meta(tag_meta_train)
    output_meta = output_meta.add_img_tag_meta(tag_meta_val)
    return output_meta
In [7]:
dst_meta = process_meta(meta)
print("Destination ProjectMeta:\n", dst_meta)
Out [7]:
Destination ProjectMeta:
 ProjectMeta:
Object Classes
+--------+-----------+----------------+
|  Name  |   Shape   |     Color      |
+--------+-----------+----------------+
|  bike  | Rectangle | [246, 255, 0]  |
|  car   |  Polygon  | [190, 85, 206] |
|  dog   |  Polygon  |  [253, 0, 0]   |
| person |   Bitmap  |  [0, 255, 18]  |
+--------+-----------+----------------+
Image Tags
+-------------+--------------+-----------------------+
|     Name    |  Value type  |    Possible values    |
+-------------+--------------+-----------------------+
|   situated  | oneof_string | ['inside', 'outside'] |
|     like    |     none     |          None         |
| cars_number |  any_number  |          None         |
|    train    |     none     |          None         |
|     val     |     none     |          None         |
+-------------+--------------+-----------------------+
Object Tags
+---------------+--------------+-----------------------+
|      Name     |  Value type  |    Possible values    |
+---------------+--------------+-----------------------+
| person_gender | oneof_string |   ['male', 'female']  |
|  vehicle_age  | oneof_string | ['modern', 'vintage'] |
|   car_color   |  any_string  |          None         |
+---------------+--------------+-----------------------+
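
process_meta assumes that the source project does not already define image tags named train or val. What the SDK does on such a name clash is not shown in this notebook, so a cautious version could check the names first. A sketch in plain Python, with the existing image tag names copied from the source ProjectMeta printed earlier; find_name_clashes is a hypothetical helper:

```python
def find_name_clashes(existing_names, new_names):
    """Return the new tag names that are already taken, in input order."""
    taken = set(existing_names)
    return [name for name in new_names if name in taken]

# Image tag names listed in the source ProjectMeta output
existing_image_tags = ["situated", "like", "cars_number"]
clashes = find_name_clashes(existing_image_tags, ["train", "val"])
if clashes:
    raise RuntimeError("Tag names already in use: {}".format(clashes))
print("No clashes, safe to add:", ["train", "val"])
```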

Create Destination project

In [8]:
# Check whether the destination project already exists; if so, generate a free name
if api.project.exists(workspace.id, dst_project_name):
    dst_project_name = api.project.get_free_name(workspace.id, dst_project_name)
print("Destination project name: ", dst_project_name)
Out [8]:
Destination project name:  tutorial_project_tagged_003
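
The output above suggests that a free name is built by appending a zero-padded counter to the requested name. The exact rule used by api.project.get_free_name is not shown in this notebook, so the following is only a guess at the idea, written against a plain set of names; next_free_name, the three-digit suffix format, and the set of taken names are all assumptions made for illustration:

```python
def next_free_name(base_name, taken_names):
    """Return base_name if unused, else the first 'base_name_NNN' not in taken_names."""
    if base_name not in taken_names:
        return base_name
    counter = 1
    while True:
        candidate = "{}_{:03d}".format(base_name, counter)
        if candidate not in taken_names:
            return candidate
        counter += 1

# Hypothetical set of project names already present in the workspace
taken = {
    "tutorial_project_tagged",
    "tutorial_project_tagged_001",
    "tutorial_project_tagged_002",
}
free_name = next_free_name("tutorial_project_tagged", taken)
print("Destination project name: ", free_name)
```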
In [9]:
dst_project = api.project.create(workspace.id, dst_project_name)
api.project.update_meta(dst_project.id, dst_meta.to_json())
print("Destination project has been created: id={}, name={!r}".format(dst_project.id, dst_project.name))
Out [9]:
Destination project has been created: id=1146, name='tutorial_project_tagged_003'

Iterate over all images, tag them and add to destination project

In [10]:
for dataset in api.dataset.get_list(project.id):
    print('Dataset: {}'.format(dataset.name))
    dst_dataset = api.dataset.create(dst_project.id, dataset.name)

    for image in tqdm(api.image.get_list(dataset.id)):
        ann_json = api.annotation.download(image.id).annotation
        ann = sly.Annotation.from_json(ann_json, meta)
        
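        # random.random() is uniform on [0, 1), so "<" tags an image as val with probability validation_portion
        tag = sly.Tag(tag_meta_val) if random.random() < validation_portion else sly.Tag(tag_meta_train)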
        ann = ann.add_tag(tag)
        
        dst_image = api.image.add(dst_dataset.id, image.name, image.hash)
        api.annotation.upload(dst_image.id, ann.to_json())
Out [10]:
 33%|███▎      | 1/3 [00:00<00:00,  7.26it/s]
Dataset: dataset_01
100%|██████████| 3/3 [00:00<00:00,  7.05it/s]
  0%|          | 0/2 [00:00<?, ?it/s]
Dataset: dataset_02
100%|██████████| 2/2 [00:00<00:00, 10.77it/s]
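
Because the loop above draws a random number per image, the number of val images varies from run to run and can even be zero for a small dataset. If an exact and reproducible split is preferred, one alternative (not the method used in this notebook) is to shuffle the image names with a seeded generator and tag the first share as val. A sketch with the standard library only; assign_split_tags, its seed parameter, and the placeholder image names are assumptions:

```python
import random

def assign_split_tags(image_names, validation_portion, seed=None):
    """Map every image name to 'val' or 'train' with a fixed number of val images."""
    rng = random.Random(seed)  # private generator, leaves the global random state alone
    shuffled = list(image_names)
    rng.shuffle(shuffled)
    # round() uses banker's rounding, so exact .5 cases go to the nearest even count
    n_val = round(len(shuffled) * validation_portion)
    val_names = set(shuffled[:n_val])
    return {name: ("val" if name in val_names else "train") for name in image_names}

names = ["img_{:02d}.jpg".format(i) for i in range(5)]  # placeholder image names
split = assign_split_tags(names, validation_portion=0.4, seed=42)
n_val_demo = sum(tag == "val" for tag in split.values())
print(n_val_demo, "of", len(split), "images tagged val")
```

Inside the loop, the mapping would decide whether tag_meta_val or tag_meta_train is used for each image. Calling the helper once per dataset keeps the ratio within every dataset; calling it once over all images keeps it only for the project as a whole.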
In [11]:
print("Project {!r} has been successfully uploaded".format(dst_project.name))
print("Number of images: ", api.project.get_images_count(dst_project.id))
Out [11]:
Project 'tutorial_project_tagged_003' has been successfully uploaded
Number of images:  5
