Merge projects

This script will merge several projects into a single one.

Input:

  • List of existing projects.

Output:

  • New project with all the data from the input projects.

Configuration

Edit the following settings for your own use case

In [1]:
import supervisely_lib as sly
import os
from tqdm import tqdm
In [2]:
src_project_names = ['lemons_annotated', 'roads_annotated']
dst_project_name = 'merged_project'

# Context
team_name = "jupyter_tutorials"
workspace_name = "cookbook"

# Obtain server address and your api_token from environment variables
# Edit those values if you run this notebook on your own PC
address = os.environ['SERVER_ADDRESS']
token = os.environ['API_TOKEN']
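
When this notebook runs inside the Supervisely platform, both variables are already present in the environment. If you run it elsewhere, the lookups above will raise a KeyError; a minimal sketch of providing the values yourself (the address and token below are placeholders, not real credentials):

# Only needed outside the platform; the values below are placeholders.
os.environ.setdefault('SERVER_ADDRESS', 'https://app.supervise.ly')  # your instance address
os.environ.setdefault('API_TOKEN', '<your-api-token>')               # your personal API token

address = os.environ['SERVER_ADDRESS']
token = os.environ['API_TOKEN']
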
In [3]:
# Initialize API object
api = sly.Api(address, token)

Verify input values

Check that the team, workspace, and source projects exist

In [4]:
# Get IDs of team and workspace
team = api.team.get_info_by_name(team_name)
if team is None:
    raise RuntimeError("Team {!r} not found".format(team_name))

workspace = api.workspace.get_info_by_name(team.id, workspace_name)
if workspace is None:
    raise RuntimeError("Workspace {!r} not found".format(workspace_name))
    
print("Team: id={}, name={}".format(team.id, team.name))
print("Workspace: id={}, name={}".format(workspace.id, workspace.name))
Out [4]:
Team: id=30, name=jupyter_tutorials
Workspace: id=76, name=cookbook
In [5]:
# Check if all input projects exist
src_projects = []
for project_name in src_project_names:
    src_project = api.project.get_info_by_name(workspace.id, project_name)
    if src_project is not None:
        src_projects.append(src_project)
    else:
        raise RuntimeError("Project {!r} not found".format(project_name))

Create output Project

In [6]:
# Check if destination project name already exists. If so, generate a new free name
if api.project.exists(workspace.id, dst_project_name):
    dst_project_name = api.project.get_free_name(workspace.id, dst_project_name)
    
# create remote project
dst_project = api.project.create(workspace.id, dst_project_name)
print("Project: id={}, name={!r}".format(dst_project.id, dst_project.name))
Out [6]:
Project: id=1327, name='merged_project'

Create output Meta

Generate the ProjectMeta for the new project by merging all source ProjectMetas. The result contains the union of all object classes and tags from the input projects, provided they are compatible: classes with the same name must have the same geometry type, and tags with the same name must have the same range of values.
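
To illustrate the compatibility rule, here is a minimal sketch with two in-memory metas; the class names and geometry types are made up for the example:

# Hypothetical metas, for illustration only
meta_a = sly.ProjectMeta(obj_classes=sly.ObjClassCollection([
    sly.ObjClass('lemon', sly.Bitmap),
    sly.ObjClass('kiwi', sly.Bitmap),
]))
meta_b = sly.ProjectMeta(obj_classes=sly.ObjClassCollection([
    sly.ObjClass('road', sly.Polygon),
    sly.ObjClass('lemon', sly.Bitmap),  # same name, same geometry as in meta_a -> compatible
]))

merged = meta_a.merge(meta_b)
print([obj_class.name for obj_class in merged.obj_classes])
# If 'lemon' were defined with a different geometry in meta_b, the two definitions
# would conflict and the merge could not produce a single class.
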

In [7]:
destination_meta = sly.ProjectMeta()
for project in src_projects:
    src_meta_json = api.project.get_meta(project.id)
    src_meta = sly.ProjectMeta.from_json(src_meta_json)
    destination_meta = destination_meta.merge(src_meta)

api.project.update_meta(dst_project.id, destination_meta.to_json())
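
As an optional sanity check, the merged meta can be downloaded back from the server and inspected (this reuses only calls already shown above):

# Optional: read the merged meta back and list the resulting classes
merged_meta = sly.ProjectMeta.from_json(api.project.get_meta(dst_project.id))
print([obj_class.name for obj_class in merged_meta.obj_classes])
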

Create output datasets and add images

In [8]:
api.project.get_info_by_id(dst_project.id)
Out [8]:
ProjectInfo(id=1327, name='merged_project', description='', size='0', readme='', workspace_id=76, created_at='2019-04-11T08:37:35.448Z', updated_at='2019-04-11T08:37:35.448Z')
In [9]:
# add datasets, images and annotations to destination project
for project in src_projects:
    for dataset in api.dataset.get_list(project.id):
        print("Processing: project = {!r}, dataset = {!r}".format(project.name, dataset.name), flush=True)
        
        # if a dataset with the same name already exists in the destination project, generate a free name
        dst_dataset_name = dataset.name
        if api.dataset.exists(dst_project.id, dst_dataset_name):
            dst_dataset_name = api.dataset.get_free_name(dst_project.id, dst_dataset_name)
        
        # create new dataset in destination project
        dst_dataset = api.dataset.create(dst_project.id, dst_dataset_name)
        
        images = api.image.get_list(dataset.id)
        with tqdm(total=len(images)) as progress_bar:
            for batch in sly.batched(images):
                image_ids = [image_info.id for image_info in batch]
                image_names = [image_info.name for image_info in batch]
                
                # get image annotations
                ann_infos = api.annotation.download_batch(dataset.id, image_ids)
                ann_jsons = [ann_info.annotation for ann_info in ann_infos]
                
                # add images to destination dataset by id
                dst_images = api.image.upload_ids(dst_dataset.id, image_names, image_ids)
                
                # upload annotations to destination images
                dst_image_ids = [dst_img_info.id for dst_img_info in dst_images]
                api.annotation.upload_jsons(dst_image_ids, ann_jsons)
                        
                progress_bar.update(len(batch))
Out [9]:
Processing: project = 'lemons_annotated', dataset = 'ds1'
100%|██████████| 6/6 [00:00<00:00, 28.62it/s]
Processing: project = 'roads_annotated', dataset = 'ds1'
100%|██████████| 10/10 [00:00<00:00, 72.17it/s]
In [10]:
print("Project {!r} has been sucessfully uploaded".format(dst_project.name))
print("Number of uploaded images: ", api.project.get_images_count(dst_project.id))
Out [10]:
Project 'merged_project' has been successfully uploaded
Number of uploaded images:  16
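
To double-check the result beyond the total image count, a short sketch that lists the datasets of the new project and the number of images in each (again, only API calls already used above):

# Per-dataset breakdown of the merged project
for dataset in api.dataset.get_list(dst_project.id):
    images = api.image.get_list(dataset.id)
    print("Dataset {!r}: {} images".format(dataset.name, len(images)))
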
