Guide #01: project structure

Introduction to fundamental Supervisely SDK classes.

Supervisely Python SDK Tutorial #1

Here we cover the basics of working with annotated images in Supervisely format using our Python SDK: exploring and modifying data projects, and analysing and visualizing the image labeling data, both spatial and categorical. The Python SDK will be most useful when developing custom Supervisely plugins, like custom format imports or new neural network architectures, or when processing datasets in Supervisely format.

We have aimed to design the SDK to make the most frequent data analysis and transformation tasks as easy as possible. Here is a teaser of what you can do with a few lines of code:

In [16]:
import supervisely_lib as sly  # Supervisely Python SDK

# Open existing project on disk.
project = sly.Project('./tutorial_project', sly.OpenMode.READ)
# Locate and load image labeling data.
item_paths = project.datasets.get('dataset_01').get_item_paths('bicycle-car.jpeg')
ann = sly.Annotation.load_json_file(item_paths.ann_path, project.meta)
# Go over the labeled objects and print out basic properties.
for label in ann.labels:
    print('Found label object: ' + label.obj_class.name)
    print('   geometry type: ' + label.geometry.geometry_name())
    print('   object area: ' + str(label.geometry.area))
Out [16]:
Found label object: bike
   geometry type: rectangle
   object area: 88109.0
Found label object: car
   geometry type: polygon
   object area: 323207.0

Another example is rendering the labeled objects as bitmaps:

In [17]:
import numpy as np
from matplotlib import pyplot as plt
%matplotlib inline

# Read the underlying raw image for display.
img = sly.image.read(item_paths.img_path)
# Render the labeled objects.
ann_render = np.zeros(ann.img_size + (3,), dtype=np.uint8)
ann.draw(ann_render)
# Separately, render the labeled objects contours.
ann_contours = np.zeros(ann.img_size + (3,), dtype=np.uint8)
ann.draw_contour(ann_contours, thickness=7)

plt.figure(figsize=(15, 15))
plt.subplot(1, 3, 1)
plt.imshow(img)
plt.subplot(1, 3, 2)
plt.imshow(ann_render)
plt.subplot(1, 3, 3)
plt.imshow(ann_contours)
Out [17]:
<matplotlib.image.AxesImage at 0x7f1b8db75fd0>
<Figure size 1080x1080 with 3 Axes>

Now that you have had a glimpse of the SDK usage, let us start properly with the basics.

Projects and datasets basics

All data in Supervisely is organized into projects. Every project contains two kinds of data:

  1. labeled images, grouped into datasets.
  2. labeling meta-information (available labeling classes and tags).

This tutorial walks you through using the Supervisely Python SDK to work with projects offline (downloaded to local disk). Offline mode is mainly useful when developing new plugins or running Python code directly on the Supervisely instance.
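To make the offline layout concrete, here is a minimal sketch that walks a project directory using only the standard library, without the SDK. The `img` subdirectory matches the image paths printed later in this tutorial; the `ann` subdirectory, the `<image file name>.json` annotation naming and the root-level `meta.json` are assumptions about the on-disk format, and `list_project_items` is a hypothetical helper, not part of the SDK.

```python
import tempfile
from pathlib import Path


def list_project_items(project_dir):
    """Map each dataset name to a sorted list of (image_path, ann_path) pairs.

    Assumes every dataset is a subdirectory holding 'img' and 'ann' folders,
    and that each annotation file is named '<image file name>.json'.
    """
    items = {}
    for dataset_dir in sorted(Path(project_dir).iterdir()):
        img_dir = dataset_dir / 'img'
        if not img_dir.is_dir():
            continue  # Skips meta.json and anything that is not a dataset.
        items[dataset_dir.name] = [
            (img_path, dataset_dir / 'ann' / (img_path.name + '.json'))
            for img_path in sorted(img_dir.iterdir())
        ]
    return items


# Build a tiny throwaway project so the sketch runs without real data.
demo_root = Path(tempfile.mkdtemp()) / 'demo_project'
demo_layout = {'dataset_01': ['a.jpeg', 'b.jpeg'], 'dataset_02': ['c.jpeg']}
for ds_name, image_names in demo_layout.items():
    (demo_root / ds_name / 'img').mkdir(parents=True, exist_ok=True)
    (demo_root / ds_name / 'ann').mkdir(parents=True, exist_ok=True)
    for image_name in image_names:
        (demo_root / ds_name / 'img' / image_name).write_bytes(b'')
        (demo_root / ds_name / 'ann' / (image_name + '.json')).write_text('{}')
(demo_root / 'meta.json').write_text('{}')  # Placeholder for the project meta.

demo_items = list_project_items(demo_root)
for ds_name, pairs in demo_items.items():
    print(ds_name, [img_path.name for img_path, _ in pairs])
```

In real code the SDK's Project and Dataset classes do this bookkeeping for you; the sketch only shows what kind of structure they wrap.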

For some lightweight tasks (like just reading the set of labeling classes used in the project), downloading the whole project is too wasteful. Instead, it may be much more efficient to use our online Python API to query and update projects directly on-instance. See this tutorial to get started with the online API.

Hello Project - explore the data

All project data manipulation with the Supervisely SDK starts with the Project class. It is a lightweight wrapper that stores the metadata and information on available datasets and images. Let us open an existing toy project:

In [18]:
import supervisely_lib as sly  # Supervisely Python SDK
import json                    # Add Python JSON module for pretty-printing.

# Load the project meta-data.
# This does NOT load the images or their labeling data in memory.
project = sly.Project('./tutorial_project', sly.OpenMode.READ)

With an opened project, we can get some basic information: its name, location on disk, member datasets, the total number of images. We can also iterate over the datasets, and go over the images in each dataset.

In [19]:
# Print basic project metadata.
print("Project name: ", project.name)
print("Project directory: ", project.directory)
print("Total images: ", project.total_items)
print("Dataset names: ", project.datasets.keys())

# What datasets and images are there and where are they on disk?
for dataset in project:
    print("Dataset: ", dataset.name)
    # A dataset item is a pair of an image and its annotation.
    # The annotation contains all the labeling information about
    # the image - segmentation masks, objects bounding boxes etc.
    # We will look at annotations in detail shortly.
    for item_name in dataset:
        img_path = dataset.get_img_path(item_name)
        print("  image: ", img_path)
Out [19]:
Project name:  tutorial_project
Project directory:  ./tutorial_project
Total images:  5
Dataset names:  ['dataset_02', 'dataset_01']

Dataset:  dataset_02
  image:  ./tutorial_project/dataset_02/img/bmw.jpeg
  image:  ./tutorial_project/dataset_02/img/snow-city.jpeg

Dataset:  dataset_01
  image:  ./tutorial_project/dataset_01/img/bike-man-dog.jpeg
  image:  ./tutorial_project/dataset_01/img/car-people-indoors.jpeg
  image:  ./tutorial_project/dataset_01/img/bicycle-car.jpeg

Image annotations and meta-information

Raw images by themselves are not very interesting. What really makes the data valuable is the labeling information. We call all the information about a single image its annotation. Annotations are stored as JSON files (one per image), and are available via Annotation objects in the SDK.

Supervisely supports two kinds of labeling information:

  1. Geometric labels. Available via Label class. Segmentation masks, polygons, object bounding boxes, points on images - anything that has spatial properties is represented as a Label.
  2. Tags, which have no spatial data. Available via Tag class. Tags can be assigned to the whole image or to an individual label.

Finally, it is important that annotations for different images within the same project use a consistent set of label classes and tags. Suppose half of the images have cars labeled as bounding boxes, and the other half as carefully segmented per-pixel masks. Such a discrepancy is often a sign of problematic data quality. To maintain and enforce consistency between annotations, each Supervisely project explicitly stores meta information, available via the ProjectMeta class in the SDK. ProjectMeta defines the labels and tags available for the given project:

  1. For geometric labels, ObjClass objects store semantic class name, geometry type (e.g. bitmap, polygon, rectangle) and color. Each Label refers to its ObjClass object for the semantic information.
  2. For tags, TagMeta objects define the available tag names, and for each tag the range of possible values (so that you can define traffic_light_color:red to be a valid combination, but traffic_light_color:chair to be invalid).
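The idea of restricting tag values can be sketched in plain Python. This is not the SDK's TagMeta implementation: `TagRule` and `is_valid_tag` are hypothetical names, and the value type strings are borrowed from the project meta printout in this tutorial.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TagRule:
    """Hypothetical stand-in for a tag definition: name, value type, allowed values."""
    name: str
    value_type: str  # 'none', 'any_number', 'any_string' or 'oneof_string'
    possible_values: Optional[Tuple[str, ...]] = None


def is_valid_tag(rule, value=None):
    """Return True if `value` is acceptable for a tag governed by `rule`."""
    if rule.value_type == 'none':
        return value is None
    if rule.value_type == 'any_number':
        # bool is a subclass of int, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if rule.value_type == 'any_string':
        return isinstance(value, str)
    if rule.value_type == 'oneof_string':
        return value in (rule.possible_values or ())
    raise ValueError('Unknown value type: ' + rule.value_type)


rules = {
    'like': TagRule('like', 'none'),
    'cars_number': TagRule('cars_number', 'any_number'),
    'vehicle_age': TagRule('vehicle_age', 'oneof_string', ('modern', 'vintage')),
}

print(is_valid_tag(rules['vehicle_age'], 'vintage'))  # allowed value
print(is_valid_tag(rules['vehicle_age'], 'chair'))    # not in the allowed set
print(is_valid_tag(rules['cars_number'], 1))          # any number is fine
```

Keeping the rules in one place, separate from the annotations, is what lets every annotation in a project be checked against the same definitions.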

The metadata available via Project directly matches what you see for that project in the Supervisely UI.

Python SDK ProjectMeta:

In [20]:
print(project.meta)
Out [20]:
Object Classes
|  Name  |   Shape   |     Color      |
|  bike  | Rectangle | [246, 255, 0]  |
|  car   |  Polygon  | [190, 85, 206] |
|  dog   |  Polygon  |  [253, 0, 0]   |
| person |   Bitmap  |  [0, 255, 18]  |
Tags
|      Name     |  Value type  |    Possible values    |
|    situated   | oneof_string | ['inside', 'outside'] |
|      like     |     none     |          None         |
|  cars_number  |  any_number  |          None         |
|   car_color   |  any_string  |          None         |
| person_gender | oneof_string |   ['male', 'female']  |
|  vehicle_age  | oneof_string | ['modern', 'vintage'] |

Compare with the same meta information in the Supervisely UI (screenshots: object classes, tag metas).

Load an image annotation and read off basic information:

In [21]:
# Grab the file paths for both raw image and annotation in one call.
item_paths = project.datasets.get('dataset_01').get_item_paths('bicycle-car.jpeg')

# Load and deserialize annotation from JSON format.
# Annotation data is cross-checked against project meta, and references to
# the right ObjClass and TagMeta objects are set up.
ann = sly.Annotation.load_json_file(item_paths.ann_path, project.meta)

print('Loaded annotation has {} labels and {} image tags.'.format(len(ann.labels), len(ann.img_tags)))
print('Label class names: ' + (', '.join(label.obj_class.name for label in ann.labels)))
print('Image tags: ' + (', '.join(tag.get_compact_str() for tag in ann.img_tags)))
Out [21]:
Loaded annotation has 2 labels and 2 image tags.
Label class names: bike, car
Image tags: like, cars_number:1

We can easily render all the labels in the annotation to a bitmap for visualization:

In [22]:
# Basic imaging functionality and Jupyter image display helpers.
import numpy as np
from matplotlib import pyplot as plt
%matplotlib inline

# A helper to display several images in a row.
# Can be safely skipped - not essential for understanding the rest of the code.
def display_images(images, figsize=None):
    plt.figure(figsize=(figsize if (figsize is not None) else (15, 15)))
    for i, img in enumerate(images, start=1):
        plt.subplot(1, len(images), i)
        plt.imshow(img)

# Set up a 3-channel black canvas to render annotation labels on.
# Make the canvas size match the original image size.
ann_render = np.zeros(ann.img_size + (3,), dtype=np.uint8)

# Render all the labels using colors from the meta information.
ann.draw(ann_render)

# Set up canvas to draw label contours.
ann_contours = np.zeros(ann.img_size + (3,), dtype=np.uint8)

# Draw thick contours for the labels on a separate canvas.
ann.draw_contour(ann_contours, thickness=7)

# Load the original image too.
img = sly.image.read(item_paths.img_path)

# Display everything.
display_images([img, ann_render, ann_contours])
Out [22]:
<Figure size 1080x1080 with 3 Axes>

Next we turn to working with labeling data on a finer grained level, dealing with individual semantic objects within the annotation.

Geometric labels

A Label is the primary type that we work with when processing annotations. A label is a combination of three main components:

  1. ObjClass instance from the project meta indicating the semantic class of the label.
  2. Geometrical data, represented as a specific object (Bitmap, Rectangle, Polygon) that inherits from a common Geometry interface.
  3. Other meta-information (tags, description).
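The benefit of a common geometry interface can be illustrated with a small standalone sketch. The classes below are hypothetical stand-ins (`Shape`, `Box`, `Outline`), not the SDK's Geometry, Rectangle and Polygon. The area formulas are also assumptions: the box uses inclusive pixel coordinates and the polygon uses the shoelace formula, while the SDK may compute area differently (for instance by counting rasterized pixels).

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    """Hypothetical common interface; not the SDK's Geometry class."""

    @abstractmethod
    def shape_name(self):
        ...

    @property
    @abstractmethod
    def area(self):
        ...


class Box(Shape):
    def __init__(self, top, left, bottom, right):
        self.top, self.left, self.bottom, self.right = top, left, bottom, right

    def shape_name(self):
        return 'rectangle'

    @property
    def area(self):
        # Inclusive pixel coordinates: columns 0..9 span 10 pixels.
        return float((self.bottom - self.top + 1) * (self.right - self.left + 1))


class Outline(Shape):
    def __init__(self, points):
        self.points = list(points)  # [(x, y), ...] vertices in order

    def shape_name(self):
        return 'polygon'

    @property
    def area(self):
        # Shoelace formula: half the absolute sum of cross products
        # of consecutive vertices.
        total = 0.0
        closed = self.points[1:] + self.points[:1]
        for (x1, y1), (x2, y2) in zip(self.points, closed):
            total += x1 * y2 - x2 * y1
        return abs(total) / 2.0


shapes = [Box(0, 0, 9, 19), Outline([(0, 0), (4, 0), (4, 3), (0, 3)])]
for shape in shapes:
    # The calling code never needs to know which concrete shape it has.
    print('{}: area {}'.format(shape.shape_name(), shape.area))
```

Because both classes expose the same two members, code that loops over labels can stay free of type checks.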

Let us first explore existing labels:

In [23]:
for label in ann.labels:
    print('Label class: {}; type: {}; label area: {}'.format(
        label.obj_class.name, label.geometry.geometry_name(), label.geometry.area))
    for tag in label.tags:
        print('Tag: ' + tag.get_compact_str())
Out [23]:
Label class: bike; type: rectangle; label area: 88109.0

Label class: car; type: polygon; label area: 323207.0
Tag: vehicle_age:vintage
Tag: car_color:red

One can easily filter by all the available properties. For example, let us find all the images in the project that contain a sufficiently large car:

In [24]:
filtered_items = []
for dataset in project:
    for item_name in dataset:
        ann_path = dataset.get_ann_path(item_name)
        ann = sly.Annotation.load_json_file(ann_path, project.meta)
        if any(label.obj_class.name == 'car' and label.geometry.area > 200000
               for label in ann.labels):
            filtered_items.append((dataset.name, item_name))
print(sorted(filtered_items))
Out [24]:
[('dataset_01', 'bicycle-car.jpeg'), ('dataset_01', 'car-people-indoors.jpeg')]
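The same filtering pattern extends to several properties at once. The following sketch works on plain Python records rather than SDK objects: `LabelRecord` and `has_matching_label` are hypothetical names, the first annotation mirrors the label values printed earlier for bicycle-car.jpeg, and the second one is made up for contrast.

```python
from collections import namedtuple

# Hypothetical plain-Python record; the SDK's Label class carries richer data.
LabelRecord = namedtuple('LabelRecord', ['class_name', 'area', 'tags'])


def has_matching_label(labels, class_name, min_area=0.0, required_tag=None):
    """True if any label has the class, exceeds min_area and carries required_tag (if given)."""
    return any(
        label.class_name == class_name
        and label.area > min_area
        and (required_tag is None or required_tag in label.tags)
        for label in labels
    )


annotations = {
    'bicycle-car.jpeg': [
        LabelRecord('bike', 88109.0, ()),
        LabelRecord('car', 323207.0, ('vehicle_age:vintage', 'car_color:red')),
    ],
    'made-up-example.jpeg': [  # Invented data, not part of the tutorial project.
        LabelRecord('car', 1500.0, ('car_color:blue',)),
    ],
}

large_red_cars = sorted(
    name for name, labels in annotations.items()
    if has_matching_label(labels, 'car', min_area=200000, required_tag='car_color:red')
)
print(large_red_cars)
```

Wrapping the condition in a named predicate keeps the dataset loop unchanged when the filtering criteria grow.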

Label rendering and rasterization

A typical step in training segmentation neural networks is marking every pixel of a training image with its semantic class. With the Supervisely SDK, this can be done in a uniform way, regardless of whether the underlying label is a rectangle, a polygon or a binary mask:

In [25]:
# Load the data and set up black canvas.
item_paths = project.datasets.get('dataset_01').get_item_paths('bicycle-car.jpeg')
ann = sly.Annotation.load_json_file(item_paths.ann_path, project.meta)
img = sly.image.read(item_paths.img_path)
rendered_labels = np.zeros(ann.img_size + (3,), dtype=np.uint8)

for label in ann.labels:
    print('Label type: ' + label.geometry.geometry_name())
    # Same call for any geometry type.
    label.draw(rendered_labels)

display_images([img, rendered_labels])
Out [25]:
Label type: rectangle
Label type: polygon
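For segmentation training, the rendered result is often a map of class indices rather than a color image. Below is a minimal pure-Python sketch of that idea, independent of the SDK. The function names, the inclusive rectangle coordinates, the pixel-center test for polygons and the toy labels are all assumptions made for illustration; the SDK's own rasterization may treat boundary pixels differently.

```python
def point_in_polygon(x, y, vertices):
    """Even-odd rule: count the edges crossed by a ray going right from (x, y)."""
    inside = False
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        if (y1 > y) != (y2 > y):  # Edge spans the horizontal line through y.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def render_class_map(height, width, labels):
    """Return a height x width grid of class ids.

    0 is background; labels later in the list overwrite earlier ones.
    """
    class_map = [[0] * width for _ in range(height)]
    for class_id, kind, data in labels:
        for row in range(height):
            for col in range(width):
                if kind == 'rectangle':
                    top, left, bottom, right = data
                    covered = top <= row <= bottom and left <= col <= right
                else:  # 'polygon': test the pixel center against the outline.
                    covered = point_in_polygon(col + 0.5, row + 0.5, data)
                if covered:
                    class_map[row][col] = class_id
    return class_map


toy_labels = [
    (1, 'rectangle', (1, 1, 2, 4)),            # (top, left, bottom, right), inclusive
    (2, 'polygon', [(4, 0), (8, 0), (8, 4)]),  # (x, y) vertices of a triangle
]
class_map = render_class_map(5, 8, toy_labels)
for row in class_map:
    print(''.join(str(value) for value in row))
```

The calling code treats both label kinds the same way and only the coverage test differs, which is the same separation the uniform draw call provides.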