Training models with Panoptic Segmentation in Detectron2

Introduction

A paper [1] published in April 2019 describes panoptic segmentation, a method that combines semantic segmentation (assigning each pixel a class label) and instance segmentation (finding individual objects along with their shape and label). Detectron2 has supported panoptic segmentation since October 2019, and in this tutorial we'll show how easy it is to train your own model with it.

[1] Kirillov, Alexander et al. (2019). Panoptic Segmentation. arXiv:1801.00868v3

Prerequisites

We tested this tutorial on Ubuntu 18.04, but it should also work on other systems. On those, the installation of the NVIDIA driver and the required dependencies may differ from the instructions below.

NVIDIA GPU

You need a CUDA-enabled graphics card with at least 11 GB of GPU memory, e.g. an NVIDIA GeForce RTX 2080 Ti, because instance segmentation is extremely memory hungry.
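
To check whether your card meets this requirement, you can query nvidia-smi once the driver is installed. The following is a minimal sketch, not part of the original tutorial scripts: the helper names and the 11000 MiB threshold are our own assumptions (nvidia-smi reports memory in MiB, so the threshold only approximates the 11 GB figure above).

```python
import shutil
import subprocess

# Assumption: roughly the 11 GB mentioned above, expressed in MiB.
MIN_MEMORY_MIB = 11000


def parse_gpu_memory(smi_output):
    """Parse 'name, memory' CSV lines into a list of (name, MiB) tuples."""
    gpus = []
    for line in smi_output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, memory = line.rsplit(",", 1)
        gpus.append((name.strip(), int(memory)))
    return gpus


def query_gpus():
    """Ask nvidia-smi for GPU names and total memory; return [] if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        output = subprocess.check_output(
            [
                "nvidia-smi",
                "--query-gpu=name,memory.total",
                "--format=csv,noheader,nounits",
            ],
            universal_newlines=True,
        )
    except (subprocess.CalledProcessError, OSError):
        return []
    return parse_gpu_memory(output)


if __name__ == "__main__":
    for name, mib in query_gpus():
        verdict = "OK" if mib >= MIN_MEMORY_MIB else "probably too small"
        print("{}: {} MiB ({})".format(name, mib, verdict))
```

If the script prints nothing, either nvidia-smi is not on the PATH or the driver is not loaded yet.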

NVIDIA Driver

If the NVIDIA driver is not pre-installed, you can install it on Ubuntu with sudo apt install nvidia-XXX (where XXX is the version; the newest one at the time of writing is 440). Alternatively, download the appropriate NVIDIA driver (for Linux) and execute the binary with sudo.

CUDA

On Ubuntu 18.04, install CUDA 10.2 with the following script (from NVIDIA Developer):

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget http://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-10-2-local-10.2.89-440.33.01/7fa2af80.pub
sudo apt-get update
sudo apt-get -y install cuda

You can find setup instructions for other systems on the NVIDIA Developer website.

Install Detectron2

Dependencies

The current version of Detectron2 requires

Python ≥ 3.6
PyTorch ≥ 1.4

On Ubuntu, run the following lines in Bash (get pip with sudo apt install python3-pip):

# Install PyTorch and other dependencies
pip install --user torch torchvision tensorboard cython
# Install OpenCV (optional)
sudo apt install python3-opencv
pip install --user opencv-python
# Install fvcore
pip install --user 'git+https://github.com/facebookresearch/fvcore'
# Install pycocotools
pip install --user 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
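
Once the packages are installed, you may want to confirm that the versions satisfy the minimum requirements listed above. The sketch below is one way to do that; the helper names parse_version and meets_minimum are hypothetical, and the PyTorch check is skipped if torch cannot be imported.

```python
import re
import sys


def parse_version(text):
    """Extract the leading numeric part of a version, e.g. '1.4.0+cu100' -> (1, 4, 0)."""
    match = re.match(r"(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError("no version number found in {!r}".format(text))
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(version, minimum):
    """Return True if the version string is at least the `minimum` tuple."""
    return parse_version(version) >= minimum


if __name__ == "__main__":
    print("Python >= 3.6:", sys.version_info >= (3, 6))
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        print("PyTorch >= 1.4:", meets_minimum(torch.__version__, (1, 4)))
```

Comparing integer tuples rather than raw strings avoids the pitfall that "1.10" sorts before "1.4" alphabetically.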

Download and install Detectron2

In the newest version (0.1.2) of Detectron2, you need to set the environment variable CUDA_HOME to the location of the CUDA library. On Ubuntu, it is under /usr/local/cuda-XX.X/.

export FORCE_CUDA="1"
export CUDA_HOME="/usr/local/cuda-10.2/"
git clone https://github.com/facebookresearch/detectron2
cd detectron2
pip install .
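
If you are unsure which CUDA versions are installed on your machine, a short script can list the candidates for CUDA_HOME. This is a rough sketch under the assumption that CUDA lives in directories named cuda-XX.X below /usr/local, as described above; find_cuda_homes is a hypothetical helper, not part of Detectron2.

```python
import os
import re


def find_cuda_homes(root="/usr/local"):
    """Return versioned CUDA directories below `root`, newest version first."""
    pattern = re.compile(r"^cuda-(\d+)\.(\d+)$")
    found = []
    if not os.path.isdir(root):
        return found
    for name in os.listdir(root):
        match = pattern.match(name)
        path = os.path.join(root, name)
        if match and os.path.isdir(path):
            # Compare versions numerically so that 10.10 ranks above 10.2.
            version = (int(match.group(1)), int(match.group(2)))
            found.append((version, path))
    found.sort(reverse=True)
    return [path for _, path in found]


if __name__ == "__main__":
    for path in find_cuda_homes():
        print(path)
```

The first path printed belongs to the newest installed version; whether that is the one you want depends on which CUDA version your PyTorch build expects.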

If you still encounter problems, check out the official installation guide.

Training the model

We base this tutorial on the Detectron2 Beginner's Tutorial and train a balloon detector.

The setup for panoptic segmentation is very similar to instance segmentation. However, as in semantic segmentation, you have to tell Detectron2 the pixel-wise labelling of the whole image, e.g. using an image where the colours encode the labels.

    # ...
    record["height"] = height
    record["width"] = width
    # Pixel-wise segmentation
    record["sem_seg_file_name"] = os.path.join(img_dir, "segmentation", v["filename"])

    # ...
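
Since every record now points to a mask file, it can be worth checking that those files actually exist before starting a training run. The following sketch assumes the records are plain dicts with the keys used above; missing_masks is a hypothetical helper, not a Detectron2 function.

```python
import os


def missing_masks(dataset_dicts):
    """Return (file_name, mask_path) pairs whose mask image is absent.

    A record without a "sem_seg_file_name" key is reported with a
    mask path of None.
    """
    missing = []
    for record in dataset_dicts:
        mask_path = record.get("sem_seg_file_name")
        if mask_path is None or not os.path.isfile(mask_path):
            missing.append((record.get("file_name"), mask_path))
    return missing
```

You could call it on the list returned by your dataset function and abort if the result is non-empty.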

You can generate the mask images with the script provided for this demo.
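
The demo script is not reproduced here. To illustrate the idea, the sketch below rasterises polygons into a pixel-wise label grid using only the standard library. It makes several assumptions of our own: polygons are given as lists of x and y coordinates (the balloon annotations store them as all_points_x and all_points_y), background pixels get label 0 and balloon pixels label 1, and a pixel counts as inside if its centre lies within the polygon. The actual script may use different labels and will be much faster with an image library.

```python
def point_in_polygon(x, y, xs, ys):
    """Even-odd rule: count polygon edges crossed by a ray going right from (x, y)."""
    inside = False
    j = len(xs) - 1
    for i in range(len(xs)):
        xi, yi = xs[i], ys[i]
        xj, yj = xs[j], ys[j]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (yi > y) != (yj > y):
            x_cross = xi + (y - yi) * (xj - xi) / (yj - yi)
            if x < x_cross:
                inside = not inside
        j = i
    return inside


def rasterise(height, width, polygons, label=1):
    """Return a height x width grid with `label` inside any polygon, 0 elsewhere."""
    mask = [[0] * width for _ in range(height)]
    for xs, ys in polygons:
        for row in range(height):
            for col in range(width):
                # Test the centre of each pixel.
                if point_in_polygon(col + 0.5, row + 0.5, xs, ys):
                    mask[row][col] = label
    return mask


if __name__ == "__main__":
    square = ([1, 3, 3, 1], [1, 1, 3, 3])
    for line in rasterise(4, 4, [square]):
        print(line)
```

Saving such a grid as a single-channel image gives a mask file that a record's sem_seg_file_name can point to.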

If you want to visualise the dataset with Detectron2's Visualizer, add an empty list of stuff classes to the metadata. "Things" are well-defined, countable objects, while "stuff" refers to amorphous regions that carry a label other than the background.

    # ...
    MetadataCatalog.get("balloon_" + d).set(thing_classes=["balloon"], stuff_classes=[])
    # ...

Otherwise, the Visualizer complains:

AttributeError: Attribute 'stuff_classes' does not exist in the metadata of 'balloon_train'. Available keys are dict_keys(['name', 'thing_classes']).

Results

Training with the default settings takes a little over a minute on an NVIDIA Tesla V100 and requires about 9 GiB of GPU memory (instance segmentation training takes about 6 GiB). The resulting model does not necessarily perform better than plain instance segmentation, which is no surprise given the dataset and task (balloon detection).

However, if you want to train a model that can both detect instances and distinguish between different backgrounds, e.g. sky, ocean and sand on a beach, or street, houses and vegetation in a cityscape, then panoptic segmentation may be the right choice for you.

Parallelisation

Panoptic segmentation, like semantic segmentation, is very memory hungry, and you'll soon hit the limits of your GPU, e.g. if you increase the batch size (SOLVER.IMS_PER_BATCH) from 2 to 8:

RuntimeError: CUDA out of memory. Tried to allocate x.xx GiB (GPU 0; xx.xx GiB total capacity; xx.xx GiB already allocated; x.xx GiB free; xx.xx GiB reserved in total by PyTorch)

If you have multiple GPUs, you can use the handy launch function provided by Detectron2 (in the module detectron2.engine.launch) to distribute the training across them:

    # `launch` is available via: from detectron2.engine import launch
    launch(
        train, # function to be parallelised across multiple GPUs
        4, # Number of GPUs per machine
        num_machines=1,
        machine_rank=0,
        dist_url="tcp://127.0.0.1:1234",
        args=(cfg,), # arguments to the function `train`
    )

📌 You can also find the scripts from this tutorial in our GitHub repo.
