Training models with Panoptic Segmentation in Detectron2

Tutorial on how to train your own models with panoptic segmentation in Detectron2.


14 May 2020, by Boyang Xia


Panoptic Segmentation

Introduction

A paper [1] that came out in April 2019 describes panoptic segmentation, a method combining semantic segmentation (assigning each pixel a class label) and instance segmentation (finding individual objects with their shape and label). Detectron2 has supported panoptic segmentation since October 2019, and in this tutorial we'll show how easy it is to train your own model with it.

[1] Kirillov, Alexander et al. (2019). Panoptic Segmentation. arXiv:1801.00868v3

Prerequisites

We tested this tutorial on Ubuntu 18.04, but it should also work on other systems, where the installation of the NVIDIA driver and the required dependencies may differ from the instructions below.

NVIDIA GPU

You need a CUDA-enabled graphics card with at least 11 GB of GPU memory, e.g. an NVIDIA GeForce RTX 2080 Ti, because instance segmentation is extremely memory hungry.

NVIDIA Driver

If the NVIDIA driver is not pre-installed, you can install it on Ubuntu with sudo apt install nvidia-driver-XXX (XXX is the version; the newest one at the time of writing is 440). Alternatively, download the appropriate NVIDIA driver (for Linux) and execute the binary with sudo.

CUDA

On Ubuntu 18.04, install CUDA 10.2 with the following script (from NVIDIA Developer):

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget http://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-10-2-local-10.2.89-440.33.01/7fa2af80.pub
sudo apt-get update
sudo apt-get -y install cuda

You can find setup instructions for other systems on the NVIDIA Developer website.

Install Detectron2

Dependencies

The current version of Detectron2 requires

  • Python ≥ 3.6
  • PyTorch ≥ 1.4

On Ubuntu, run the following lines in Bash (get pip with sudo apt install python3-pip):

# Install PyTorch and other dependencies
pip install --user torch torchvision tensorboard cython
# Install OpenCV (optional)
sudo apt install python3-opencv
pip install --user opencv-python
# Install fvcore
pip install --user 'git+https://github.com/facebookresearch/fvcore'
# Install pycocotools
pip install --user 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
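
Once the packages are installed, the two version requirements listed above can be checked from Python. The following is only a minimal sketch: the helper names version_tuple and meets_minimum are made up for this illustration, and the parsing assumes version strings that start with dotted numbers, such as 1.5.0+cu101.

```python
import importlib.util
import re
import sys

# Minimum versions taken from the requirements listed above.
MIN_PYTHON = (3, 6)
MIN_TORCH = (1, 4)


def version_tuple(version):
    """Turn a string such as '1.5.0+cu101' into (1, 5, 0)."""
    numeric = re.match(r"\d+(\.\d+)*", version)
    if numeric is None:
        raise ValueError("no leading version number in %r" % version)
    return tuple(int(part) for part in numeric.group(0).split("."))


def meets_minimum(version, minimum):
    """Return True if the version string is at least `minimum`."""
    return version_tuple(version) >= tuple(minimum)


if __name__ == "__main__":
    print("Python OK:", sys.version_info[:2] >= MIN_PYTHON)
    if importlib.util.find_spec("torch") is None:
        print("PyTorch is not installed")
    else:
        import torch

        print("PyTorch OK:", meets_minimum(torch.__version__, MIN_TORCH))
        print("CUDA visible to PyTorch:", torch.cuda.is_available())
```

If the last line prints False, the driver or CUDA installation from the previous sections is the first thing to revisit.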

Download and install Detectron2

In the newest version of Detectron2 at the time of writing (0.1.2), you need to set the environment variable CUDA_HOME to the location of the CUDA toolkit. On Ubuntu, it is under /usr/local/cuda-XX.X/.

export FORCE_CUDA="1"
export CUDA_HOME="/usr/local/cuda-10.2/"
git clone https://github.com/facebookresearch/detectron2
cd detectron2
pip install .
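
If the build cannot find CUDA, it may help to check what CUDA_HOME actually points to before retrying. The sketch below uses only the Python standard library. The function check_cuda_home is a hypothetical helper, and it assumes that a usable toolkit has the nvcc compiler in its bin directory.

```python
import os


def check_cuda_home(environ=os.environ):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    cuda_home = environ.get("CUDA_HOME")
    if not cuda_home:
        problems.append("CUDA_HOME is not set")
        return problems
    if not os.path.isdir(cuda_home):
        problems.append("CUDA_HOME does not point to a directory: " + cuda_home)
        return problems
    # Assumption: a usable toolkit ships the nvcc compiler in <CUDA_HOME>/bin.
    nvcc = os.path.join(cuda_home, "bin", "nvcc")
    if not os.path.isfile(nvcc):
        problems.append("nvcc not found at " + nvcc)
    return problems


if __name__ == "__main__":
    for message in check_cuda_home() or ["CUDA_HOME looks fine"]:
        print(message)
```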

If you still encounter problems, check out the official installation guide.

Training the model

We base the tutorial on the Detectron2 Beginner's Tutorial and train a balloon detector.

The setup for panoptic segmentation is very similar to that for instance segmentation. However, as in semantic segmentation, you have to give Detectron2 the pixel-wise labelling of the whole image, e.g. as a grayscale image whose pixel values are the integer labels.

    # ...
    record["height"] = height
    record["width"] = width
    # Pixel-wise segmentation
    record["sem_seg_file_name"] = os.path.join(img_dir, "segmentation", v["filename"])

    # ...
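
To show where the extra field sits in a complete record, here is a small sketch of a record builder. The helpers build_record and records_without_mask are hypothetical names for this illustration. The keys file_name, image_id and annotations follow Detectron2's standard dataset dicts, and the mask path mirrors the snippet above.

```python
import os


def build_record(img_dir, filename, image_id, height, width):
    """Build one dataset record that also points to its label image."""
    return {
        "file_name": os.path.join(img_dir, filename),
        "image_id": image_id,
        "height": height,
        "width": width,
        # Pixel-wise segmentation, same location as in the snippet above
        "sem_seg_file_name": os.path.join(img_dir, "segmentation", filename),
        # Instance annotations are added as in the instance segmentation setup
        "annotations": [],
    }


def records_without_mask(records):
    """Return the records whose label image does not exist on disk."""
    return [r for r in records if not os.path.isfile(r["sem_seg_file_name"])]
```

Running records_without_mask over the dataset before training is a cheap way to catch missing label images early.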

You can generate the mask images with the script provided for this demo.
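
To illustrate what such a mask contains, the sketch below rasterises polygon annotations into a label grid in pure Python. It is not the script mentioned above. The function names are made up for this example, and the label values (0 for background, 1 for balloon) are assumptions that have to match the convention used by your dataset. The polygon format, separate lists of x and y coordinates, follows the all_points_x and all_points_y fields of the balloon annotations.

```python
def point_in_polygon(x, y, xs, ys):
    """Even-odd rule: count polygon edges crossed by a ray going right from (x, y)."""
    inside = False
    j = len(xs) - 1
    for i in range(len(xs)):
        # Only edges that straddle the horizontal line through y can be crossed.
        if (ys[i] > y) != (ys[j] > y):
            x_cross = xs[i] + (y - ys[i]) * (xs[j] - xs[i]) / (ys[j] - ys[i])
            if x < x_cross:
                inside = not inside
        j = i
    return inside


def rasterize_polygons(height, width, polygons, label=1, background=0):
    """Return a height x width grid (list of rows) with `label` inside the polygons."""
    mask = [[background] * width for _ in range(height)]
    for xs, ys in polygons:
        for row in range(height):
            for col in range(width):
                # Test the pixel centre against the polygon outline.
                if point_in_polygon(col + 0.5, row + 0.5, xs, ys):
                    mask[row][col] = label
    return mask
```

This loop visits every pixel for every polygon, so it is only suitable for small examples. For real images, a library rasteriser is the better choice, and the resulting grid would be saved as a single-channel image at the path stored in sem_seg_file_name.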

If you want to visualise the dataset with Detectron2's Visualizer, add an empty list of stuff classes to the metadata. "Things" are well-defined, countable objects, while "stuff" refers to amorphous regions that carry a label other than the background.

    # ...
    MetadataCatalog.get("balloon_" + d).set(thing_classes=["balloon"], stuff_classes=[])
    # ...

Otherwise Visualizer complains:

AttributeError: Attribute 'stuff_classes' does not exist in the metadata of 'balloon_train'. Available keys are dict_keys(['name', 'thing_classes']).

Results

Training with the default settings takes a bit more than a minute on an NVIDIA Tesla V100 and requires about 9 GiB of GPU memory (instance segmentation training takes about 6 GiB). The resulting model does not necessarily perform any better than plain instance segmentation, which is not surprising given the dataset and task (balloon detection).

However, if you want to train a model that can both detect instances and distinguish between different backgrounds, e.g. sky, ocean and sand on a beach, or street, houses and vegetation in a cityscape, then panoptic segmentation may be the right choice for you.

Parallelisation

Panoptic segmentation, like semantic segmentation, is very memory hungry, and you'll soon hit the limit, e.g. if you increase the batch size (SOLVER.IMS_PER_BATCH) from 2 to 8:

RuntimeError: CUDA out of memory. Tried to allocate x.xx GiB (GPU 0; xx.xx GiB total capacity; xx.xx GiB already allocated; x.xx GiB free; xx.xx GiB reserved in total by PyTorch)

If you have multiple GPUs, you can use the handy function launch provided by Detectron2 (in the module detectron2.engine.launch) to distribute the training across several GPUs:

    from detectron2.engine import launch

    launch(
        train,  # function to be parallelised across multiple GPUs
        4,  # number of GPUs per machine
        num_machines=1,
        machine_rank=0,
        dist_url="tcp://127.0.0.1:1234",
        args=(cfg,),  # arguments to the function `train`
    )
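
When choosing the batch size for a multi-GPU run, keep in mind that SOLVER.IMS_PER_BATCH in Detectron2 is the total number of images per batch across all GPUs, not the number per GPU. The small helper below only does this arithmetic. The name images_per_gpu is hypothetical and not part of Detectron2.

```python
def images_per_gpu(ims_per_batch, num_gpus_per_machine, num_machines=1):
    """Split the total batch size evenly across all participating GPUs."""
    world_size = num_gpus_per_machine * num_machines
    if ims_per_batch <= 0 or ims_per_batch % world_size != 0:
        raise ValueError(
            "batch size %d cannot be split evenly across %d GPUs"
            % (ims_per_batch, world_size)
        )
    return ims_per_batch // world_size
```

With the settings above, images_per_gpu(8, 4) gives 2, so each of the four GPUs processes as many images per step as the single-GPU run with a batch size of 2.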

📌 You can also find the scripts from this tutorial in our GitHub repo.
