Training models with Panoptic Segmentation in Detectron2
Tutorial on how to train your own models with panoptic segmentation in Detectron2.
A paper came out in April last year describing a method that combines semantic segmentation (assigning each pixel a class label) and instance segmentation (finding individual objects with their shape and label). Detectron2 has supported panoptic segmentation since last October, and in this tutorial we'll show how easy it is to train your own model with it.
 Kirillov, Alexander et al. (2019). Panoptic Segmentation. arXiv:1801.00868v3
We tested this tutorial on Ubuntu 18.04, but it should also work on other systems. The installations of the NVIDIA driver and required dependencies may deviate from the instructions below.
You need a CUDA-enabled graphics card with at least 11 GB of GPU memory, e.g. an NVIDIA GeForce RTX 2080 Ti, because instance segmentation is extremely memory hungry.
If the NVIDIA driver is not pre-installed, you can install it on Ubuntu with
sudo apt install nvidia-XXX
(XXX is the version; the newest one is 440). Alternatively, download the appropriate NVIDIA driver (for Linux) and run the binary with sudo.
On Ubuntu 18.04, install CUDA 10.2 with the following script (from NVIDIA Developer):
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget http://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-10-2-local-10.2.89-440.33.01/7fa2af80.pub
sudo apt-get update
sudo apt-get -y install cuda
You can find setup instructions for other systems on the NVIDIA Developer website.
The current version of Detectron2 requires
- Python ≥ 3.6
- PyTorch ≥ 1.4
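Before installing anything else, it can help to confirm that the interpreter and PyTorch meet these minimums. The snippet below is a minimal sketch, not part of Detectron2: the helper names parse_version, meets_minimum and check_requirements are made up for this illustration, and the version parsing only handles plain dotted versions with an optional build suffix such as "+cu100".

```python
import sys


def parse_version(text):
    """Turn a string like '1.4.0+cu100' into the tuple (1, 4, 0)."""
    core = text.split("+")[0]  # drop a local build suffix, if any
    parts = []
    for piece in core.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(version, minimum):
    """True if the dotted version string is at least `minimum` (a tuple)."""
    return parse_version(version) >= minimum


def check_requirements():
    """Return (python_ok, torch_ok) for the minimums listed above."""
    python_ok = sys.version_info[:2] >= (3, 6)
    try:
        import torch  # only needed for the version string
        torch_ok = meets_minimum(torch.__version__, (1, 4))
    except ImportError:
        torch_ok = False
    return python_ok, torch_ok


if __name__ == "__main__":
    print("Python OK: %s, PyTorch OK: %s" % check_requirements())
```

If PyTorch is not installed yet, the second value is simply False, so the check can be run both before and after the pip commands.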
On Ubuntu, run the following lines in Bash (get pip with
sudo apt install python3-pip):
# Install PyTorch and other dependencies
pip install --user torch torchvision tensorboard cython
# Install OpenCV (optional)
sudo apt install python3-opencv
pip install --user opencv-python
# Install fvcore
pip install --user 'git+https://github.com/facebookresearch/fvcore'
# Install pycocotools
pip install --user 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
Download and install Detectron2
In the newest version (0.1.2) of Detectron2, you need to set the environment variable
CUDA_HOME to the location of the CUDA library. On Ubuntu, it is under the path exported below:
export FORCE_CUDA="1"
export CUDA_HOME="/usr/local/cuda-10.2/"
git clone https://github.com/facebookresearch/detectron2
cd detectron2
pip install .
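If the build fails, a quick way to rule out a wrong CUDA_HOME is to check that the variable is set and points to an existing directory. The following is a small sketch using only the Python standard library; the function name cuda_home_status and its three return values are invented for this example and are not part of Detectron2 or CUDA.

```python
import os


def cuda_home_status(env=None):
    """Return 'unset', 'missing' or 'ok' for the CUDA_HOME entry in `env`.

    `env` defaults to the current process environment; passing a plain
    dict makes the function easy to try out without touching the shell.
    """
    env = os.environ if env is None else env
    cuda_home = env.get("CUDA_HOME")
    if not cuda_home:
        return "unset"  # variable not exported (or empty)
    if not os.path.isdir(cuda_home):
        return "missing"  # exported, but the directory does not exist
    return "ok"


if __name__ == "__main__":
    print("CUDA_HOME:", cuda_home_status())
```

Run it in the same shell in which you exported the variables, since exports do not carry over to other terminal sessions.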
If you still encounter problems, check out the official installation guide.
Training the model
We base the tutorial on the Detectron2 Beginner's Tutorial and train a balloon detector.
The setup for panoptic segmentation is very similar to instance segmentation. However, as in semantic segmentation, you have to tell Detectron2 the pixel-wise labelling of the whole image, e.g. using an image where the colours encode the labels.
# ...
record["height"] = height
record["width"] = width
# Pixel-wise segmentation
record["sem_seg_file_name"] = os.path.join(img_dir, "segmentation", v["filename"])
# ...
You can generate the mask images with the script provided for this demo.
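To give a rough idea of what such a mask contains, here is a sketch that rasterises polygon annotations into a grid of per-pixel labels. It is not the demo script: the function names, the label values (0 for background, 1 for balloon) and the nested-list output are assumptions made for illustration, and a real script would write the result to an image file with a library such as OpenCV or Pillow.

```python
def point_in_polygon(x, y, xs, ys):
    """Even-odd rule: is the point (x, y) inside the polygon (xs, ys)?"""
    inside = False
    j = len(xs) - 1
    for i in range(len(xs)):
        # Does the edge from vertex j to vertex i straddle the line at height y?
        if (ys[i] > y) != (ys[j] > y):
            # x coordinate where that edge crosses the horizontal line at y
            x_cross = xs[i] + (y - ys[i]) * (xs[j] - xs[i]) / (ys[j] - ys[i])
            if x < x_cross:
                inside = not inside
        j = i
    return inside


def rasterize_mask(height, width, polygons, label=1, background=0):
    """Build a height x width grid of labels from a list of (xs, ys) polygons."""
    mask = [[background] * width for _ in range(height)]
    for xs, ys in polygons:
        for row in range(height):
            for col in range(width):
                # Test the pixel centre against the polygon outline.
                if point_in_polygon(col + 0.5, row + 0.5, xs, ys):
                    mask[row][col] = label
    return mask


if __name__ == "__main__":
    # A 2x2 square in the middle of a 4x4 image.
    for line in rasterize_mask(4, 4, [([1, 3, 3, 1], [1, 1, 3, 3])]):
        print(line)
```

The nested loops are slow for real image sizes; the point is only to show that every pixel ends up with exactly one label.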
If you want to visualise the dataset with Detectron2's
Visualizer, add an empty list of stuff classes to the metadata. "Things" are well-defined, countable objects,
while "stuff" is an amorphous region with a label other than the background.
# ...
MetadataCatalog.get("balloon_" + d).set(thing_classes=["balloon"], stuff_classes=[])
# ...
Otherwise, you will get an error like this:
AttributeError: Attribute 'stuff_classes' does not exist in the metadata of 'balloon_train'. Available keys are dict_keys(['name', 'thing_classes']).
Training with the default settings takes a bit more than a minute on an NVIDIA Tesla V100 and requires about 9 GiB of GPU memory (instance segmentation training takes about 6 GiB). The resulting model does not necessarily perform any better than plain instance segmentation, which is no surprise given the dataset and task (balloon detection).
However, if you want to train a model that can both detect instances and distinguish between different backgrounds, e.g. sky, ocean and sand on a beach, or street, houses and vegetation in a cityscape, then panoptic segmentation may be the right choice for you.
Panoptic segmentation, like semantic segmentation, is very memory hungry, and you will soon hit the limits, e.g. if you increase the batch size (SOLVER.IMS_PER_BATCH) from 2 to 8:
RuntimeError: CUDA out of memory. Tried to allocate x.xx GiB (GPU 0; xx.xx GiB total capacity; xx.xx GiB already allocated; x.xx GiB free; xx.xx GiB reserved in total by PyTorch)
If you have multiple GPUs, you can use the handy function
launch provided by Detectron2 (in module
detectron2.engine.launch) to distribute the training across them:
launch(
    train,  # function to be parallelised across multiple GPUs
    4,  # number of GPUs per machine
    num_machines=1,
    machine_rank=0,
    dist_url="tcp://127.0.0.1:1234",
    args=(cfg,),  # arguments to the function `train`
)
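Why this helps with memory: as far as we understand the Detectron2 config, SOLVER.IMS_PER_BATCH counts the images per batch across all GPUs, so each GPU only has to hold its share. The sketch below works through that arithmetic; the helper images_per_gpu is hypothetical and not a Detectron2 function, and the divisibility check reflects our assumption that the total batch must split evenly across workers.

```python
def images_per_gpu(ims_per_batch, num_gpus_per_machine, num_machines=1):
    """Images each GPU processes per iteration for a given total batch size."""
    world_size = num_gpus_per_machine * num_machines
    if ims_per_batch % world_size != 0:
        raise ValueError(
            "total batch size %d is not divisible by %d GPUs"
            % (ims_per_batch, world_size)
        )
    return ims_per_batch // world_size


if __name__ == "__main__":
    # Batch size 8 on the 4 GPUs used in the launch call above.
    print(images_per_gpu(8, 4))
```

With a batch size of 8 on four GPUs, each GPU sees two images per iteration, the same per-GPU load as the single-GPU default of 2.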
📌 You can also find the scripts from this tutorial in our GitHub repo.