Setting up and Running StyleGAN2

A short tutorial on setting up StyleGAN2 including troubleshooting.


29 July 2020, by Boyang Xia


Introduction

At Celantur, we use deep learning to anonymise objects in images and videos for data protection. We often share insights from our work in this blog, like how to Dockerise CUDA or how to do Panoptic Segmentation in Detectron2.

In this blog post, we want to guide you through setting up StyleGAN2[1] from NVIDIA Research, a synthetic image generator.

[1] Karras T. et al. (2020). Analyzing and Improving the Image Quality of StyleGAN. arXiv:1912.04958

Prerequisites

  • We tested this tutorial on Ubuntu 18.04, but it should also work on other systems.
  • You need a CUDA-enabled graphics card with at least 16 GB of GPU memory, e.g. NVIDIA Tesla V100.
  • StyleGAN2 requires older versions of CUDA (v10.0) and TensorFlow (v1.14 - v1.15) to run.

Setting up CUDA Toolkit 10.0

On Ubuntu 18.04, install CUDA 10.0 with the following script (from NVIDIA Developer):

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo add-apt-repository "deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/ /"
sudo apt-get update
sudo apt-get install cuda-10-0

The latest NVIDIA driver nvidia-driver-450 is a transitive dependency of the package cuda-10-0 and will be installed automatically.

You can set up CUDA 10.0 in parallel with newer CUDA versions. Each version is installed in its own directory /usr/local/cuda-xx.x/, e.g. /usr/local/cuda-10.0/.

⚠️ NOTE: /usr/local/cuda links to the latest installed version.

So if you want to use StyleGAN2 in parallel with a different framework that requires CUDA 10.2+, e.g. Detectron2, be careful to set the environment variable CUDA_HOME correctly.

You can find setup instructions for other systems on the NVIDIA Developer website.
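As a rough illustration of the point above, the following Python sketch builds an environment that points at one specific CUDA version instead of the /usr/local/cuda symlink. The helper names resolve_cuda_home and cuda_env are hypothetical, and the directory layout /usr/local/cuda-10.0 is an assumption based on the default install location. Whether a tool reads CUDA_HOME, PATH, or both depends on the tool.

```python
import os


def resolve_cuda_home(root="/usr/local", version="10.0"):
    """Return the versioned CUDA directory under root, or None if it is missing.

    Assumes the default layout <root>/cuda-<version>, e.g. /usr/local/cuda-10.0.
    """
    candidate = os.path.join(root, "cuda-" + version)
    return candidate if os.path.isdir(candidate) else None


def cuda_env(cuda_home, base_env=None):
    """Return a copy of the environment with CUDA_HOME and PATH set for cuda_home."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_HOME"] = cuda_home
    # Put the versioned bin directory first so that its nvcc wins the lookup.
    env["PATH"] = os.path.join(cuda_home, "bin") + os.pathsep + env.get("PATH", "")
    return env


if __name__ == "__main__":
    print("/usr/local/cuda resolves to:", os.path.realpath("/usr/local/cuda"))
    print("CUDA 10.0 directory:", resolve_cuda_home())
```

The returned dictionary can be passed as the env argument of subprocess.run when launching StyleGAN2, which leaves the shell's own environment untouched for other frameworks.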

Install TensorFlow 1.15

You need an older version of TensorFlow (v1.15) and Python (v3.6) to run StyleGAN2. I highly recommend making use of a package management system like conda, so that you can operate different Python and TensorFlow versions on the same OS.

Once conda is installed, you can set up a new Python 3.6 environment named "stylegan2" with

conda create -n stylegan2 python==3.6.9
# and activate it
conda activate stylegan2

Install GPU-capable TensorFlow and StyleGAN's dependencies:

pip install scipy==1.3.3 requests==2.22.0 Pillow==6.2.1
pip install tensorflow-gpu==1.15.3

⚠️ IMPORTANT: If you install the CPU-only TensorFlow (without -gpu), StyleGAN2 will not find your GPU, even with a properly installed CUDA toolkit and GPU driver.

Set up StyleGAN2

Download StyleGAN2 from GitHub:

git clone https://github.com/NVlabs/stylegan2.git

NVCC

Test that NVCC, which is required for compiling TensorFlow ops, runs properly. NVCC comes with your CUDA installation, so you don't need to install any extra packages. It resides in /usr/local/cuda/bin, and it's best to add this directory to your PATH in .bashrc:

echo 'export PATH=/usr/local/cuda/bin:$PATH' >>~/.bashrc

Restart the Bash session and run in the folder stylegan2:

nvcc test_nvcc.cu -o test_nvcc -run
# CORRECT OUTPUT:
# CPU says hello.
# GPU says hello.

Image Synthesis

Use pre-trained networks to generate some synthetic faces:

python run_generator.py generate-images --network=gdrive:networks/stylegan2-ffhq-config-f.pkl \
  --seeds=6600-6625 --truncation-psi=0.5

You can find the output in the subdirectory results.

Transform existing images

If you want to transform existing images, you need to prepare them beforehand.

In this tutorial, we use a pre-trained network for portrait photos, which requires square images of 1024x1024 pixels (in general, the resolution must be a power of 2). So first convert the portraits you want to manipulate into that format.

  1. You can use the ImageMagick tool convert. Note that -resize keeps the aspect ratio, so crop non-square images to a square first:
convert input.jpeg -resize 1024x1024 input-resized.jpeg
  2. Then generate the TFRecords:
# datasets/images: Source directory of the images in JPEG or PNG.
# datasets/tfrecords: Output directory for the TFRecords.
python dataset_tool.py create_from_images datasets/tfrecords/ datasets/images/

⚠️ IMPORTANT: Images must not contain an alpha channel!

When using PNG format, be careful that the images do not include transparency, which requires an additional alpha channel. StyleGAN2 accepts images with only one color channel (grayscale) or three channels (RGB).
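The requirements above (square, power-of-2 resolution, no alpha channel) can be checked before generating the TFRecords. The sketch below reads only the PNG header, so it needs nothing beyond the standard library. The function names are hypothetical, and it covers PNG files only. It also does not detect palette images that carry transparency in a separate tRNS chunk.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# PNG color types that include an alpha channel:
# 4 = grayscale with alpha, 6 = RGB with alpha.
ALPHA_COLOR_TYPES = (4, 6)


def read_png_header(data):
    """Parse (width, height, color_type) from the first 26 bytes of a PNG file."""
    if len(data) < 26 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    width, height, _bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return width, height, color_type


def is_power_of_two(n):
    """True for 1, 2, 4, 8, ..."""
    return n > 0 and (n & (n - 1)) == 0


def find_problems(width, height, color_type):
    """Return a list of reasons why the image does not meet the requirements."""
    problems = []
    if width != height:
        problems.append("not square")
    if not (is_power_of_two(width) and is_power_of_two(height)):
        problems.append("resolution is not a power of 2")
    if color_type in ALPHA_COLOR_TYPES:
        problems.append("has an alpha channel")
    return problems


def check_png(path):
    """Read the header of the PNG at path and list its problems."""
    with open(path, "rb") as f:
        return find_problems(*read_png_header(f.read(26)))
```

An empty list from check_png means the file passes these three checks. It says nothing about whether the resolution matches the one expected by a particular pre-trained network.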

  3. Projection to latent space.

Use the TFRecords for the projection to latent space.

# --data-dir: root dir of datasets.
# --dataset: subdirectory where the TFRecords are stored.
python run_projector.py project-real-images --network=gdrive:networks/stylegan2-ffhq-config-f.pkl --dataset=tfrecords --data-dir=datasets --num-images 3

The parameter --num-images is 3 by default. If you have fewer images in your dataset, you'll get an OutOfRangeError.
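One way to avoid the OutOfRangeError is to derive --num-images from the source directory. The sketch below counts the JPEG and PNG files and assembles the command shown above. The helper names are hypothetical, and it assumes that every image in the directory was written to the TFRecords.

```python
import os

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def count_images(image_dir):
    """Count the files in image_dir that have a JPEG or PNG extension."""
    return sum(
        1 for name in os.listdir(image_dir)
        if name.lower().endswith(IMAGE_EXTENSIONS)
    )


def projector_command(image_dir, data_dir="datasets", dataset="tfrecords",
                      network="gdrive:networks/stylegan2-ffhq-config-f.pkl"):
    """Build the run_projector.py call with --num-images set to the image count."""
    return [
        "python", "run_projector.py", "project-real-images",
        "--network=" + network,
        "--dataset=" + dataset,
        "--data-dir=" + data_dir,
        "--num-images=" + str(count_images(image_dir)),
    ]


if __name__ == "__main__":
    source_dir = "datasets/images"
    if os.path.isdir(source_dir):
        print(" ".join(projector_command(source_dir)))
```

The returned list can be handed to subprocess.run from inside the stylegan2 folder.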

The script saves snapshot images to results during the projection process and you can see how it converges to the original. You can adapt the hyperparameters in the constructor of projector.py, e.g. number of training steps and learning rate.

If you want to save the representation in the latent space as well, add the following line to run_projector.py:

def project_image(proj, targets, png_prefix, num_snapshots):
    # ... #
    while proj.get_cur_step() < proj.num_steps:
        # ... #
        if proj.get_cur_step() in snapshot_steps:

            # ADD THE LINE BELOW TO PICKLE THE LATENT REPRESENTATION
            misc.save_pkl(proj.get_dlatents(), png_prefix + 'step%04d.pkl' % proj.get_cur_step())
            # ADD THE LINE ABOVE TO PICKLE THE LATENT REPRESENTATION

            misc.save_image_grid(proj.get_images(), png_prefix + 'step%04d.png' % proj.get_cur_step(), drange=[-1,1])

You'll get the representation as a pickled NumPy array, which you can use to modify your original picture.
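As a minimal sketch of what such a modification could look like, the code below loads two pickled latent representations and blends them linearly. It assumes NumPy is installed and that both files were written by the added save_pkl line, so they have the same shape. The function names and file names are hypothetical. Feeding the result back into the generator is not covered here.

```python
import pickle

import numpy as np


def load_latent(path):
    """Load a pickled latent representation as a NumPy array."""
    with open(path, "rb") as f:
        return np.asarray(pickle.load(f))


def interpolate_latents(a, b, t):
    """Linear blend of two latents: t=0 returns a, t=1 returns b."""
    if a.shape != b.shape:
        raise ValueError("latents must have the same shape")
    return (1.0 - t) * a + t * b


if __name__ == "__main__":
    import os

    # Hypothetical file names; use the .pkl files found in your results folder.
    first, second = "first.pkl", "second.pkl"
    if os.path.exists(first) and os.path.exists(second):
        halfway = interpolate_latents(load_latent(first), load_latent(second), 0.5)
        print(halfway.shape)
```

Only unpickle files you created yourself, since loading a pickle can execute arbitrary code.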

Machine Learning at Celantur

If you find a bug in this tutorial or are interested in creating state-of-the-art ML models and deploying them in a high-availability and high-scalability cloud environment, drop us a short message and have a chat with us!

Troubleshooting

  1. Problem: nvcc does not work properly.
    Solution: nvcc depends on the config file nvcc.profile and other executables in /usr/local/cuda-xx.x/bin. Thus a symbolic link to nvcc in ~/.local/bin won't work. Add /usr/local/cuda-xx.x/bin to your PATH instead.
  2. Problem: I have installed CUDA Toolkit and the NVIDIA driver. nvcc works, nvidia-smi shows the correct GPU. Why does TensorFlow complain that it cannot find the GPU?
    Solution: Did you install the package tensorflow-gpu? tensorflow is the CPU-only version.
  3. Problem: When I try to generate the TFRecords, it complains, "Input images must be stored as RGB or grayscale"
    Solution: Remove the alpha channel and transparency, e.g. convert a PNG to JPEG.
  4. Problem: OutOfRangeError during projection into latent space.
    Solution: Explicitly set the parameter --num-images to the number of images in the dataset.