Setting up and Running StyleGAN2
A short tutorial on setting up StyleGAN2 including troubleshooting.
At Celantur, we use deep learning to anonymise objects in images and videos for data protection. We often share insights from our work in this blog, like how to Dockerise CUDA or how to do Panoptic Segmentation in Detectron2.
In this blog post, we want to guide you through setting up StyleGAN2 from NVIDIA Research, a synthetic image generator.
Karras, T. et al. (2020). Analyzing and Improving the Image Quality of StyleGAN. arXiv:1912.04958
- We tested this tutorial on Ubuntu 18.04, but it should also work on other systems.
- You need a CUDA-enabled graphics card with at least 16 GB of GPU memory, e.g. an NVIDIA Tesla V100.
- StyleGAN2 requires older versions of CUDA (v10.0) and TensorFlow (v1.14 or v1.15) to run.
Setting up CUDA Toolkit 10.0
On Ubuntu 18.04, install CUDA 10.0 with the following script (from NVIDIA Developer):
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo add-apt-repository "deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/ /"
sudo apt-get update
sudo apt-get install cuda-10-0
The latest NVIDIA driver, nvidia-driver-450, is a transitive dependency of the cuda-10-0 package and will be installed automatically.
You can set up CUDA 10.0 in parallel with newer CUDA versions, each of which is installed in its own versioned directory. Note that /usr/local/cuda links to the latest installed version.
So if you want to use StyleGAN2 in parallel with a different framework that requires CUDA 10.2+, e.g. Detectron2, be careful to set the relevant environment variable so that each framework finds its own CUDA version.
You can find setup instructions for other systems on the NVIDIA Developer website.
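To see which toolkit versions are present and where the /usr/local/cuda link currently points, a small Python check can help. This is a minimal sketch that assumes the default install prefix /usr/local and directories named cuda-<version>; the function names are our own and not part of any NVIDIA tool.

```python
import glob
import os


def list_cuda_installs(prefix="/usr/local"):
    """Return the versioned CUDA directories found under `prefix`."""
    # Assumption: toolkits live in directories named like "cuda-<version>".
    return sorted(
        path for path in glob.glob(os.path.join(prefix, "cuda-*"))
        if os.path.isdir(path)
    )


def resolve_cuda_link(link="/usr/local/cuda"):
    """Return the directory the CUDA link points to, or None if it is missing."""
    if not os.path.exists(link):
        return None
    return os.path.realpath(link)


print("Installed toolkits:", list_cuda_installs())
print("/usr/local/cuda ->", resolve_cuda_link())
```

If the link does not resolve to the CUDA 10.0 directory, StyleGAN2 may pick up a newer toolkit than it expects.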
Install TensorFlow 1.15
You need an older version of TensorFlow (v1.15) and Python (v3.6) to run StyleGAN2. I highly recommend making use of a package management system like conda, so that you can operate different Python and TensorFlow versions on the same OS.
Once conda is installed, you can set up a new Python 3.6 environment named "stylegan2" with
conda create -n stylegan2 python==3.6.9
# and activate it
conda activate stylegan2
Install GPU-capable TensorFlow and StyleGAN's dependencies:
pip install scipy==1.3.3 requests==2.22.0 Pillow==6.2.1
pip install tensorflow-gpu==1.15.3
⚠️ IMPORTANT: If you install the CPU-only TensorFlow (the package without the -gpu suffix), StyleGAN2 will not find your GPU, even if the CUDA toolkit and the GPU driver are installed properly.
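A quick way to confirm that the installed TensorFlow actually sees the GPU is to ask it directly. The sketch below relies on tf.test.is_gpu_available(), which exists in TensorFlow 1.x; the wrapper function name is our own, and the messages are only illustrative.

```python
def tensorflow_gpu_status():
    """Describe whether the TensorFlow in this environment can see a GPU."""
    try:
        import tensorflow as tf
    except ImportError:
        return "TensorFlow is not installed in this environment"
    # tf.test.is_gpu_available() is the TensorFlow 1.x way to probe for a GPU.
    if tf.test.is_gpu_available():
        return "TensorFlow %s sees a GPU" % tf.__version__
    return "TensorFlow %s does NOT see a GPU" % tf.__version__


print(tensorflow_gpu_status())
```

Run it inside the activated stylegan2 environment, since another environment may hold a different TensorFlow build.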
Set up StyleGAN2
Download StyleGAN2 from GitHub:
git clone https://github.com/NVlabs/stylegan2.git
Test that NVCC — required for compiling TensorFlow ops — runs properly. NVCC comes with your CUDA installation, so don't install any extra packages!
It resides in /usr/local/cuda/bin, and it is best to add this directory to your PATH:
echo 'export PATH=/usr/local/cuda/bin:$PATH' >>~/.bashrc
Restart the Bash session and run the following in the folder of the cloned repository:
nvcc test_nvcc.cu -o test_nvcc -run
# CORRECT OUTPUT:
# CPU says hello.
# GPU says hello.
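If the test above fails, it helps to check whether nvcc is reachable from PATH at all and which version answers. A minimal sketch using only the Python standard library (the helper names are our own):

```python
import shutil
import subprocess


def find_nvcc():
    """Return the full path of nvcc if it is on PATH, otherwise None."""
    return shutil.which("nvcc")


def nvcc_version(nvcc_path):
    """Return the text printed by `nvcc --version` for the given executable."""
    result = subprocess.run(
        [nvcc_path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,  # decode output as text (works on Python 3.6)
    )
    return result.stdout


nvcc = find_nvcc()
if nvcc is None:
    print("nvcc not found on PATH; is /usr/local/cuda/bin exported?")
else:
    print(nvcc)
    print(nvcc_version(nvcc))
```

The reported release should match the CUDA version StyleGAN2 needs (10.0).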
Use pre-trained networks to generate some synthetic faces:
python run_generator.py generate-images --network=gdrive:networks/stylegan2-ffhq-config-f.pkl \
  --seeds=6600-6625 --truncation-psi=0.5
You'll find the generated images in the script's output subdirectory.
Transform existing images
If you want to transform existing images, you need to prepare them beforehand.
In this tutorial, we use a pre-trained network for portrait photos, which requires images of 1024x1024 pixels (in general, the resolution must be a power of 2). So first convert the portraits you want to manipulate into that format.
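The preparation step can also be sketched in Python with Pillow, which was installed above as a dependency. This is an illustrative sketch, not part of StyleGAN2: the function names are our own, the centered square crop is one possible choice, and converting to RGB simply discards any transparency.

```python
def is_power_of_two(n):
    """True for 1, 2, 4, ..., 1024, ...; False otherwise."""
    return n > 0 and (n & (n - 1)) == 0


def center_crop_box(width, height):
    """Return (left, upper, right, lower) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    upper = (height - side) // 2
    return (left, upper, left + side, upper + side)


def prepare_portrait(src_path, dst_path, size=1024):
    """Crop to a centered square, resize to size x size and save as RGB."""
    from PIL import Image  # imported here so the helpers above need no Pillow

    if not is_power_of_two(size):
        raise ValueError("size must be a power of 2, got %d" % size)
    with Image.open(src_path) as img:
        img = img.convert("RGB")  # drops an alpha channel if there is one
        img = img.crop(center_crop_box(*img.size))
        img = img.resize((size, size), Image.LANCZOS)
        img.save(dst_path)


print(center_crop_box(1920, 1080))  # (420, 0, 1500, 1080)
```

Cropping before resizing avoids distorting faces in photos that are not square.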
- You can use the ImageMagick tool. Note that -resize keeps the aspect ratio, so crop non-square images to a square first:
convert input.jpeg -resize 1024x1024 input-resized.jpeg
- Then generate the TFRecords:
# datasets/images: Source directory of the images in JPEG or PNG.
# datasets/tfrecords: Output directory for the TFRecords.
python dataset_tool.py create_from_images datasets/tfrecords/ datasets/images/
⚠️ IMPORTANT: Images must not contain an alpha channel!
When using PNG format, be careful that the images do not include transparency, which requires an additional alpha channel. StyleGAN2 accepts images with only one color channel (grayscale) or three channels (RGB).
- Projection to latent space.
Use the TFRecords for the projection to latent space.
# --data-dir: Root directory of the datasets.
# --dataset: Subdirectory where the TFRecords are stored.
python run_projector.py project-real-images --network=gdrive:networks/stylegan2-ffhq-config-f.pkl --dataset=tfrecords --data-dir=datasets --num-images 3
--num-images is 3 by default. If you have fewer images in your dataset, you'll get an OutOfRangeError.
The script saves snapshot images to results during the projection process, so you can see how the projection converges to the original.
You can adapt the hyperparameters in the constructor in projector.py, e.g. the number of training steps and the learning rate.
If you want to save the representation in the latent space as well, add the following line to the project_image function:
def project_image(proj, targets, png_prefix, num_snapshots):
    # ...
    # while proj.get_cur_step() < proj.num_steps:
        # ...
        # if proj.get_cur_step() in snapshot_steps:
            # ADD THE LINE BELOW TO PICKLE THE LATENT REPRESENTATION
            misc.save_pkl(proj.get_dlatents(), png_prefix + 'step%04d.pkl' % proj.get_cur_step())
            # ADD THE LINE ABOVE TO PICKLE THE LATENT REPRESENTATION
            misc.save_image_grid(proj.get_images(), png_prefix + 'step%04d.png' % proj.get_cur_step(), drange=[-1,1])
You'll get the representation as a pickled NumPy array, which you can use to modify your original picture.
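Reading the pickled representation back could look like the sketch below. It uses only the standard pickle module; the function names are our own, and the linear blend is just one simple, illustrative way to modify a latent, not a method prescribed by StyleGAN2. NumPy must be importable when loading, because the pickled object is a NumPy array.

```python
import pickle


def load_latent(pkl_path):
    """Load a latent representation that was pickled during projection."""
    with open(pkl_path, "rb") as f:
        return pickle.load(f)


def blend(latent_a, latent_b, t):
    """Linearly interpolate between two latents (t=0 gives a, t=1 gives b)."""
    return (1.0 - t) * latent_a + t * latent_b


# Hypothetical usage; the file names depend on your run and are placeholders:
# w_a = load_latent("first-portrait.pkl")
# w_b = load_latent("second-portrait.pkl")
# w_mix = blend(w_a, w_b, 0.5)
```

The blended array can then be fed to the generator's synthesis network to render the modified picture.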
Machine Learning at Celantur
If you find a bug in this tutorial or are interested in creating state-of-the-art ML models and deploying them in a high-availability and high-scalability cloud environment, drop us a short message and have a chat with us!
Troubleshooting
- Problem: nvcc does not work properly.
Solution: It depends on the config file nvcc.profile and other executables in /usr/local/cuda-xx-x/bin. Thus a symbolic link to nvcc placed in ~/.local/bin won't work. Add the CUDA bin directory to your PATH instead.
- Problem: I have installed the CUDA Toolkit and the NVIDIA driver, and nvidia-smi shows the correct GPU. Why does TensorFlow complain that it cannot find the GPU?
Solution: Did you install the package tensorflow-gpu? The package tensorflow is the CPU-only version.
- Problem: When I try to generate the TFRecords, it complains, "Input images must be stored as RGB or grayscale".
Solution: Remove the alpha channel and transparency, e.g. convert a PNG to JPEG.
- Problem: OutOfRangeError during projection into latent space.
Solution: Explicitly set the parameter --num-images to the number of images in the dataset.
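To avoid counting by hand, the number of source images can be determined with a few lines of Python and passed to --num-images. This is a minimal sketch: the helper names are our own, and it assumes the source directory contains only the JPEG or PNG files that went into the TFRecords.

```python
import os

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def count_images(directory):
    """Count JPEG/PNG files directly inside `directory` (no recursion)."""
    return sum(
        1 for name in os.listdir(directory)
        if name.lower().endswith(IMAGE_EXTENSIONS)
        and os.path.isfile(os.path.join(directory, name))
    )


def projector_command(image_dir, dataset="tfrecords", data_dir="datasets"):
    """Build the projection command with --num-images set to the image count."""
    return (
        "python run_projector.py project-real-images "
        "--network=gdrive:networks/stylegan2-ffhq-config-f.pkl "
        "--dataset=%s --data-dir=%s --num-images %d"
        % (dataset, data_dir, count_images(image_dir))
    )


# Example: print(projector_command("datasets/images"))
```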