CUDA GPU Containers on Windows with WSL2

By Staff

Version 2 of the Microsoft Windows Subsystem for Linux (WSL2) now supports GPU computing with CUDA. With the --nvccli functionality introduced in Singularity 3.9, you can now run GPU containers with SingularityCE or SingularityPRO on a Windows system that has a supported NVIDIA GPU.

Let’s run through the steps needed to train a TensorFlow model using CUDA under WSL2. This guide was prepared on a Lenovo laptop running Windows 11, with an RTX 3050 GPU. Windows 10 is also supported.

Install WSL2

WSL2 provides a Linux environment on Windows, leveraging Hyper-V and a full Linux kernel. Because WSL2 uses a true Linux kernel, rather than the syscall translation of the original WSL, container platforms like Singularity are supported.

Follow the WSL2 installation instructions to enable WSL2 with the default Ubuntu 20.04 environment. On Windows 11 and the most recent builds of Windows 10 this is as easy as opening an administrator command prompt or PowerShell window and entering:

wsl --install

Follow the prompts. A restart is required, and when you open the ‘Ubuntu’ app for the first time you’ll be asked to set a username and password for the Linux environment.

If you are using an older build of Windows 10 more steps are required, as set out in the link above.
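
Once installation has finished, you can confirm from a command prompt or PowerShell window that your distribution is running under WSL version 2; the ‘Ubuntu’ entry should show VERSION 2 in the output (the distribution name may differ on your system):

wsl -l -v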

Install Latest NVIDIA Drivers

Make sure that the NVIDIA drivers installed on your Windows system are up-to-date. You can download the latest drivers from the NVIDIA driver downloads page.

If you updated your drivers, restart your computer before you continue.
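
After the restart, an easy sanity check is to run nvidia-smi inside the WSL2 ‘Ubuntu’ session. Recent Windows drivers make it available to Linux, and it should report your GPU and driver version:

# Should print a table with the GPU model, driver version, and CUDA version
nvidia-smi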

Install libnvidia-container-tools

Open up your ‘Ubuntu’ app to drop into a WSL2 session. First let’s install the NVIDIA utilities that Singularity’s --nvccli functionality uses to set up the GPU with WSL2.

Instructions to set up the package repository are given on the libnvidia-container website for a range of distributions. For the WSL2 default of Ubuntu 20.04, you can just use the commands below:

# Fetch and add the signing key
curl -s -L https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo apt-key add -

# Fetch the repository file
curl -s -L https://nvidia.github.io/libnvidia-container/ubuntu20.04/libnvidia-container.list | \
  sudo tee /etc/apt/sources.list.d/libnvidia-container.list

# Get the metadata from the new repositories
sudo apt update

# Install the package we need
sudo apt install libnvidia-container-tools
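
Optionally, verify that the tools can see your GPU by asking nvidia-container-cli for device information (the exact output will depend on your driver and GPU):

# Report the driver and NVIDIA devices detected by libnvidia-container
nvidia-container-cli info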

Install SingularityCE/PRO

You can install SingularityCE from source, or from the packages at the GitHub releases page. Note that SingularityCE 3.9.7 is required to support the latest versions of libnvidia-container-tools.

To quickly install the 3.9.7 package:

wget https://github.com/sylabs/singularity/releases/download/v3.9.7/singularity-ce_3.9.7-focal_amd64.deb
sudo apt install ./singularity-ce_3.9.7-focal_amd64.deb

If you are a SingularityPRO subscriber, install the latest version of SingularityPRO 3.9 from the Sylabs package repositories, as detailed in the provided documentation.
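
Whichever route you take, a quick check that the installation worked:

# Should report the version you just installed
singularity --version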

Configure Singularity

Because the --nvccli functionality is a new, and still experimental, feature, the pre-built packages are not set up to use it by default. Open the file /etc/singularity/singularity.conf in an editor (e.g. nano /etc/singularity/singularity.conf) and find the line that begins nvidia-container-cli path =. Change it to read:

nvidia-container-cli path = /usr/bin/nvidia-container-cli

If you compile SingularityCE from source and nvidia-container-cli is installed and on your PATH, it will have been found automatically and noted in the ./mconfig output. You do not need to edit singularity.conf in this case.
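
If you prefer to make the change non-interactively, a sed one-liner along these lines should work, followed by a grep to confirm the new setting (adjust the path if nvidia-container-cli is installed elsewhere):

# Rewrite the setting in place, then confirm it
sudo sed -i 's|^nvidia-container-cli path =.*|nvidia-container-cli path = /usr/bin/nvidia-container-cli|' /etc/singularity/singularity.conf
grep 'nvidia-container-cli path' /etc/singularity/singularity.conf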

Download a TensorFlow Container

Let’s fetch the latest GPU-enabled version of TensorFlow from Docker Hub, using singularity pull, so that we have a SIF file in our current directory. This can take a minute or two on a slower internet connection, as the container image is very large.

$ singularity pull docker://tensorflow/tensorflow:latest-gpu
INFO:    Converting OCI blobs to SIF format
INFO:    Starting build...
Getting image source signatures
Copying blob 3196a0117ed3 done
...
INFO:    Creating SIF file...

Use TensorFlow with a GPU

First, let’s check that TensorFlow can see our GPU. Start a TensorFlow Python session by running the container with the --nv --nvccli flags to use the new GPU setup method:

$ singularity run --nv --nvccli tensorflow_latest-gpu.sif
INFO:    Setting 'NVIDIA_VISIBLE_DEVICES=all' to emulate legacy GPU binding.
INFO:    Setting --writable-tmpfs (required by nvidia-container-cli)

________                               _______________
___  __/__________________________________  ____/__  /________      __
__  /  _  _ \_  __ \_  ___/  __ \_  ___/_  /_   __  /_  __ \_ | /| / /
_  /   /  __/  / / /(__  )/ /_/ /  /   _  __/   _  / / /_/ /_ |/ |/ /
/_/    \___//_/ /_//____/ \____//_/    /_/      /_/  \____/____/|__/


You are running this container as user with ID 1000 and group 1000,
which should map to the ID and group for your user on the Docker host. Great!

Singularity> python
Python 3.8.10 (default, Nov 26 2021, 20:14:08)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

Import the tensorflow package, and perform a device query:

>>> import tensorflow as tf
>>> tf.config.list_physical_devices('GPU')
2022-03-25 11:42:25.672088: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:922] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-03-25 11:42:25.713295: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:922] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-03-25 11:42:25.713892: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:922] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

You can safely ignore the warnings about NUMA support. This is a side effect of the WSL2 session, and won’t harm our ability to use TensorFlow.

Now we’ll create and train a model, based on the TensorFlow beginner quickstart.

Load some data, and create a keras model:

>>> mnist = tf.keras.datasets.mnist
>>> (x_train, y_train), (x_test, y_test) = mnist.load_data()
>>> x_train, x_test = x_train / 255.0, x_test / 255.0
>>> model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10)
])

The model = line should output information about the GPU, and on the test system it shows an RTX 3050 Ti:

2022-03-25 11:48:17.283284: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 1619 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3050 Ti Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.6

Define the loss function and compile the model:

>>> loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
>>> model.compile(optimizer='adam',
              loss=loss_fn,
              metrics=['accuracy'])

Now we can run the training task:

>>> model.fit(x_train, y_train, epochs=5)
Epoch 1/5
2022-03-25 11:52:10.345591: I tensorflow/stream_executor/cuda/cuda_blas.cc:1786] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
1875/1875 [==============================] - 5s 2ms/step - loss: 2.4709 - accuracy: 0.7418
Epoch 2/5
1843/1875 [============================>.] - ETA: 0s - loss: 0.5877 - accuracy: 0.8408

We can see above that the training is successfully using our GPU via the cuda_blas library.

Summary

That’s all there is to it! Singularity containers for CUDA applications can now be developed, tested, and used on a Windows laptop or desktop. All of the standard Singularity features work well under WSL2, so it’s a really powerful development environment. When you need to run on more powerful GPU nodes, just take your SIF file to your HPC environment.

If you don’t want to have to remember to use the --nv and --nvccli flags for each GPU container you run in WSL2, you can set always use nv = yes and use nvidia-container-cli = yes in your singularity.conf file.
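
For example, the two relevant lines in /etc/singularity/singularity.conf would then read:

always use nv = yes
use nvidia-container-cli = yes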

In future versions of SingularityCE and SingularityPRO we’ll be aiming to make the --nvccli method of GPU setup the default, simplifying this process further.

Let us know via the Singularity community spaces if you have questions, comments, or hit any trouble.
