
Containerizing Machine Learning (ML) deployables is a common practice in the real world of AI, and it is most often done with Docker. The process starts by defining a Dockerfile, which is used to build a Docker image, and these images are then run as independent containers to deliver specific features. Given the prevalent use of NVIDIA GPUs for training and serving ML models, the NVIDIA Container Toolkit allows developers to build and run containers that leverage the underlying GPU hardware. Often referred to as the ‘NVIDIA runtime’ environment, GPU access for each container can be controlled at run time with options such as the --gpus flag or the -e NVIDIA_VISIBLE_DEVICES environment variable. While these options are supplied when a container is run, there are times when the NVIDIA runtime environment is also required when building images that depend on CUDA libraries or GPU resources. Such build steps may include installing packages that link against CUDA libraries or running tests that load ML models. In my case, it was installing the onnx-tensorrt Python package used to serve an object detection model.
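For reference, exposing GPUs to a container at run time typically looks like the following (a minimal sketch; the image tag and device index are just examples and may need to match what is available on Docker Hub and on your machine):
$ docker run --rm --gpus all nvidia/cuda:11.6.0-base-ubuntu20.04 nvidia-smi
$ docker run --rm --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=0 nvidia/cuda:11.6.0-base-ubuntu20.04 nvidia-smi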
In this article, I wanted to share two ways in which Docker images can be built using the NVIDIA runtime environment:
- Configure your Docker Daemon default-runtime to use nvidia
- Include CUDA library stubs in LD_LIBRARY_PATH
Before getting into the details, the underlying assumption for both solutions is that the NVIDIA Container Toolkit has been correctly installed on your machine.
1. Configure your Docker Daemon default-runtime to use ‘nvidia’
Modifying the default runtime of the host’s Docker daemon can be done with a few changes to the daemon.json file, which is usually located under /etc/docker/ and requires super-user privileges to edit. Within this file, the nvidia runtime configuration is defined, and setting the default-runtime property to nvidia (as shown below) will make subsequent Docker builds use the nvidia runtime by default.
# /etc/docker/daemon.json
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
Once the changes are in place, a restart of the Docker daemon is required for them to take effect; subsequent docker builds will then use the nvidia runtime by default.
$ sudo systemctl restart docker
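To verify the change, docker info reports the configured runtimes; after the restart, nvidia should appear among the listed runtimes and as the default runtime (the exact output format varies between Docker versions):
$ docker info | grep -i runtime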
This information can also be found in the NVIDIA Container Toolkit documentation, which includes a word of caution when using this method: it is important to specify the GPU architecture(s) that the container will target, because failing to do so results in an image optimized for the GPU of the build machine rather than for the machines the application will actually run on. The GPU model of the build machine (for example, a Tesla P40 or a GeForce GT 650M) can be found by using the nvidia-smi command.
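For a quick check of which GPU the build host has, either of the following standard nvidia-smi invocations can be used (output depends on your hardware):
$ nvidia-smi -L
$ nvidia-smi --query-gpu=name --format=csv,noheader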
2. Include CUDA library stubs in LD_LIBRARY_PATH
LD_LIBRARY_PATH is the ‘shared library search path’: a list of directories in which executables look for Linux shared libraries. Appending to this environment variable makes the required libraries visible from within the Dockerfile itself. First, it is highly recommended to use a Docker base image that already has CUDA installed (e.g. nvidia/cuda:11.6.0-devel-ubuntu20.04). Since low-level CUDA Driver API libraries such as libcuda.so are required to access GPU resources, adding their stub versions to LD_LIBRARY_PATH allows programs to use CUDA-related libraries during the Docker build. These stubs are usually found under /usr/local/cuda/lib64/stubs/, and the path can be appended by including the following line in your Dockerfile:
ENV LD_LIBRARY_PATH /usr/local/cuda/lib64/stubs/:$LD_LIBRARY_PATH
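To make the approach concrete, a minimal Dockerfile sketch might look like the following; the base image tag is the one mentioned above, and the final comment marks where your own CUDA-dependent build step would go:
# Minimal sketch: expose the CUDA driver API stubs during the image build
FROM nvidia/cuda:11.6.0-devel-ubuntu20.04
# Make libcuda.so (stub) discoverable by build steps that load CUDA libraries
ENV LD_LIBRARY_PATH /usr/local/cuda/lib64/stubs/:$LD_LIBRARY_PATH
RUN apt-get update && apt-get install -y --no-install-recommends python3-pip && \
    rm -rf /var/lib/apt/lists/*
# ...followed by whatever CUDA-dependent build step you need, for example
# installing a package such as onnx-tensorrt or running model tests.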
References:
- Docker hub for nvidia/cuda
- nvidia-docker issues on building images with nvidia-docker
- LD_LIBRARY_PATH explanation
So, which is better?
Both approaches have their pros and cons, and depending on how much control you have over your CI pipeline, your choice might differ. Although the first option might seem like an easy change, modifying the daemon.json file requires root/admin privileges, and the daemon restart will affect other containers running on that machine. In some cases, developers might not even have access to the host that executes the docker build. Additionally, every other image build triggered on that host will also run with the nvidia runtime as its default, which may lead to unexpected behavior.
On the other hand, adding the CUDA library stubs doesn’t require configuration changes to the Docker daemon, but it does require careful management of the shared libraries: conflicting library files can lead to obscure issues that affect other parts of the application and turn into a debugging nightmare.
If neither approach seems viable, depending on your use case it might be simpler to move the GPU-dependent step into the image’s ENTRYPOINT so that it runs at container start, when the NVIDIA runtime is available, before the command given by CMD or docker run.
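As an illustration of that fallback, a hypothetical sketch could look like this (entrypoint.sh and serve.py are placeholder names, and the deferred step is only indicated by a comment):
# entrypoint.sh (hypothetical)
#!/bin/bash
set -e
# Sanity check: with the NVIDIA runtime, the GPU is visible at container start
nvidia-smi
# Place the GPU/CUDA-dependent step here, e.g. installing or compiling
# a package that needs libcuda.so (onnx-tensorrt in the author's case).
exec "$@"
# Dockerfile (relevant lines)
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
CMD ["python3", "serve.py"]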
