How to Ensure TensorFlow GPU CUDA Compatibility When Setting Up a New Deep Learning Machine

Question

How do I ensure TensorFlow GPU CUDA compatibility on a new deep learning machine? Specifically, how should the NVIDIA driver, CUDA, cuDNN, and TensorFlow versions be synchronized, and how do I verify GPU detection with tf.config.list_physical_devices('GPU')?

Accepted Answer

To ensure TensorFlow GPU CUDA compatibility, match the NVIDIA driver, CUDA toolkit, cuDNN library, and TensorFlow binary versions to a combination from TensorFlow's tested build configurations, then verify detection with tf.config.list_physical_devices('GPU'). The older tf.test.is_gpu_available() still exists in TensorFlow 2.x but is deprecated.

Setting up a fresh workstation for deep learning requires precise alignment between hardware drivers and software libraries to avoid silent CPU fallback or runtime crashes. This guide walks through TensorFlow GPU CUDA compatibility, using the official tensorflow/tensorflow repository and its install documentation as the reference for build configurations, environment variables, and validation procedures.

Verify Your NVIDIA Driver Version

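Before installing any CUDA libraries, confirm that your NVIDIA driver supports your target CUDA version. For example, on Linux, CUDA 12.x requires driver version 525.60.13 or newer, and the CUDA 12.2 installer itself bundles a 535-series driver. Run the following command to inspect the current driver. The "CUDA Version" field in its header is the highest CUDA version that driver supports, not the version of any installed toolkit: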

nvidia-smi

The GPU Dockerfiles under tensorflow/tools/dockerfiles in the TensorFlow repository pin specific CUDA and cuDNN versions for each release, which makes them a useful reference for known-good combinations. Containers use the host's NVIDIA driver rather than installing their own, so the driver check above applies to containerized setups as well.
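The driver check can also be scripted. The following is a minimal sketch, assuming a Linux machine and taking 525.60.13 as the minimum driver for CUDA 12.x; the helper names (parse_version, driver_is_new_enough, query_driver_version) are hypothetical and not part of any NVIDIA or TensorFlow tool.

```python
import shutil
import subprocess

# Assumption: minimum Linux driver for CUDA 12.x minor-version compatibility.
MIN_DRIVER = (525, 60, 13)


def parse_version(text):
    """Turn a dotted version string such as '535.54.03' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_is_new_enough(version_text, minimum=MIN_DRIVER):
    """Return True if the driver version is at least `minimum`."""
    return parse_version(version_text) >= minimum


def query_driver_version():
    """Ask nvidia-smi for the driver version; return None if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    lines = result.stdout.strip().splitlines()
    return lines[0].strip() if result.returncode == 0 and lines else None


if __name__ == "__main__":
    version = query_driver_version()
    if version is None:
        print("nvidia-smi not found or no GPU reported")
    else:
        print(version, "OK" if driver_is_new_enough(version) else "too old")
```

Comparing versions as integer tuples avoids the string-comparison trap where "99.0" would sort after "525.60".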

Install Matching CUDA and cuDNN Libraries

TensorFlow binaries are compiled against specific CUDA and cuDNN versions. Mismatched libraries, such as a CUDA 11.8 install with a TensorFlow wheel built for CUDA 12.2, typically produce library loading warnings at import time (for example, "Cannot dlopen some GPU libraries") followed by a fallback to the CPU. For TensorFlow 2.15, install CUDA 12.2 and cuDNN 8.9.

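The versions a wheel is built against are fixed when the build is configured (for example, through the TF_CUDA_VERSION and TF_CUDNN_VERSION settings read by the configure script). The resulting combinations are published in the "Tested build configurations" table of the official install-from-source guide.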

Install the runtime libraries using the official NVIDIA installers:


# Download and install the CUDA 12.2 toolkit (the installer filename includes the bundled driver version)
wget https://developer.download.nvidia.com/compute/cuda/12.2.0/local_installers/cuda_12.2.0_535.54.03_linux.run
sudo sh cuda_12.2.0_535.54.03_linux.run --silent --toolkit

# Download the cuDNN 8.9 archive for CUDA 12 from the NVIDIA Developer portal, then extract and copy it
tar -xvf cudnn-linux-x86_64-8.9.0.131_cuda12-archive.tar.xz
sudo cp -P cudnn-*-archive/include/cudnn*.h /usr/local/cuda/include/
sudo cp -P cudnn-*-archive/lib/libcudnn* /usr/local/cuda/lib64/
sudo chmod a+r /usr/local/cuda/include/cudnn*.h /usr/local/cuda/lib64/libcudnn*

Configure Environment Variables

Set CUDA_HOME to point to your CUDA installation directory and update LD_LIBRARY_PATH so the dynamic linker can locate libcudart.so and the cuDNN libraries. Set TF_FORCE_GPU_ALLOW_GROWTH=true to stop TensorFlow from reserving nearly all GPU memory at startup. The GPU runtime reads this variable when it initializes, and it has the same effect as calling tf.config.experimental.set_memory_growth(gpu, True) from Python.

Export these variables in your shell profile or activation script:

export CUDA_HOME=/usr/local/cuda-12.2
export LD_LIBRARY_PATH=$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
export TF_FORCE_GPU_ALLOW_GROWTH=true
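To confirm the variables took effect in the shell that will launch TensorFlow, a small check like the one below may help. It is a sketch only: check_cuda_env is a hypothetical helper, and it assumes the lib64 layout used in the exports above.

```python
import os


def check_cuda_env(env):
    """Return a list of human-readable problems found in an environment mapping."""
    problems = []
    cuda_home = env.get("CUDA_HOME", "")
    if not cuda_home:
        problems.append("CUDA_HOME is not set")
    else:
        # Assumption: libraries live in <CUDA_HOME>/lib64, as in the exports above.
        lib_dir = cuda_home.rstrip("/") + "/lib64"
        search_path = [p for p in env.get("LD_LIBRARY_PATH", "").split(":") if p]
        if lib_dir not in search_path:
            problems.append(f"{lib_dir} is missing from LD_LIBRARY_PATH")
    if env.get("TF_FORCE_GPU_ALLOW_GROWTH", "").lower() != "true":
        problems.append("TF_FORCE_GPU_ALLOW_GROWTH is not 'true'")
    return problems


if __name__ == "__main__":
    for line in check_cuda_env(os.environ) or ["environment looks consistent"]:
        print(line)
```

Passing the environment in as a mapping keeps the function easy to test against a dictionary instead of the live process environment.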

Select the Correct TensorFlow Package

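For most users on Linux, the pre-built wheel already includes GPU support and expects CUDA 12.2 and cuDNN 8.9 to be available. Alternatively, the tensorflow[and-cuda] extra pulls matching CUDA and cuDNN libraries from PyPI instead of relying on a system-wide install. To use the system libraries installed above, install the standard wheel: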

pip install "tensorflow==2.15.*"

These wheels are compiled against the specific CUDA/cuDNN versions listed above. If you require TensorFlow for a non-standard CUDA version, build from source: run ./configure to enable CUDA and select versions, then build with Bazel using --config=cuda. The CI Dockerfile at tensorflow/tools/ci_build/builds/dockerfiles/linux_gpu/Dockerfile illustrates the --config=cuda build process used in CI pipelines.
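An installed wheel can report the versions it was built against through tf.sysconfig.get_build_info(). The sketch below compares that mapping with the expected versions; build_matches is a hypothetical helper, and the key names (is_cuda_build, cuda_version, cudnn_version) are assumed to be present, which may not hold for every build.

```python
def build_matches(build_info, cuda="12.2", cudnn="8"):
    """Compare a build-info mapping against expected CUDA/cuDNN version prefixes."""
    if not build_info.get("is_cuda_build", False):
        return False
    cuda_ok = str(build_info.get("cuda_version", "")).startswith(cuda)
    cudnn_ok = str(build_info.get("cudnn_version", "")).startswith(cudnn)
    return cuda_ok and cudnn_ok


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed in this environment")
    else:
        info = tf.sysconfig.get_build_info()
        print(dict(info))
        print("matches expected versions:", build_matches(info))
```

Prefix matching is used because the build info may report only the cuDNN major version rather than a full 8.9.x string.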

Pin CUDA Compute Capabilities for Older GPUs

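If you are building from source to support older GPU architectures (e.g., Compute Capability 5.2 or 6.1), set the TF_CUDA_COMPUTE_CAPABILITIES environment variable before running ./configure. The configure script reads this value and records it in the generated Bazel configuration, where it controls which GPU architectures are compiled into the binary. Note that CUDA 12 dropped Kepler (Compute Capability 3.5 and 3.7), so those GPUs need a CUDA 11.x toolchain and a correspondingly older TensorFlow release. A GPU whose architecture is not covered by the build fails at runtime with errors such as "no kernel image is available for execution on the device".

export TF_CUDA_COMPUTE_CAPABILITIES="5.2,6.1,7.0,8.0"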
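To decide which values belong in that list, you can read the compute capability of each installed GPU from any working TensorFlow install via tf.config.experimental.get_device_details. This is a rough sketch: capability_listed is a hypothetical helper that does a plain exact match and ignores PTX forward compatibility, so treat a negative result as a prompt to investigate rather than a verdict.

```python
def capability_listed(capability, build_list):
    """Return True if a (major, minor) capability appears in a comma-separated list."""
    wanted = f"{capability[0]}.{capability[1]}"
    return wanted in [item.strip() for item in build_list.split(",")]


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed in this environment")
    else:
        for gpu in tf.config.list_physical_devices("GPU"):
            details = tf.config.experimental.get_device_details(gpu)
            # 'compute_capability' is a (major, minor) tuple when available.
            print(details.get("device_name"), details.get("compute_capability"))
```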

Validate the Installation

Verify that TensorFlow can detect and use the GPU by running a Python check. Use tf.config.list_physical_devices('GPU') to enumerate the devices TensorFlow can access, and tf.test.is_built_with_cuda() to confirm the installed binary was compiled with CUDA support. The older tf.test.is_gpu_available() is deprecated in favor of tf.config.list_physical_devices('GPU').

import tensorflow as tf

print("TensorFlow version:", tf.__version__)
print("Built with CUDA:", tf.test.is_built_with_cuda())

# Detailed device enumeration; an empty list means no GPU was detected
gpus = tf.config.list_physical_devices('GPU')
print("GPU available:", bool(gpus))
for gpu in gpus:
    print("Detected GPU:", gpu)
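Detection alone does not prove that kernels actually run on the device. As a further check, the sketch below places a small matrix multiplication on the first GPU and reports where the result lives; gpu_smoke_test and its status strings are hypothetical names chosen for illustration.

```python
def gpu_smoke_test():
    """Run a small matmul on the first GPU and report where it executed."""
    try:
        import tensorflow as tf
    except ImportError:
        return "tensorflow-not-installed"
    if not tf.config.list_physical_devices("GPU"):
        return "no-gpu-visible"
    with tf.device("/GPU:0"):
        a = tf.random.uniform((256, 256))
        product = tf.matmul(a, a)
    # For eager tensors, .device names the device holding the result.
    return product.device


if __name__ == "__main__":
    print(gpu_smoke_test())
```

On a working setup the returned device string should name a GPU; either of the two status strings points back to the installation or environment steps.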

Lock Your Environment for Reproducibility

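Record the exact driver, CUDA, cuDNN, and TensorFlow versions so collaborators can rebuild an identical environment. A requirements.txt can pin TensorFlow and other Python packages, and a Dockerfile can additionally pin the CUDA and cuDNN versions through its base image. The driver version lives on the host, so document it separately. The GPU Dockerfiles under tensorflow/tools/dockerfiles are a useful reference for compatible version combinations and installation sequences.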
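One lightweight way to record the host-side versions alongside requirements.txt is a small JSON manifest. This is only a sketch: make_manifest and its field names are hypothetical, and the version strings in the usage line are example values, not measurements.

```python
import json


def make_manifest(driver, cuda, cudnn, tensorflow):
    """Serialize the four pinned versions as stable, diff-friendly JSON."""
    versions = {
        "nvidia_driver": driver,
        "cuda": cuda,
        "cudnn": cudnn,
        "tensorflow": tensorflow,
    }
    return json.dumps(versions, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Example values only; substitute what your own machine reports.
    print(make_manifest("535.54.03", "12.2", "8.9", "2.15.0"))
```

Sorted keys keep the file stable under version control, so a diff shows only the versions that actually changed.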

Summary

Verify the driver: Run nvidia-smi to confirm driver version 525.60.13 or newer for CUDA 12.x compatibility on Linux.
Align CUDA and cuDNN: Install CUDA 12.2 and cuDNN 8.9 for TensorFlow 2.15, matching the tested build configurations in the official install guide.
Set environment paths: Configure CUDA_HOME, LD_LIBRARY_PATH, and TF_FORCE_GPU_ALLOW_GROWTH to enable runtime library discovery and on-demand GPU memory allocation.
Install matching binaries: Use pip install "tensorflow==2.15.*" for standard CUDA 12.x support, or build from source with --config=cuda for custom versions, using tensorflow/tools/ci_build/builds/dockerfiles/linux_gpu/Dockerfile as a reference.
Pin compute capabilities: Set TF_CUDA_COMPUTE_CAPABILITIES before running ./configure when compiling for older GPU architectures, keeping in mind that CUDA 12 no longer supports Kepler.
Validate with code: Run tf.config.list_physical_devices('GPU') to confirm GPU detection.

Frequently Asked Questions

How do I check if TensorFlow is using my GPU?

Run tf.config.list_physical_devices('GPU'), which returns the list of GPU devices TensorFlow can access. An empty list means no GPU was detected. In that case, confirm with tf.test.is_built_with_cuda() that the installed binary has CUDA support, then check your LD_LIBRARY_PATH and CUDA installation. The older tf.test.is_gpu_available(cuda_only=True) gives a True/False answer but is deprecated.

What CUDA version does TensorFlow 2.15 require?

TensorFlow 2.15 requires CUDA 12.2 and cuDNN 8.9. The pre-built wheels are compiled against these versions, as listed in the tested build configurations of the official install guide. Using CUDA 11.8 or cuDNN 8.6 will result in library loading errors.

Can I build TensorFlow for an older GPU architecture?

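Yes. Before compiling from source, set the TF_CUDA_COMPUTE_CAPABILITIES environment variable to include your GPU's compute capability (e.g., "5.2,6.1"). The configure script reads this value and ensures TensorFlow includes kernels for older architectures such as Maxwell or Pascal. Kepler GPUs (Compute Capability 3.5 and 3.7) are not supported by CUDA 12, so they require a CUDA 11.x toolchain and an older TensorFlow release. Without a matching entry, TensorFlow fails on older hardware with errors such as "no kernel image is available for execution on the device".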

Why does TensorFlow fail to load the GPU library despite correct installation?

This typically occurs when LD_LIBRARY_PATH does not include the CUDA and cuDNN library directories, or when the NVIDIA driver is older than the CUDA toolkit requires. Ensure CUDA_HOME points to the correct installation path (e.g., /usr/local/cuda-12.2) and that nvidia-smi reports a driver version compatible with your CUDA version (525.60.13 or newer for CUDA 12.x on Linux).
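To narrow down which library the dynamic linker cannot find, you can try loading each one directly. The sketch below assumes the Linux sonames for CUDA 12.x and cuDNN 8.x (libcudart.so.12, libcublas.so.12, libcudnn.so.8); loadable and check_libraries are hypothetical helper names.

```python
import ctypes

# Assumption: sonames for CUDA 12.x and cuDNN 8.x on Linux.
DEFAULT_LIBRARIES = ("libcudart.so.12", "libcublas.so.12", "libcudnn.so.8")


def loadable(name):
    """Return True if the dynamic linker can open the named shared library."""
    try:
        ctypes.CDLL(name)
    except OSError:
        return False
    return True


def check_libraries(names=DEFAULT_LIBRARIES):
    """Map each library name to whether it could be loaded."""
    return {name: loadable(name) for name in names}


if __name__ == "__main__":
    for name, ok in check_libraries().items():
        print(f"{name}: {'found' if ok else 'NOT FOUND'}")
```

Run it from the same shell that launches TensorFlow, since LD_LIBRARY_PATH is read when the Python process starts.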