pypipytorch95% confidence\u2191 63

CUDA initialization: CUDA unknown error - this may be due to an incorrectly set up environment

Full error message

I am trying to install torch with CUDA support.

Here is the result of my collect_env.py script:

PyTorch version: 1.7.1+cu101
Is debug build: False
CUDA used to build PyTorch: 10.1
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.1 LTS (x86_64)
GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Clang version: Could not collect
CMake version: Could not collect

Python version: 3.9 (64-bit runtime)
Is CUDA available: False
CUDA runtime version: 10.1.243
GPU models and configuration: GPU 0: GeForce GTX 1080
Nvidia driver version: 460.39
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.19.2
[pip3] torch==1.7.1+cu101
[pip3] torchaudio==0.7.2
[pip3] torchvision==0.8.2+cu101
[conda] blas                      1.0                         mkl  
[conda] cudatoolkit               10.1.243             h6bb024c_0  
[conda] mkl                       2020.2                      256  
[conda] mkl-service               2.3.0            py39he8ac12f_0  
[conda] mkl_fft                   1.3.0            py39h54f3939_0  
[conda] mkl_random                1.0.2            py39h63df603_0  
[conda] numpy                     1.19.2           py39h89c1606_0  
[conda] numpy-base                1.19.2           py39h2ae0177_0  
[conda] torch                     1.7.1+cu101              pypi_0    pypi
[conda] torchaudio                0.7.2                    pypi_0    pypi
[conda] torchvision               0.8.2+cu101              pypi_0    pypi

Process finished with exit code 0

Here is the output of nvcc - V

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243

Finally, here is the output of nvidia-smi

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.39       Driver Version: 460.39       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 00000000:01:00.0  On |                  N/A |
|  0%   52C    P0    46W / 180W |    624MiB /  8116MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A       873      G   /usr/lib/xorg/Xorg                101MiB |
|    0   N/A  N/A      1407      G   /usr/lib/xorg/Xorg                419MiB |
|    0   N/A  N/A      2029      G   ...AAAAAAAAA= --shared-files       90MiB |
+-----------------------------------------------------------------------------+

However, when I try to run

print(torch.cuda.is_available())

I get the following error:

UserWarning: CUDA initialization: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero. (Triggered internally at  /pytorch/c10/cuda/CUDAFunctions.cpp:100.)
  return torch._C._cuda_getDeviceCount() > 0

I have performed a reboot, and I have followed the post-installation steps as detailed in here

Solutionsource: stackoverflow \u2197

Had the same issue and in my case solution was very easy, however it wasn't easy to find it. I had to remove and insert nvidia_uvm module. So: > sudo rmmod nvidia_uvm > sudo modprobe nvidia_uvm That's all. Just before these command collect_env.py reported "Is CUDA available: False". After: "Is CUDA available: True"

API access

Get this solution programmatically \u2014 free, no authentication.

curl https://depscope.dev/api/error/6a82f1208071e1cef5a23bb76e8f5a1e483f5e2662edd63a042c13a10159944d

hash \u00b7 6a82f1208071e1cef5a23bb76e8f5a1e483f5e2662edd63a042c13a10159944d