Docker run fails: docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]] #93

Open
hapasa opened this issue Dec 2, 2023 · 3 comments


hapasa commented Dec 2, 2023

Background: I am trying to test whether my machine is stable with any NVIDIA graphics card before upgrading to a 4060 Ti 16 GB.
My old NVIDIA 1050 Ti "kept falling off the bus" according to dmesg; I am now testing with an even older card.

System: Ubuntu 22.04, kernel 5.15.0-89-generic.
docker run fails with:
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]]

The build went seemingly fine:

$ docker build -t gpu_burn .
DEPRECATED: The legacy builder is deprecated and will be removed in a future release.
            Install the buildx component to build images with BuildKit:
            https://docs.docker.com/go/buildx/

Sending build context to Docker daemon    190kB
Step 1/11 : ARG CUDA_VERSION=11.8.0
Step 2/11 : ARG IMAGE_DISTRO=ubi8
Step 3/11 : FROM nvidia/cuda:${CUDA_VERSION}-devel-${IMAGE_DISTRO} AS builder
11.8.0-devel-ubi8: Pulling from nvidia/cuda
94343313ec15: Pull complete 
9fb272588c1d: Pull complete 
b9797304348b: Pull complete 
5e33c7d9d941: Pull complete 
2e545d869d81: Pull complete 
3b6f4fdd4835: Pull complete 
186b2cf099be: Pull complete 
bb9948097bcc: Pull complete 
665cacaea78b: Pull complete 
a8b41fa5efb1: Pull complete 
Digest: sha256:07f78c377ad928da58a9da192a4ca978c4050b53c66f6df9461d20cba80db990
Status: Downloaded newer image for nvidia/cuda:11.8.0-devel-ubi8
 ---> 6d4df348e537
Step 4/11 : WORKDIR /build
 ---> Running in c01d88a45cf5
Removing intermediate container c01d88a45cf5
 ---> b3823e8592df
Step 5/11 : COPY . /build/
 ---> 12ed4440781c
Step 6/11 : RUN make
 ---> Running in 98b35750f935
g++  -O3 -Wno-unused-result -I/usr/local/cuda/include -std=c++11 -c gpu_burn-drv.cpp
PATH="/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin::." /usr/local/cuda/bin/nvcc  -I/usr/local/cuda/include -arch=compute_50 -ptx compare.cu -o compare.ptx
g++ -o gpu_burn gpu_burn-drv.o -O3  -lcuda -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib64/stubs -L/usr/local/cuda/lib -L/usr/local/cuda/lib/stubs -Wl,-rpath=/usr/local/cuda/lib64 -Wl,-rpath=/usr/local/cuda/lib -lcublas -lcudart
Removing intermediate container 98b35750f935
 ---> 0199671c5b9d
Step 7/11 : FROM nvidia/cuda:${CUDA_VERSION}-runtime-${IMAGE_DISTRO}
11.8.0-runtime-ubi8: Pulling from nvidia/cuda
94343313ec15: Already exists 
9fb272588c1d: Already exists 
b9797304348b: Already exists 
5e33c7d9d941: Already exists 
2e545d869d81: Already exists 
3b6f4fdd4835: Already exists 
186b2cf099be: Already exists 
bb9948097bcc: Already exists 
665cacaea78b: Already exists 
Digest: sha256:b3a3629fd70a0af16e895a832a85d3c54b62d367d2d9a695a0e9b34a74627183
Status: Downloaded newer image for nvidia/cuda:11.8.0-runtime-ubi8
 ---> f2b81eaaed01
Step 8/11 : COPY --from=builder /build/gpu_burn /app/
 ---> 5ded3ba95d1e
Step 9/11 : COPY --from=builder /build/compare.ptx /app/
 ---> d43d5fe967ce
Step 10/11 : WORKDIR /app
 ---> Running in 790d35a87e2b
Removing intermediate container 790d35a87e2b
 ---> 55179c022189
Step 11/11 : CMD ["./gpu_burn", "60"]
 ---> Running in cac5f5969349
Removing intermediate container cac5f5969349
 ---> 0b882c79e890
Successfully built 0b882c79e890
Successfully tagged gpu_burn:latest

$ nvidia-smi
Sat Dec  2 12:50:42 2023       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.08              Driver Version: 545.23.08    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce GTX 750 Ti      On  | 00000000:09:00.0  On |                  N/A |
| 33%   32C    P8               1W /  46W |    451MiB /  2048MiB |      1%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1509      G   /usr/lib/xorg/Xorg                          205MiB |
|    0   N/A  N/A      2564      G   /usr/bin/kwin_x11                            48MiB |
|    0   N/A  N/A      2628      G   /usr/bin/plasmashell                         74MiB |
|    0   N/A  N/A      3017      G   /usr/bin/plasma-discover                     22MiB |
|    0   N/A  N/A     27992      G   ...0424349,12935558332982127916,262144       90MiB |
+---------------------------------------------------------------------------------------+


hapasa commented Dec 2, 2023

Note that I was able to pull the repo and build gpu_burn successfully, and it is now running fine.
So the problem may just be something related to Docker and permissions?

@yankee14

I have the same issue


tt2468 commented Mar 29, 2024

It looks like the README is missing some prerequisites for what Docker needs in order to run with GPUs: https://stackoverflow.com/a/58432877
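
For reference, here is a minimal sketch of the kind of check and setup that usually applies to this error. It assumes the missing prerequisite is the NVIDIA Container Toolkit and that NVIDIA's apt repository is already configured on the Ubuntu host; neither is confirmed in this thread. The `have` and `suggest_fix` helpers are hypothetical names, and the script only prints suggested commands rather than changing the system.

```shell
#!/bin/sh
# Sketch: check for the tools Docker needs for --gpus and print next steps.
# Assumption: the missing prerequisite is the NVIDIA Container Toolkit.

# have NAME: succeed if NAME is an executable on PATH (hypothetical helper).
have() {
    command -v "$1" >/dev/null 2>&1
}

# suggest_fix: print the commands to try; nothing here modifies the system.
# Assumes NVIDIA's apt repository has already been added on the host.
suggest_fix() {
    echo "sudo apt-get install -y nvidia-container-toolkit"
    echo "sudo nvidia-ctk runtime configure --runtime=docker"
    echo "sudo systemctl restart docker"
    echo "docker run --rm --gpus all gpu_burn"
}

if have docker && have nvidia-ctk; then
    echo "docker and nvidia-ctk found; try: docker run --rm --gpus all gpu_burn"
else
    echo "A prerequisite seems to be missing; on Ubuntu, something like:"
    suggest_fix
fi
```

If both tools are already present and the error persists, the Docker daemon may not have been restarted since the runtime was configured.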
