vLLM failing to recognize GPU from latest official docker image #5661
Replies: 3 comments
-
I was in the same situation; I got distracted with other things for the past 4-6 months and am looping back now. The latest image is also giving me the same error. Did you ever resolve this?
-
I've actually tried going back to older images (5.x, 4.x, 3.x, 2.x) and get the same crash. `nvidia-smi` on the host reports the GPU correctly (output truncated here):

```
C:\AI-Content\vllm-docker>nvidia-smi
+-----------------------------------------------------------------------------------------+
```
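In case it helps others debugging the same thing, a quick sanity check (not from this thread; the CUDA image tag is only an example and may need adjusting) is to run `nvidia-smi` inside a throwaway container, to confirm the GPU is reachable from Docker at all rather than only on the host:

```sh
# Verify Docker can pass the GPU through to a container at all.
# If this fails, the problem is the Docker/driver setup, not vLLM.
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
```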
-
FYI, I was able to fix this. It turns out my Docker Desktop installation itself was out of date, and installing the latest update resolved the error. It was likely caused by a video driver that was too new for the older Docker Desktop version.
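For anyone who lands here later, a rough way to compare the pieces that have to agree (Docker Desktop, WSL, and the NVIDIA driver) before and after updating; this assumes Docker Desktop on Windows with the WSL 2 backend, and `wsl --version` only exists on newer WSL releases:

```sh
:: Check the versions that need to be in sync (run from a Windows prompt).
docker version
wsl --version
nvidia-smi --query-gpu=driver_version --format=csv,noheader
```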
-
Hey folks,
I figured I'd post this here before opening a ticket to see if anyone has encountered this and it's just a configuration issue on my part. I had been running vLLM's latest officially published Docker image within my project until a couple of months ago. I got busy with other things and recently came back to work on the same project. When I run the latest image now, it throws the following error:
```
2024-06-18 21:24:16     return torch._C._cuda_getDeviceCount() > 0
2024-06-18 21:24:16 Traceback (most recent call last):
2024-06-18 21:24:16   File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
2024-06-18 21:24:16     return _run_code(code, main_globals, None,
2024-06-18 21:24:16   File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
2024-06-18 21:24:16     exec(code, run_globals)
2024-06-18 21:24:16   File "/workspace/vllm/entrypoints/openai/api_server.py", line 236, in <module>
2024-06-18 21:24:16     engine = AsyncLLMEngine.from_engine_args(engine_args)
2024-06-18 21:24:16   File "/workspace/vllm/engine/async_llm_engine.py", line 622, in from_engine_args
2024-06-18 21:24:16     engine_configs = engine_args.create_engine_configs()
2024-06-18 21:24:16   File "/workspace/vllm/engine/arg_utils.py", line 286, in create_engine_configs
2024-06-18 21:24:16     device_config = DeviceConfig(self.device)
2024-06-18 21:24:16   File "/workspace/vllm/config.py", line 496, in __init__
2024-06-18 21:24:16     raise RuntimeError("No supported device detected.")
2024-06-18 21:24:16 RuntimeError: No supported device detected.
```
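Since the failure ultimately comes from `torch._C._cuda_getDeviceCount()`, one diagnostic sketch (my own suggestion, assuming the `vllm/vllm-openai` image has `python3` on its PATH and accepts an overridden entrypoint) is to run a one-off PyTorch CUDA probe inside the same image:

```sh
# Run a one-off torch CUDA probe inside the image vLLM ships.
# "True 1" (or more devices) means the container can see the GPU;
# "False 0" means the GPU never reaches the container, so the
# problem is outside vLLM itself.
docker run --rm --gpus all --entrypoint python3 vllm/vllm-openai:latest \
  -c "import torch; print(torch.cuda.is_available(), torch.cuda.device_count())"
```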
Not sure if something has changed in the base image or a dependent library but I couldn't find anything recent related to this error in the docs, issues, or discussions. Any insight would be very much appreciated.
Docker compose:
```yaml
vllm:
  image: vllm/vllm-openai:latest
  command: ["--model", "TheBloke/Mistral-7B-Instruct-v0.2-code-ft-GPTQ", "--quantization", "gptq", "--dtype", "float16", "--revision", "gptq-4bit-32g-actorder_True", "--tokenizer", "TheBloke/Mistral-7B-Instruct-v0.2-code-ft-GPTQ", "--tokenizer-revision", "gptq-4bit-32g-actorder_True", "--max-num-batched-tokens", "6000", "--max-model-len", "6000"]
  restart: always
  shm_size: '32gb'
  ports:
    - "9002:8000"
  networks:
    - portfoliotracker_network
  deploy:
    resources:
      reservations:
        devices:
          - driver: nvidia
            count: all
            capabilities: [gpu]
```
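If anyone wants to reproduce this, one way to check that the `deploy.resources.reservations.devices` block above actually hands the GPU to the container (a sketch, assuming the service is named `vllm` as in the file above) is a one-off run with the entrypoint overridden:

```sh
# Start the vllm service's container once, but run nvidia-smi instead of the API server.
# A populated GPU table means the Compose GPU reservation works and the issue is elsewhere.
docker compose run --rm --entrypoint nvidia-smi vllm
```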