Running a model on multiple GPUs? How to do it? Does vLLM allow it easily? #6500
martinenkoEduard
started this conversation in
General
Replies: 1 comment
-
You can pass this flag: --tensor-parallel-size 2
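For context, the same setting is exposed in vLLM's Python API as `tensor_parallel_size`. Below is a minimal sketch of offline inference sharded across two GPUs; the model name is only a placeholder, so substitute whatever model you actually want to run. When launching the OpenAI-compatible server instead, the equivalent is the --tensor-parallel-size 2 command-line flag mentioned above.

```python
# Minimal sketch: shard one model across two GPUs with tensor parallelism.
# The model name is a placeholder; use any model you have access to.
from vllm import LLM, SamplingParams

llm = LLM(
    model="mistralai/Mistral-7B-Instruct-v0.2",  # placeholder model
    tensor_parallel_size=2,                      # split weights across 2 GPUs
)

params = SamplingParams(max_tokens=32)
outputs = llm.generate(["Hello, my name is"], params)
print(outputs[0].outputs[0].text)
```

Tensor parallelism partitions each layer's weights evenly across the GPUs, so every card holds a shard of the model and they communicate on each forward pass; in practice, identical (or at least equal-memory) GPUs are strongly recommended, because the smallest card becomes the limit.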
-
Running a model on multiple GPUs? How do I do it?
Can you show a simple example?
What are the restrictions? Do the GPUs have to be identical,
or is it possible, for instance, to have
one RTX 3070 and one RTX 3080? What about memory sharing?
Do Mistral and Llama support these features?
Do I need to install special drivers or a special PyTorch version?
Also, can you share a rig configuration with multiple GPUs for local LLM deployment?