MTL platform with ARC 770 cannot allocate memory block with size larger than 4GB when running vLLM Qwen2-VL-2B #12136
Comments
vLLM 0.5.4 does not support the Qwen2-VL model yet. We will support it in the upcoming 0.6.1 version.
Thank you! But I need to double-confirm: I use IPEX to run Qwen2-VL-2B, not OpenVINO. vLLM 0.5.4 does not support that, right?
Yes, even the official version of vLLM 0.5.4 does not support it until 0.6.1.
Thanks again.
It is recommended to run the Llama, Qwen, and ChatGLM models.
Hi,
We are validating the 0.6.2 version with the Qwen2-VL model, and will notify you once it's ready. Thanks.
When I run a vLLM model such as Qwen2-VL-2B with an ARC 770 on the MTL platform, it reports the error message below:
RuntimeError: Current platform can NOT allocate memory block with size larger than 4GB! Tried to allocate 6.10 GiB (GPU 0; 15.11 GiB total capacity; 4.84 GiB already allocated; 5.41 GiB reserved in total by PyTorch)
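The error reflects a per-allocation cap on this platform: the runtime refuses any single memory block over 4 GiB, so the 6.10 GiB request fails even though the GPU has 15.11 GiB total. A minimal sketch of that arithmetic, plus a chunked-allocation idea for staying under the cap; the 4 GiB constant and the helper functions here are illustrative assumptions, not vLLM or IPEX APIs:

```python
# Sketch of the per-allocation limit from the traceback. MAX_BLOCK and the
# helpers are illustrative assumptions, not part of vLLM or IPEX.

GIB = 1024 ** 3
MAX_BLOCK = 4 * GIB  # assumed per-allocation cap reported by the runtime


def fits_in_one_block(nbytes: int) -> bool:
    """True if a single allocation of nbytes stays under the cap."""
    return nbytes <= MAX_BLOCK


def chunk_sizes(nbytes: int, chunk: int = MAX_BLOCK) -> list[int]:
    """Split a request into pieces no larger than `chunk` bytes."""
    full, rem = divmod(nbytes, chunk)
    return [chunk] * full + ([rem] if rem else [])


request = int(6.10 * GIB)  # the 6.10 GiB block from the error message

print(fits_in_one_block(request))  # False: 6.10 GiB exceeds the 4 GiB cap
print([round(s / GIB, 2) for s in chunk_sizes(request)])
```

In practice, the same idea is why smaller engine settings (e.g. a shorter `max-model-len` or lower GPU memory utilization) can sidestep the error: they shrink the largest single buffer vLLM tries to allocate.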