Loading over multiple gpus in 8bit and 4bit with transformers loader #5

RandomInternetPreson · 2024-03-28T02:36:18Z

I can load the instruct model with the transformers loader in 8bit via bitsandbytes, and it loads evenly across multiple GPUs.

However, I cannot get the model to load in 4bit precision across multiple GPUs. It fills one 24GB GPU and then starts loading onto a second GPU of the same size, but it never moves on to any of the remaining GPUs (7 in total). It OOMs on the second GPU while the others sit empty.

I've loaded other transformers-based models in 4bit and have never experienced such heavily unbalanced loading before.
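
For reference, one commonly tried way to steer placement is to pass an explicit per-device `max_memory` map together with `device_map="auto"`. The sketch below only builds that map. The helper name `build_max_memory`, the 20GiB per-GPU cap, and the 64GiB CPU budget are illustrative assumptions, and nothing in this thread confirms that this resolves the 4bit imbalance.

```python
def build_max_memory(num_gpus, per_gpu="20GiB", cpu="64GiB"):
    """Build a max_memory dict: integer GPU indices plus "cpu", mapped to size strings.

    The default sizes are placeholders, not tuned values.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    # Give every GPU the same cap so no single device is allowed to fill up first.
    max_memory = {gpu: per_gpu for gpu in range(num_gpus)}
    max_memory["cpu"] = cpu
    return max_memory


# 7 GPUs as in the report above, each capped below its 24GB capacity
max_memory = build_max_memory(7)
```

The resulting dict would be passed as `from_pretrained(..., device_map="auto", max_memory=max_memory)`.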

jzwilliams07 · 2024-03-28T06:55:53Z

How do you load the model in 8bit?

lhl · 2024-03-28T11:16:00Z

I have the same issue: I get an OOM and only a single GPU is used if I try to use bitsandbytes (load_in_8bit or load_in_4bit)...

huhuhu5798 · 2024-04-10T08:05:34Z

How do you load the model in 8bit or 4bit?

RandomInternetPreson · 2024-04-10T12:34:10Z

Use the bitsandbytes library, and have a lot of RAM.
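
A minimal sketch of what that looks like with the transformers loader, assuming `transformers`, `accelerate`, and `bitsandbytes` are installed. The function names `quantization_kwargs` and `load_quantized` are made up for illustration, and the model id is left as a parameter because the thread does not name one.

```python
def quantization_kwargs(bits):
    """Return the BitsAndBytesConfig keyword arguments for 8bit or 4bit loading."""
    if bits == 8:
        return {"load_in_8bit": True}
    if bits == 4:
        return {"load_in_4bit": True}
    raise ValueError("bits must be 4 or 8")


def load_quantized(model_id, bits=8):
    """Load a causal LM quantized with bitsandbytes (hypothetical helper)."""
    # Imported here so quantization_kwargs can be used without these libraries installed.
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(**quantization_kwargs(bits))
    return AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",  # let accelerate place layers across the visible GPUs
    )
```

Calling `load_quantized(model_id, bits=4)` requests 4bit loading; whether the layers end up spread evenly is exactly the open question in this issue.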
