This discussion was converted from issue #825 on May 08, 2024 18:29.
Hello,
The original SigLIP paper reports that, on TPUs, the base SigLIP model can fit roughly a 2x larger batch size than CLIP.
In my experiment, however, I used the same batch size of 14,400 on 48 A100-40GB GPUs for both models, with SigLIP and CLIP both using the standard base-sized architecture. During training, SigLIP uses 33.5 GB per GPU and CLIP uses 37.0 GB. The numbers are close, and I cannot scale the batch size up 2x as the paper suggests.
I am not using FSDP or DeepSpeed; could that be the reason? Or does the GPU type matter a lot? I have no idea.
Can anyone who has trained a SigLIP model share their experience?
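For reference, the two losses I am comparing look roughly like the sketch below. This is not my exact training code, just a minimal illustration; the function names and the temperature/bias values are placeholders (the paper uses learnable t and b). My understanding is that a naive sigmoid loss like this still materializes the full B x B logits matrix per device, so maybe the 2x claim relies on the chunked, per-device loss computation from the paper rather than this form, but I am not sure.

```python
import torch
import torch.nn.functional as F

def clip_softmax_loss(img_emb, txt_emb, temperature=0.07):
    # img_emb, txt_emb: (B, D), assumed already L2-normalized.
    # Materializes the full B x B similarity matrix.
    logits = img_emb @ txt_emb.t() / temperature
    labels = torch.arange(logits.size(0), device=logits.device)
    return (F.cross_entropy(logits, labels) +
            F.cross_entropy(logits.t(), labels)) / 2

def siglip_sigmoid_loss(img_emb, txt_emb, t=10.0, b=-10.0):
    # Pairwise sigmoid loss in its naive form: it also builds the
    # full B x B logits matrix, which (I assume) is why per-GPU
    # activation memory ends up close to the softmax loss above.
    logits = img_emb @ txt_emb.t() * t + b
    # +1 on the diagonal (matching pairs), -1 everywhere else.
    labels = 2 * torch.eye(logits.size(0), device=logits.device) - 1
    return -F.logsigmoid(labels * logits).sum() / logits.size(0)
```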
Thanks!