
Do you use LLaVA AnyRes? #27

Open
kimwongyuda opened this issue Dec 27, 2024 · 4 comments

Comments

@kimwongyuda

Thank you for your nice work.

When I run the code, only a per-device batch size of 1 fits, even though I use LLaVA with Mistral and LoRA (GradCache not used). I suspect the large number of image tokens produced by LLaVA's AnyRes is consuming too much GPU memory.

I didn't modify any of your code. How can I increase the batch size? And did you use AnyRes as described above?

Thank you.

@XMHZZ2018
Contributor

@kimwongyuda

Thank you for your interest in our work! You are correct that image tokens can consume significant GPU memory, limiting the per-device batch size to around 2 to 4 on devices like the H100. If your GPU has less memory, it's expected that only a per-device batch size of 1 may be feasible.
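For intuition on why AnyRes is so memory-hungry, here is a rough back-of-envelope, assuming LLaVA-NeXT's defaults (a 336 px CLIP-ViT-L/14 vision tower, so each 336x336 tile yields (336/14)^2 = 576 visual tokens, with AnyRes adding up to four high-resolution tiles on top of the base image); the exact count depends on the grid configuration chosen for each image:

```python
# Back-of-envelope token count for one AnyRes image (LLaVA-NeXT defaults
# assumed: 336px CLIP-ViT-L/14 backbone, up to a 2x2 high-res grid).
tokens_per_tile = (336 // 14) ** 2            # 576 tokens per 336x336 tile
num_tiles = 1 + 4                              # base image + 4 AnyRes tiles
visual_tokens = tokens_per_tile * num_tiles    # ~2880 visual tokens per image
print(visual_tokens)
```

Compared to the 576 tokens of a single low-res image, that is roughly a 5x increase in sequence length per image, which explains the tight per-device batch size.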

However, the effective batch size is not the same as the per-device batch size. We use the GradCache technique to scale the effective batch size to 2K or even larger; passing `--grad_cache True` enables it. (The README contains the full commands.) Please let me know whether this works.
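For readers unfamiliar with the trick, here is a minimal PyTorch sketch of the idea behind GradCache (Gao et al., 2021), not this repository's implementation; `encoder`, `queries`, `targets`, and `chunk_size` are placeholder names, and the InfoNCE temperature is omitted for brevity:

```python
# Minimal sketch of the GradCache idea: split a large contrastive batch into
# memory-sized chunks, cache all embeddings with a no-grad pass, compute the
# full-batch loss and the gradient w.r.t. each embedding, then replay each
# chunk with grad enabled and backprop the cached embedding-gradients.
import torch
import torch.nn.functional as F

def grad_cache_step(encoder, queries, targets, chunk_size, optimizer):
    q_chunks = queries.split(chunk_size)
    t_chunks = targets.split(chunk_size)

    # 1) No-grad forward over every chunk; only embeddings are kept,
    #    so activation memory never exceeds one chunk's worth.
    with torch.no_grad():
        q_reps = torch.cat([encoder(c) for c in q_chunks])
        t_reps = torch.cat([encoder(c) for c in t_chunks])

    # 2) Full-batch InfoNCE loss on the cached embeddings only.
    q_reps = q_reps.detach().requires_grad_()
    t_reps = t_reps.detach().requires_grad_()
    logits = q_reps @ t_reps.T
    labels = torch.arange(len(q_reps), device=logits.device)
    loss = F.cross_entropy(logits, labels)
    loss.backward()  # fills q_reps.grad / t_reps.grad, not encoder grads

    # 3) Replay each chunk with grad enabled, injecting the cached gradient;
    #    encoder gradients accumulate chunk by chunk.
    for c, g in zip(q_chunks, q_reps.grad.split(chunk_size)):
        encoder(c).backward(gradient=g)
    for c, g in zip(t_chunks, t_reps.grad.split(chunk_size)):
        encoder(c).backward(gradient=g)

    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```

The resulting encoder gradients match those of a single full-batch step, so the contrastive loss sees the full 2K+ batch even though only one chunk's activations are ever live in memory.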

@kimwongyuda
Author

kimwongyuda commented Dec 27, 2024

Without GradCache, the batch size limit is 2 to 4, as you said above.

Then how did you reach the batch size of 256 in Table 3 of the paper on 8x H100? The Table 3 setup doesn't appear to use GradCache (basic setting).

Is it due to the model size difference between Phi-3.5-V and LLaVA-NeXT, or did you use gradient accumulation?

Thank you.

@XMHZZ2018
Contributor

Hi @kimwongyuda, all the experiments use GradCache to scale up the batch size. (It's the default setting.)

@kimwongyuda
Author

@XMHZZ2018
Sorry. I misunderstood. Thank you so much for your response.
