RADIOv2.1 still much better than v2.5 at image resolutions below 700px? #86

Open
tcourat opened this issue Sep 3, 2024 · 3 comments

@tcourat

tcourat commented Sep 3, 2024

Hi

According to your technical report, RADIOv2.1 still seems to be much better than v2.5 for images with a resolution below 700px. What caused this? Is it only due to the smaller architecture (L<H)? If so, do you plan to release an H version too, or do you advise to keep using v2.1 for "small" images?

Thanks

@mranzinger
Collaborator

I think it depends on your use case. v2.5-L is competitive with v2(-H) at the 432px resolution for the LLaVA 1.5 metrics. Same goes for semantic segmentation. v2 is definitely still better at summary tasks (e.g. classification), and that holds up until the mode switch.

> What caused this? Is this only due to having a smaller architecture (L<H)?

Yes, that's exactly what is going on. ViT-L has about half as many parameters as ViT-H. ViT-B has something like 1/7th.
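For a rough sense of those ratios, here is a small sketch using the approximate parameter counts reported for the standard ViT variants in the original ViT paper (about 86M, 307M and 632M). These figures are an assumption brought in for illustration; the exact counts of the RADIO checkpoints may differ somewhat.

```python
# Approximate parameter counts in millions for the standard ViT variants.
# Assumption: taken from the original ViT paper, not measured on RADIO checkpoints.
VIT_PARAMS_M = {"ViT-B": 86, "ViT-L": 307, "ViT-H": 632}


def ratio_to_h(name: str) -> float:
    """Return the parameter count of `name` as a fraction of ViT-H."""
    return VIT_PARAMS_M[name] / VIT_PARAMS_M["ViT-H"]


for name in VIT_PARAMS_M:
    print(f"{name}: {ratio_to_h(name):.2f} of ViT-H")
```

With these numbers ViT-L lands just under half of ViT-H and ViT-B close to one seventh, which matches the rough figures above.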

> In this case do you plan to release a H version too

Yes, we have an H model that's being trained.

> or do you advise to keep using v2.1 for "small" images?

For small images, I would recommend trying both if you can spare the compute.

@mranzinger
Collaborator

Hello, sorry this took so long. We just released radio_v2.5-h, which is an improvement over v2.5-l, and a big improvement over 2.1-h. Just make sure you run torch.hub.load with the force_reload=True flag the first time you try to run with the new model.
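A minimal sketch of what that first load might look like. The hub repository name `NVlabs/RADIO`, the entry point `radio_model`, and the `version` / `progress` keyword arguments are assumptions here (check the project README for the exact call); `radio_hub_kwargs` and `load_radio` are hypothetical helper names. `force_reload` itself is a standard `torch.hub.load` argument that discards the cached copy of the hub repository and downloads it again.

```python
def radio_hub_kwargs(version: str, first_run: bool) -> dict:
    """Build keyword arguments for torch.hub.load.

    force_reload is only set on the first run after a new release, so the
    cached copy of the hub repository is refreshed once and reused afterwards.
    """
    return {"version": version, "progress": True, "force_reload": first_run}


def load_radio(version: str = "radio_v2.5-h", first_run: bool = True):
    """Load a RADIO model through torch.hub (downloads code and weights)."""
    import torch  # imported lazily so radio_hub_kwargs works without torch

    # Assumption: repository "NVlabs/RADIO" and entry point "radio_model".
    return torch.hub.load(
        "NVlabs/RADIO", "radio_model", **radio_hub_kwargs(version, first_run)
    )
```

After the first successful call, `load_radio(first_run=False)` should reuse the cached repository instead of downloading it again.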

@Revist

Revist commented Oct 10, 2024

Hi mranzinger,

I really like your work :)

Could you share how many GPUs were used, and for approximately how long, to train the largest model?
