It seems that RADIOv2.1 is still much better than v2.5 for images with a resolution smaller than 700px, according to your technical report. What caused this? Is it only due to the smaller architecture (L < H)? If so, do you plan to release an H version too, or do you advise sticking with v2.1 for "small" images?
Thanks
I think it depends on your use case. v2.5-L is competitive with v2(-H) at the 432px resolution for the LLaVA 1.5 metrics. Same goes for semantic segmentation. v2 is definitely still better at summary tasks (e.g. classification), and that holds up until the mode switch.
> What caused this? Is it only due to the smaller architecture (L < H)?
Yes, that's exactly what is going on. ViT-L has about half the number of parameters as ViT-H. ViT-B is something like 1/7th.
> If so, do you plan to release an H version too?
Yes, we have an H model that's being trained.
> Or do you advise sticking with v2.1 for "small" images?
For small images, I would recommend trying both if you can spare the compute.
Hello, sorry this took so long. We just released radio_v2.5-h, which is an improvement over v2.5-l and a big improvement over v2.1-h. Just make sure you pass the force_reload=True flag to torch.hub.load the first time you load the new model, so the cached hub repo is refreshed and the new entry is picked up.
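For anyone landing here later, a minimal sketch of loading the new checkpoint via torch.hub. The `load_radio` helper is hypothetical (just a thin wrapper for this example); the repo name, the `radio_model` entry point, and the `version` string follow the NVlabs/RADIO README, so double-check them against the current README if the repo has changed:

```python
import torch


def load_radio(version: str = "radio_v2.5-h", force_reload: bool = True):
    """Load a RADIO backbone from torch.hub (hypothetical helper).

    force_reload=True makes torch.hub re-fetch the NVlabs/RADIO repo
    instead of using the local cache, so newly released model entries
    (like radio_v2.5-h) are visible the first time you request them.
    """
    return torch.hub.load(
        "NVlabs/RADIO",
        "radio_model",
        version=version,
        force_reload=force_reload,
    )


if __name__ == "__main__":
    # Downloads the repo and checkpoint on first run.
    model = load_radio()
    model.eval()
```

After the first successful load you can drop `force_reload=True` so later calls reuse the cached repo.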