Other pre-trained E-RADIO variants? #62

Open
collinmccarthy opened this issue May 31, 2024 · 4 comments

Comments

@collinmccarthy

collinmccarthy commented May 31, 2024

Hello,

In radio/eradio_model.py I see quite a few model definitions for the hybrid/e-radio variants:

  • fastervit2_large_fullres (ws7)
  • fastervit2_large_fullres_ws8
  • fastervit2_large_fullres_ws16
  • fastervit2_large_fullres_ws32
  • eradio_xxxtiny (ws16)
  • eradio_xxxtiny_8x_ws12
  • eradio_xxxtiny_8x_ws16

And then I see the eradio model is a wrapper around fastervit2_large_fullres_ws16, which matches what I see in radio/common.py.

Are there any other pre-trained e-radio models available right now or is it just this one "e-radio_v2"? I'm particularly interested in the (best performing) eradio_xxxtiny variant (probably one of the ws16 versions for a fair comparison). I'm assuming the older "eradio_v1" version is the same architecture as "e-radio_v2", but please correct me if I'm wrong.

Thanks!

@mranzinger
Collaborator

Hi Collin,

So bad news/good news on this one. Unfortunately, the only pretrained E-RADIO we have right now is the e-radio_v2 one. However, we're putting together a training pass right now that will result in some new models: first a ViT-B/16, most likely followed by a ViT-L/16, a ViT-H/16, and a new E-RADIO. There was some demand from other people for an xxxtiny as well, so we can try to include it. What's the timeline for your work?

@collinmccarthy
Author

Hi Mike,

That sounds great, thank you. If a pre-trained xxxtiny model were available anytime in June, or even early to mid July, that would be a huge help for me personally. For now I can use the existing e-radio model. I'm assuming the new E-RADIO model will come with an update to the source code / model definition so I can track down the changes that were made; if not, I would really appreciate having that as well.

@Ali2500

Ali2500 commented Jun 7, 2024

Hi, my query is kinda related to this so I'll just put it here: are there any known issues with training E-RADIO using bfloat16? I want to fine-tune E-RADIO for downstream tasks in bfloat16 and I'm trying to figure out whether this is feasible.

@gheinrich
Collaborator

Hello, you should be good with E-RADIO and bfloat16. I did notice some issues with float16 (NaN) but none with bfloat16 and its wider range.
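
To illustrate the "wider range" point, here is a minimal sketch using only the Python standard library. It is not code from the RADIO repository. The helper names `to_float16` and `to_bfloat16` are hypothetical, and bfloat16 is emulated by truncating a float32 to its upper 16 bits (real hardware typically rounds rather than truncates). It shows a value that overflows float16 but stays finite in bfloat16, which is one plausible way large activations end up as inf and then NaN under float16.

```python
import struct


def to_float16(x: float) -> float:
    """Round-trip x through IEEE half precision (5 exponent bits).

    Values beyond the float16 range (max finite value 65504) are mapped
    to +/-inf here, mimicking overflow in a float16 computation.
    """
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        return float("inf") if x > 0 else float("-inf")


def to_bfloat16(x: float) -> float:
    """Emulate bfloat16 by keeping only the top 16 bits of a float32.

    Assumption: simple truncation, not round-to-nearest. bfloat16 keeps
    the 8 exponent bits of float32, so its range matches float32 while
    its precision drops to 7 explicit mantissa bits.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits &= 0xFFFF0000  # drop the low 16 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


if __name__ == "__main__":
    value = 70000.0  # just above the largest finite float16 value
    print("float16 :", to_float16(value))   # overflows to inf
    print("bfloat16:", to_bfloat16(value))  # finite, but coarser: 69632.0
```

The trade-off is visible in the bfloat16 result: the value stays finite but loses precision. In PyTorch, the usual way to get this behavior during fine-tuning is `torch.autocast(device_type="cuda", dtype=torch.bfloat16)`; whether that suits a given E-RADIO setup is something to verify on your own task.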
