Other pre-trained E-RADIO variants? #62
Comments
Hi Collin, so bad news/good news on this one. Unfortunately, the only pretrained E-RADIO we have right now is the
Hi Mike, that sounds great, thank you. If a pre-trained xxxtiny model were available anytime in June, or even early to mid July, that would be a huge help for me personally. For now I can use the existing e-radio model. I'm assuming the new E-RADIO model will include an update to the source code / model definition so I can track down the changes that were made, but if not I would really appreciate that as well.
Hi, my query is kinda related to this so I'll just put it here: are there any particular issues with training E-RADIO using bfloat16? I want to fine-tune E-RADIO for downstream tasks using bfloat16 and I'm trying to figure out if this is feasible or not.
Hello, you should be good with E-RADIO and bfloat16. I did notice some issues with float16 (NaNs), but none with bfloat16, which has a wider range.
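For anyone wondering why the wider range matters, here is a rough standard-library sketch (not taken from the RADIO code). It assumes bfloat16 can be approximated by truncating a float32 to its top 16 bits, and the helper names `to_bfloat16` and `fits_in_float16` are made up for illustration. A value just past the float16 limit cannot be stored in float16 at all, while the emulated bfloat16 keeps it finite at lower precision.

```python
import struct

FLOAT16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def to_bfloat16(x: float) -> float:
    """Emulate bfloat16 by keeping the top 16 bits of the float32 encoding.

    Assumption: plain truncation, whereas real hardware usually rounds.
    """
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFF0000))[0]


def fits_in_float16(x: float) -> bool:
    """Return True if x can be packed as a finite float16 value."""
    try:
        struct.pack(">e", x)  # "e" is the half-precision format code
        return True
    except OverflowError:
        return False


activation = 1e5  # hypothetical activation magnitude above FLOAT16_MAX
print(fits_in_float16(activation))  # float16 cannot represent this value
print(to_bfloat16(activation))      # bfloat16 keeps it finite, with coarser precision
```

In an actual float16 training run the overflow shows up as inf, which can then turn into NaN in later operations, whereas bfloat16 shares the float32 exponent range and gives up mantissa bits instead.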
Hello,

In radio/eradio_model.py I see quite a few model definitions for the hybrid/e-radio variants:

fastervit2_large_fullres (ws7)
fastervit2_large_fullres_ws8
fastervit2_large_fullres_ws16
fastervit2_large_fullres_ws32
eradio_xxxtiny (ws16)
eradio_xxxtiny_8x_ws12
eradio_xxxtiny_8x_ws16

And then I see the eradio model is a wrapper around fastervit2_large_fullres_ws16, which matches what I see in radio/common.py.

Are there any other pre-trained e-radio models available right now or is it just this one "e-radio_v2"? I'm particularly interested in the (best performing) eradio_xxxtiny variant (probably one of the ws16 versions for a fair comparison). I'm assuming the older "eradio_v1" version is the same architecture as "e-radio_v2", but please correct me if I'm wrong.

Thanks!