[Request] Release BF16 version of 0.6B 1024x model #109

Open
Bocchi-Chan2023 opened this issue Dec 22, 2024 · 4 comments
Labels: working (working on this issue)

Comments

@Bocchi-Chan2023

Hi, I was so impressed with the quality of your 0.6B 1024x model that I tried to train it, but the training broke down immediately. Would it be possible to release a BF16 version, as you did for the other sizes?

@lawrence-cj
Collaborator

Sure. I have been blocked by other things recently, but I will release it later.

@lawrence-cj added the "working" label on Dec 22, 2024
@Bocchi-Chan2023
Author

Bocchi-Chan2023 commented Dec 22, 2024

Sure. I have been blocked by other things recently, but I will release it later.

Thanks, I'm looking forward to it.
Can you tell me how to make a BF16 model from scratch (or how to create an empty model)? I'd like to try training one too.
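
For context on what a BF16 checkpoint is numerically, here is a minimal sketch of rounding a float32 value to BF16. It is not the Sana conversion procedure, which this thread does not describe; the function names are hypothetical and only the Python standard library is used. In practice a framework call such as PyTorch's `Tensor.to(torch.bfloat16)` performs this cast for every weight.

```python
import struct


def float32_to_bf16_bits(x: float) -> int:
    """Return the 16-bit BF16 pattern nearest to x (round-to-nearest-even).

    Hypothetical helper for illustration. BF16 keeps the sign bit, the
    8 exponent bits, and the top 7 fraction bits of a float32.
    """
    # Reinterpret the value as its 32-bit float32 bit pattern.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]

    # NaN: truncate, but force a fraction bit so it stays NaN.
    if (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF):
        return ((bits >> 16) | 0x0040) & 0xFFFF

    # Round to nearest, ties to even, then drop the low 16 bits.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bf16_bits_to_float(b: int) -> float:
    """Expand a 16-bit BF16 pattern back to a Python float."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]


if __name__ == "__main__":
    for value in (1.0, 3.14159265, 1e-3):
        b = float32_to_bf16_bits(value)
        print(f"{value!r} -> 0x{b:04X} -> {bf16_bits_to_float(b)!r}")
```

BF16 keeps the full float32 exponent range and gives up fraction bits, so a cast loses precision but does not change the order of magnitude of any finite weight (the very largest float32 values can round up to infinity).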

@lawrence-cj
Collaborator

Does the training guidance work for you?

https://github.com/NVlabs/Sana#-3-how-to-train-sana

@Bocchi-Chan2023
Author

Bocchi-Chan2023 commented Dec 24, 2024

Does the training guidance work for you?

https://github.com/NVlabs/Sana#-3-how-to-train-sana

It seems to be working, but unfortunately I ran out of memory (OOM) during training.
Maybe I'll try it on RunPod.
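
As a rough guide to why training can run out of memory, the sketch below estimates the memory needed for the weights, gradients, and optimizer state alone. Everything in it is an assumption rather than something stated in this thread: a nominal 0.6 billion parameters (taken from the model name), an Adam-style optimizer with two float32 states per parameter, and hypothetical function and parameter names. Activations, the text encoder, and the autoencoder are not counted, so real usage will be higher.

```python
def training_state_gib(
    num_params: float,
    weight_bytes: int,
    grad_bytes: int,
    optimizer_state_bytes: int,
) -> float:
    """Estimate GiB needed for weights + gradients + optimizer state.

    Hypothetical helper: activations and any frozen auxiliary models
    are deliberately excluded.
    """
    bytes_per_param = weight_bytes + grad_bytes + optimizer_state_bytes
    return num_params * bytes_per_param / 2**30


if __name__ == "__main__":
    nominal_params = 0.6e9  # assumption: nominal size from the "0.6B" name

    # float32 weights and gradients, two float32 Adam states per parameter.
    fp32 = training_state_gib(nominal_params, 4, 4, 8)

    # BF16 weights and gradients, optimizer states still kept in float32.
    bf16 = training_state_gib(nominal_params, 2, 2, 8)

    print(f"float32 training state: {fp32:.2f} GiB")
    print(f"BF16 training state:    {bf16:.2f} GiB")
```

Under these assumptions the optimizer state dominates, so switching the weights to BF16 alone saves less memory than the halved checkpoint size might suggest.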
