-
Notifications
You must be signed in to change notification settings - Fork 468
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(cli): allow users to download models from Kaggle #2002
feat(cli): allow users to download models from Kaggle #2002
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/2002
Note: Links to docs will display an error until the docs builds have been completed. ❗ 1 Active SEVsThere are 1 currently active SEVs. If your PR is affected, please view them below: ✅ No FailuresAs of commit 00493ef with merge base d9b6c79 (): This comment was automatically generated by Dr. CI and updates every 15 minutes. |
Awesome stuff!! Will be reviewing this today or tomorrow :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments/questions mostly about authentication, but overall this looks great!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will wait for CI, but LGTM!
Thanks @KeijiBranshi for your hard work on this and making it easier for torchtune users to download models from the Kaggle Hub!
Thanks so much for your help and review! Really appreciated your guidance on everything. Looks like some of my test are failing. Could've sworn I ran before pushing, but i guess it slipped through. Resolving those now 🙂 |
@joecummings success! Thanks again for all your help. Let me know if anything else needs to be done for the release process 🚀 |
87d34bb
to
00493ef
Compare
@joecummings quick fyi, I just rebased to the most recent Let us know if there's anything else we should do to get this merged 🙂 |
Thanks for the rebase. Will merge this today!! Afterwards, I'll kick off a "nightly" build so users can have immediate access to this feature even before tomorrow's build of torchtune. |
Thank you so much!! Looking forward to it. Please ping me if things break or users see issues 🙂. If I'm unresponsive for any reason, you can also email [email protected]. |
Context
What is the purpose of this PR? Is it to
Please link to any issues this PR addresses.
#1852
Changelog
What are the changes made in this PR?
--source
,--kaggle-username
, and--kaggle-api-key
--source
value (only valid options arehuggingface
andkaggle
). When--source
is not provided, defaults tohuggingface
_download_from_kaggle()
_download_from_huggingface()
with a few notable exceptions:kagglehub
requires the caller's username and api token as separate arguments tokagglehub.set_kaggle_credentials()
(these can also be set as environment variables). Since both are required, appropriate error messages are raised when one is omitted. However, since Kaggle supports anonymous downloads on public non-gated models, the option to omit both is also supported.kagglehub.model_download
are different than those raised byhuggingface_hub.snapshot_download
.kagglehub.model_download
doesn't support direct download to a caller-specified directory via--output-dir
. This feature is being worked on inkagglehub
(see [RFC] Design proposal formodel_download
custom download path Kaggle/kagglehub#179), but in the meantime a warning was just placed here instead._validate_kaggle_model_handle
) as per the requests made here. This validation is primarily used to raise warnings when downloads may not be guaranteed to work with torchtune.--source kaggle
is providedTest plan
Please make sure to do each of the following if applicable to your PR. If you're unsure about any one of these just ask and we will happily help. We also have a contributing page for some guidance on contributing.
pre-commit install
)pytest tests
run recipe tests viapytest tests -m integration_test
manually run any new or modified recipes with sufficient proof of correctnessinclude relevant commands and any other artifacts in this summary (pastes of loss curves, eval results, etc.)UX
If your function changed a public API, please add a dummy example of what the user experience will look like when calling it.
Here is a docstring example
and a tutorial example