Hey, sorry it's been a while. We are currently discussing internally how to provide better support for this, and for pre-training/SFT in general. We plan to extend support to local and cloud storage (S3 etc.).
Please check that this issue hasn't been reported before.
Expected Behavior
Loading a local dataset file should work the same as loading the dataset from the HF Hub.
Current behaviour
Loading fails and asks for a custom .py script.
Steps to reproduce
Load a local JSONL file:
pretraining_dataset: /home/sicarius/somefile.jsonl
type: pretrain
Config yaml
Possible solution
Treat a local file the same way as a dataset loaded from the HF Hub.
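A rough sketch of what that could look like, not axolotl's actual code. The helper name `resolve_dataset_source` and the extension table are hypothetical; the only assumption about a real library is that Hugging Face `datasets.load_dataset` accepts the packaged builders `json`, `csv`, `parquet` and `text` together with a `data_files` argument.

```python
import os

# Hypothetical mapping from file extension to a packaged `datasets` builder.
EXT_TO_BUILDER = {
    ".json": "json",
    ".jsonl": "json",
    ".csv": "csv",
    ".parquet": "parquet",
    ".txt": "text",
}


def resolve_dataset_source(path):
    """Return (name, kwargs) to pass to datasets.load_dataset.

    Paths with a known data-file extension are treated as local files;
    anything else is passed through unchanged as a Hub dataset id.
    """
    ext = os.path.splitext(path)[1].lower()
    if ext in EXT_TO_BUILDER:
        return EXT_TO_BUILDER[ext], {"data_files": path}
    return path, {}


# Possible usage (requires the `datasets` package, so left as a comment):
#   from datasets import load_dataset
#   name, kwargs = resolve_dataset_source("/home/sicarius/somefile.jsonl")
#   ds = load_dataset(name, split="train", streaming=True, **kwargs)
```

With something like this, the `pretraining_dataset` value from the config could be either a Hub id or a local path, and no custom .py script would be needed for the local case.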
Which Operating Systems are you using?
Python Version
3.10
axolotl branch-commit
latest release
Acknowledgements