[WIP] Implements Roberta Model #679

Draft. Wants to merge 32 commits into base: main
d6284ca
[WIP] Implements Roberta Model
Jul 30, 2024
8f7402e
Implements dynamic masking objective
prady-saligram Jul 30, 2024
670b053
Implements dynamic masked dataset
prady-saligram Jul 30, 2024
42f5404
Reintroduced accidentally deleted CausalLMDataset class
prady-saligram Jul 30, 2024
9ad06af
Everything works except stuck on the final method,
Aug 1, 2024
53fd8d2
[WIP] Re-implements MLM training objective
prady-saligram Aug 5, 2024
dcd45b2
Adds error handling and reverts LmExample class to original
prady-saligram Aug 6, 2024
6f21e0d
Testing Modifications
Aug 13, 2024
730d847
Merge branch 'stanford-crfm:main' into roberta-model
prady-saligram Aug 26, 2024
027b176
Sets RobertaConfig as model architecture and creates default config file
prady-saligram Aug 26, 2024
399e08c
Adds compute_loss to roberta and changes positional ids to begin from 0
prady-saligram Sep 1, 2024
cd4118c
Investigating precision loss over time within the model using output…
Sep 4, 2024
96522f1
Merge branch 'roberta-model' of https://github.com/JulienDarve/levant…
Sep 4, 2024
8a732e5
Model can now successfully import weights from huggingface + made att…
Sep 10, 2024
5f3d8a2
Merge branch 'roberta-training' into roberta-model-copy-2
Sep 10, 2024
6c105f5
trial
Sep 12, 2024
ab85079
update 1
Sep 12, 2024
5b97400
update 2
Sep 12, 2024
bd7d411
update 3
Sep 12, 2024
b5d8e14
update
Sep 12, 2024
8717c3f
update
Sep 12, 2024
10c130c
update
Sep 12, 2024
834d88d
update
Sep 12, 2024
47fe23b
update
Sep 12, 2024
fb5c55c
update
Sep 12, 2024
8594e79
update
Sep 12, 2024
3ae80d7
update
Sep 12, 2024
de93fc9
update
Sep 12, 2024
896af7d
update
Sep 12, 2024
0be9a83
update
Sep 12, 2024
0c94a47
update
Sep 12, 2024
7ae681d
Training works!
JulienDarve Sep 13, 2024