Releases: allenai/OLMo

v0.2.0

10 Dec 06:43

What's new

Added 🎉

  • GPT-based model.
  • Tokenizer and data pre-processing pipeline.
  • Training script.
  • Triton-based FlashAttention.

Commits

e801af8 add release proc
e643f5e update pyproject
dbc8177 Bump version to v0.2.0 for release
e99dbe5 Merge pull request #391 from allenai/hf-olmo-new
a120ab2 Merge pull request #380 from allenai/shanea/storage-cleaner-download-upload
4e849e4 Merge pull request #390 from allenai/shanea/storage-cleaner-archive-fix-2
1dbc346 Merge pull request #378 from allenai/shanea/storage-cleaner-cached-path
22cefa2 Merge pull request #389 from allenai/shanea/add-r2-scheme
ac01778 fix
6c79c63 add option to only unshard model
d1c185b Merge pull request #387 from allenai/epwalsh/dist-init
e30d29f Merge pull request #364 from allenai/shanea/storage-cleaner
ff883e5 Merge pull request #385 from allenai/epwalsh/max-duration-tokens
e16e606 Merge pull request #383 from allenai/epwalsh/start-new-epoch

v0.1.1

27 Nov 01:04

What's new

Commits

v0.1.0

27 Nov 00:49

What's new

Added 🎉

  • GPT-based model.
  • Tokenizer and data pre-processing pipeline.
  • Training script.
  • Triton-based FlashAttention.

Commits

f1ba78e moving readme to notes
6c94994 Bump version to v0.1.0 for release
f09a500 Add a "constant" LR scheduler (#376)
dcdadc5 Merge pull request #377 from allenai/Muennighoff/split-model-comps
80b081b Merge pull request #374 from allenai/epwalsh/threaded-data-loading
9c8e67e Merge pull request #373 from allenai/chore/paths
1f51fec Merge pull request #375 from allenai/Muennighoff/move-torch-utils
9d5aa11 Merge pull request #370 from allenai/CheckpointLoading
38be6a7 Merge pull request #372 from allenai/epwalsh/optim-state-fix
c205912 Fix how we update grad_norm_exp_avg (#371)
9320f9b Fix unsharding local checkpoints w/ torch 2.1 (#369)
b8a174f Merge pull request #367 from allenai/FacePalm
13548fd Merge pull request #365 from allenai/wrap_and_shard
6c0e419 Add gradient clipping warmup (#363)
0afafd6 Fix stale links in README, scripts cleanup (#359)
42dba3c remove data team's stuff (#357)
4bb6966 consolidate Python configs into pyproject.toml, other clean up (#353)
a952f44 minor fixes to kempner docs (#354)
026793e Merge pull request #347 from allenai/epwalsh/block-groups-load-fix
62fc2fe Add two more FSDP wrapping strategies (#355)
4ccf2bd Merge pull request #346 from allenai/shanea/llama-block
da91f34 Merge pull request #317 from allenai/Llama
fd2425f Adds a YAML validator to automatically find the last checkpoint (#348)
1099942 Upload profiler data to remote save folder (#338)
db0756f Merge pull request #335 from allenai/Kempner
5c64338 Merge pull request #343 from allenai/ActivationCheckpointing
558102e Merge pull request #342 from allenai/S3Client
cd73387 Add option to FSDP wrap by groups of blocks (#340)
c1a4519 Fix dtype casting on CPU (#339)
104d1ce Move remaining checkpointing logic to Checkpointer class (#331)
a465caa Merge pull request #337 from allenai/UnshardSkipKeys
4980bad set mcli time limit to null
f974a1d update mitch ish configs
07404f8 Merge pull request #308 from allenai/fine-grained-metrics
4644ff5 Lazily init s3 client (#333)
809fe9d Load state dicts to CPU (#328)
1bff308 ensure bias is created in fp32 (#327)
d4744d0 Bring back global gradient clipping and improve speed of collecting metrics (#326)
54572d3 Add stop_at config option
e63b389 Fix SDP NaN bug (#323)
fddded5 Features to match OpenLM (#302)
d2e84fe Refactor checkpointing, bring back legacy sharded checkpointing as the default (#316)
fed4cf3 Merge pull request #311 from allenai/pass-thru-model-kwargs
a5cd0e6 Merge pull request #304 from allenai/ppl-suite-v3
536d029 Merge pull request #306 from allenai/keep-instance-info
0b5f68d Merge pull request #314 from allenai/ResetOptimizerState
e8bd122 Merge pull request #315 from allenai/MemoryEnvVar
da1f0b8 Merge pull request #313 from allenai/NanCheck
94133da Fixes pyspy script
602968a New-style checkpointing (again) (#307)
973090f implement bytes range for GS
18e061d Merge pull request #303 from allenai/shanea/fix-leftover-data-partitioning
0a1455b Merge pull request #301 from allenai/shanea/fix-s3-keyerror-failures
e7b92a6 comment
6ebd5d3 Add configs for v1.5 mix
8e2b8be Merge pull request #297 from allenai/PerfTests
62dde55 Make resource_path() more robust
900544e Prepare 7B config for MCLI (#295)
309bf84 Merge pull request #294 from allenai/petew/linear-schedule
91f499b Ignore warnings from urllib3, don't print config when it's huge
012e97f Merge pull request #290 from allenai/torch2.1init
aec449c update mcli config
27dd512 MCLI configs (#286)
a2b369a Merge pull request #279 from allenai/petew/train-metrics
5ad0d8c Merge pull request #282 from allenai/rsqrt
cc787ed Merge pull request #277 from allenai/shanea/add-truncated-normal-init
fabda71 Merge pull request #274 from allenai/petew/layer-norm
70a3f4c Merge pull request #280 from allenai/petew/reduce-dtype
2a7f694 Merge pull request #278 from allenai/update-hf-olmo-config
ef85d5c Merge pull request #265 from allenai/LayerNormAffine-ManualLayerNorm-Profiling
2df922b Merge pull request #276 from allenai/petew/sys-metrics
921c254 Merge pull request #275 from allenai/simplify-eff-benchmark
400a1d2 Minor cleanup of grad clipping (#273)
18f3459 fix updating grad_norm_exp_avg (#272)
54dbd48 Merge pull request #238 from allenai/inference-efficiency-pentathlon
95555f4 Refactor how we clip gradients and collect optimizer metrics (#261)
6cc09fe Merge pull request #271 from allenai/PythonProfiling2-UnwindingChanges
41b0663 Merge pull request #269 from allenai/PythonProfiling
2eedf07 Fix speed issue on LUMI with 7B model (#270)
d2abecd Merge pull request #267 from allenai/v2-pii-tagging
5b4c68e fix isort config
c8a2700 Merge pull request #253 from allenai/SavedTokenizer
26e17c3 Merge pull request #264 from allenai/LayerNormAffine-ManualLayerNorm-TurnedOffForSafety
a49f4ec Make Dropout a no-op when p=0.0 (#259)
a33dbb0 make flake8 happy
6b977d0 handle race conditions when saving to NFS on cirrascale (#255)
b4a1491 Merge pull request #250 from allenai/LayerNormAffine
4205a84 Merge pull request #257 from allenai/FasterGlobalIndices
e46b988 fix saving unsharded checkpoints
5fff93a Merge pull request #251 from allenai/soldni/fix-s2-fos
af0a584 Merge pull request #248 from allenai/TokenizerFromFile
7fbdb1c finish W&B runs quietly
9071816 Training improvements (#239)
642d0fa Add support for remote checkpoints and train data files (#237)
e350fd3 Add option to restart with new base LR (#236)
3ef79e1 Merge pull request #230 from allenai/eval-streamline
51a8a00 load state dict on gpu
3e8163e improve config resolution
7bd0ed2 medium script update
27d3538 add V1 mix small+medium configs (#211)
907e38b wait on all ranks until final ckpt dir exists (#235)
2118db5 Merge pull request #232 from allenai/ablations/soldni-gantry
698f859 Added the shuffling story
5508c04 Use numpy for shuffling instead of torch (#231)
952819b Don't reshuffle eval data each "epoch" (#229)
87f6a79 Merge pull request #223 from allenai/soldni/olmo-mixing
e64cf42 Merge pull request #227 from allenai/hf-olmo-tok
970a77c add more tests for memmap dataset
ba84b0b default to saving data indices
d02d4f1 Merge pull request #221 from allenai/faster-convert
acf372e Merge pull request #220 from allenai/hf-integration
43c29d9 Merge pull request #219 from allenai/iterable-dataset-memory-efficient
d3d00f1 Merge pull request #217 from allenai/soldni/lucy-fix
7c866c9 Merge pull request #216 from allenai/petew-cache-attn
05c6d53 clean up
fd1cfe8 Merge pull request #213 from allenai/llm-inference
ccb3869 Merge pull request #212 from allenai/gopher-fix
a80cdc1 Merge pull request #210 from allenai/soldni/filters_improvements
66c4936 fix c4-medium config
fde42f9 Merge pull request #194 from allenai/default-2x-batch-size
ab0b967 Merge pull request #209 from allenai/olmo-mix-1
b376486 Merge pull request #200 from allenai/c4-gopher-dedupe
b1584f9 Merge pull request #207 from allenai/petew-no-par-block
96f8817 Merge pull request #208 from allenai/soldni/tok_sample_code
a244f3a Merge pull request #203 from allenai/error-handling
86060d4 Merge pull request #199 from allenai/packed-evals
992838b Merge pull request #205 from allenai/hatespeech-nsfw-mixers
186fe1b Merge pull request #204 from allenai/nishant_pi_count_ablation
0b55217 Merge pull request #188 from allenai/ft-tagger-dataset
6a36cdf Merge pull request #161 from allenai/AkshitaB-stack-ablations
4074e42 Merge pull request #201 from allenai/soldni/tok_sample_code
58ad163 Merge pull request #197 from allenai/soldni/local_cache
c642d4f Merge pull request #198 from allenai/nishant_add_pi_counts_filter
eccf18c Merge pull request #162 from allenai/c4-gopher
49f9a0e Merge pull request #191 from allenai/save-indices
e89c61f Merge pull request #195 from allenai/code-eval
d33ea74 Merge pull request #193 from allenai/soldni/neox
9bfcde3 Fix secrets name in LUMI.md (#190)
b6fa4d9 add PPL evaluators to medium config
484b089 Merge pull request #189 from allenai/docs
83b39b5 remove unnused thread lock
ebc07f4 Merge pull request #187 from allenai/soldni/tokfile
ed7c0e8 ensure drop_last=True with train data
4d986ed fix speed monitor
2437cdf Merge pull request #181 from allenai/v0-small
3478cb0 Merge pull request #186 from allenai/soldni/tok_improve
348ed33 Merge pull request #185 from allenai/soldni/falcon
b79c3b7 Speed up preprocessing script (#177)
cb2c9cd Merge pull request #184 from allenai/format
72d4ff2 Merge pull request #183 from allenai/attr-merge
47c4ab9 More checkpointing improvements (#182)
a97d1f6 Merge branch 'main' of https://github.com/allenai/LLM into main
2567261 handle empty logzio token
0507c2d Restore dataset correctly when world size changes (#176)
f8eeb22 Merge pull request #178 from allenai/soldni/preview
9b21211 Merge pull request #179 from allenai/fix-tests
0d487c2 Merge pull request #174 from allenai/v0-small
4737c53 Merge pull request #175 from allenai/ClearGPUsFirst
1bdeae6 Merge pull request #173 from allenai/span-fix
87476f7 Prepare 1B baseline run (#170)
434cf67 Merge pull request #172 from allenai/fix-tests
d2442d6 add more tests
6d29ee4 fix dataloader max steps
7ffe204 Merge pull request #171 from allenai/soldni/decontamination-v2
0a485d2 Merge pull request #168 from allenai/DockerImage
27a3f3a Don't be so noisy during startup
1fba808 Merge pull request #165 from allenai/c4-medium-2x-bz
9da0e4b Merge pull request #167 from allenai/v1-small-config
391091c Merge pull request #166 from allenai/soldni/ablations_v2
9020c91 Merge pull request #164 from allenai/add-no-grad
c25f54b Merge pull request #163 from allenai/soldni/ablations
41a9969 syncronize time limits
1cd0b4b Merge pull request #134 from allenai/dependabot/pip/mypy-gte-1.0-and-lt-1.4
2a4031e Merge pull request #160 from allenai/soldni/filter-speedup
...
