Fix potential off-by-one error in attention mask generation #76

dibyaghosh · 2024-04-15T19:01:22Z

TL;DR: There is a small bug in the attention masking code. In practice it should not affect anyone using the released model or training their own models (unless you're using some special attention mask scheme), but we will fix it soon in an update.

The issue: If you have multiple timestep groups, the bug causes the first token of the second group to be misclassified as belonging to the first group (similarly, the first token of the third group is misclassified as belonging to the second group, and so on). If your model relies on different timestep groups not being able to attend to each other (a pretty non-standard case), this could cause undesired information leakage.
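To make the failure mode concrete, here is a minimal, self-contained sketch of this kind of off-by-one. It is not the code from octo_module.py: the group sizes, the boundary lookup via bisect, and all names (group_ends, block_mask, and so on) are assumptions made up for illustration. It only shows how treating a group's end index as inclusive instead of exclusive shifts the first token of each later group into the previous group, and which attention pairs leak as a result.

```python
from bisect import bisect_left, bisect_right

# Hypothetical layout (an assumption, not Octo's real token layout):
# three timestep groups of three tokens each.
group_sizes = [3, 3, 3]

# Exclusive end index of each group: [3, 6, 9].
group_ends = []
total = 0
for size in group_sizes:
    total += size
    group_ends.append(total)

positions = list(range(total))

# Correct: position p belongs to the first group whose exclusive end is > p.
correct_groups = [bisect_right(group_ends, p) for p in positions]

# Off-by-one: the end index is treated as inclusive, so the first token of
# every group after the first is assigned to the previous group.
buggy_groups = [bisect_left(group_ends, p) for p in positions]


def block_mask(groups):
    """mask[q][k] is True when query token q may attend to key token k.

    Here tokens may only attend within their own group, i.e. the
    non-standard scheme in which groups must not see each other.
    """
    return [[gq == gk for gk in groups] for gq in groups]


correct_mask = block_mask(correct_groups)
buggy_mask = block_mask(buggy_groups)

# Pairs the buggy mask allows but the correct mask forbids.
leaked = [
    (q, k)
    for q in positions
    for k in positions
    if buggy_mask[q][k] and not correct_mask[q][k]
]

print("correct groups:", correct_groups)
print("buggy groups:  ", buggy_groups)
print("leaked (query, key) pairs:", leaked)
```

In this toy setup, token 3 (the first token of the second group) ends up grouped with tokens 0 to 2, so those tokens can attend to each other across what should be a group boundary. A mask that does not isolate groups from each other would not expose the misclassification in this way, which is consistent with the bug being harmless for the standard configuration.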

For most people (if you are using the released model checkpoints or our config for pretraining), it should not affect any of your use cases. There might be some unexpected behavior if you set readouts to a non-standard value in

octo/octo/model/octo_module.py

Line 91 in bd930f9

readouts: Optional[Sequence[str]] = None,


Fix potential off-by-one error in attention mask generation

c4c222a

dibyaghosh mentioned this pull request Apr 15, 2024

Incorrect attention mask computation #75

Open

WenchangGaoT pushed a commit to WenchangGaoT/octo1 that referenced this pull request May 10, 2024

Merge pull request octo-models#76 from rail-berkeley/dibya-update-val…

023d238

…idation Make Validation Metrics more meaningful on RTX
