[memory] Avoid storing trainer in ModelCardCallback and SentenceTransformerModelCardData #3144
Resolves #3136
Hello!
Pull Request overview
Details
Storing the trainer in ModelCardCallback and SentenceTransformerModelCardData seems to prevent cleanup, as it creates a cyclical dependency: trainer -> model -> model card data -> trainer. As a result, once the trainer and model get overridden (e.g. in https://github.com/UKPLab/sentence-transformers/blob/master/examples/training/data_augmentation/train_sts_seed_optimization.py), the old model/trainer/model_card_data don't get garbage collected automatically.
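For context, this is standard Python behaviour for reference cycles: objects that reference each other are only reclaimed by the cyclic garbage collector, not by reference counting, so they can outlive the variables that pointed at them. A minimal sketch with hypothetical stand-in classes (not the real sentence-transformers ones):

```python
import gc


class ModelCardData:
    def __init__(self):
        self.trainer = None  # will point back at the trainer


class Model:
    def __init__(self):
        self.model_card_data = ModelCardData()


class Trainer:
    def __init__(self, model):
        self.model = model
        # closes the trainer -> model -> card -> trainer cycle
        model.model_card_data.trainer = self


model = Model()
trainer = Trainer(model)
del model, trainer

# Reference counting alone cannot free the cycle; the objects (and anything
# they hold, such as CUDA tensors) linger until the cyclic collector runs.
print(gc.collect(), "objects reclaimed by the cyclic collector")
```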
I've moved a lot of components around, and now neither ModelCardCallback nor SentenceTransformerModelCardData needs to store the Trainer. Although the refactor was a bit annoying, it means that memory should now be freed when the model/trainer gets overridden or deleted.
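One generic way to break this kind of cycle, shown here purely as a sketch and not necessarily the approach taken in this PR, is to hold only a weak reference to the trainer so the callback never keeps it alive:

```python
import weakref


class ModelCardCallback:
    """Sketch only: the name matches the real callback, but this is not the
    actual implementation. A weak reference lets the callback reach the
    trainer while it still exists, without participating in a reference cycle.
    """

    def __init__(self, trainer) -> None:
        self._trainer_ref = weakref.ref(trainer)  # does not keep the trainer alive

    @property
    def trainer(self):
        trainer = self._trainer_ref()
        if trainer is None:
            raise RuntimeError("The trainer has already been garbage collected.")
        return trainer
```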
Before:
Approximate highest recorded VRAM during train_sts_seed_optimization:
After:
Approximate highest recorded VRAM during train_sts_seed_optimization:
Note that VRAM usage does still grow, albeit much more slowly, so this might not have resolved all of the issues. Having said that, because most people only create one trainer, I suspect it's not a big problem in practice.
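For anyone who wants to reproduce the measurement, a rough way to track peak VRAM across the seed-optimization loop (assuming a CUDA device; the actual training calls from train_sts_seed_optimization.py are elided and only hinted at in comments) looks like this:

```python
import gc

import torch

for seed in range(3):
    torch.cuda.reset_peak_memory_stats()

    # model = SentenceTransformer(...)
    # trainer = SentenceTransformerTrainer(model=model, ...)
    # trainer.train()
    # del trainer, model

    # Force a cycle collection so any leftover trainer/model references are
    # reclaimed before measuring, then read the peak allocation for this seed.
    gc.collect()
    torch.cuda.empty_cache()
    peak_mib = torch.cuda.max_memory_allocated() / 1024**2
    print(f"seed {seed}: peak VRAM {peak_mib:.0f} MiB")
```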