A few optimizations that speed up training #627
Conversation
Hey, @hvoss-techfak. First, let's see if CI is green.
I forgot to add the imports. One second.
Unfortunately, CI is not green, so please run the tests locally (you will find the commands in the Makefile; you can also check the Contributing guide).
@hvoss-techfak double-check the linters, please.
The imports are now fixed, but the tests still give me an error with "labels = torch.IntTensor(labels)", as apparently not all triplet inputs are given as integers. I'll run the tests locally and check if I can fix this.
I relaxed the IntTensor to a normal tensor, and now all the short tests pass.
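For reference, a minimal sketch of the kind of change discussed here; the snippet is illustrative rather than the exact code from the repository, and only the quoted "labels = torch.IntTensor(labels)" line is taken from the conversation.

```python
import torch

labels = [0, 1, 1, 2]  # in the failing tests, labels were not always plain Python ints

# Before: strict integer conversion, which assumes integer inputs
# labels = torch.IntTensor(labels)

# After: relaxed conversion to a normal tensor; PyTorch infers a suitable dtype
labels = torch.tensor(labels)
```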
Let's try one more time.
Double-check the linters, please.
That linter failure seems to be a formatting issue in inbatch_hard_tri.py. It is fixed now.
By the way, you can run the command "make run_precommit"; it will help fix the linter issues.
@hvoss-techfak Seems like the tests are fine now, so I will check the code soon. Could you run all_tests locally, please? Some may fail (because of environment, OS, and so on), but it should be easy to tell whether they are related to your changes or not.
All tests are now done. I had a few that failed because of my DDP setup and some import errors, but apart from that, all other tests passed.
Hey, @hvoss-techfak
Thank you for your contribution!
I left a couple of comments about removing redundant code.
@hvoss-techfak check out the comments, please.
All changes are done.
@DaloroAT thank you for reviewing it!
@hvoss-techfak welcome to the contributors; feel free to work on any issue you like :)
Thanks for creating such a great project.
I'm currently training on a rather large dataset (500+ GB) of audio data and found a few optimizations. The pre-processing in the balance sampler now takes ~15 seconds instead of 10+ minutes, and the sampling function no longer uses a Python for loop, relying only on torch functions; that second change made a big difference for me (from 2 iterations/s to 3.5 iterations/s).
Both changes produce exactly the same results as before.
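To illustrate the second point, here is a hedged sketch of loop-free balanced sampling with pure torch operations. It assumes a sampler that picks a number of distinct labels and a fixed number of examples per label; the function name, arguments, and toy data are illustrative and not the project's actual API or the exact PR code.

```python
import torch

def sample_balanced_batch(labels: torch.Tensor, n_labels: int, n_instances: int) -> torch.Tensor:
    """Pick n_labels distinct labels and n_instances random samples of each,
    using only torch ops (assumes every label occurs at least n_instances times)."""
    unique_labels = torch.unique(labels)
    chosen = unique_labels[torch.randperm(len(unique_labels))[:n_labels]]

    # mask[i, j] is True where sample j carries the i-th chosen label
    mask = labels.unsqueeze(0) == chosen.unsqueeze(1)  # (n_labels, n_samples)

    # random scores, pushed to -inf outside each label's samples,
    # then take the top n_instances per row instead of looping over labels
    scores = torch.rand(mask.shape)
    scores[~mask] = float("-inf")
    idx = scores.topk(n_instances, dim=1).indices  # (n_labels, n_instances)
    return idx.reshape(-1)

# Example usage with toy labels
dataset_labels = torch.tensor([0, 0, 0, 1, 1, 1, 2, 2, 2])
batch_indices = sample_balanced_batch(dataset_labels, n_labels=2, n_instances=2)
```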