Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unexpected behavior of memory specification between AutoExecutor and SlurmExecutor #1761

Open
mshvartsman opened this issue Jan 31, 2024 · 0 comments

Comments

@mshvartsman
Copy link

mshvartsman commented Jan 31, 2024

I recently switched from using AutoExecutor to subclassing from SlurmExecutor directly to use a custom entrypoint, and had to modify executor parameters. It's a bit awkward that the parameters have different names (e.g. mem vs mem_gb) but this is relatively easy to catch since using the wrong name causes an error which prints all known names.

However, what is more challenging is that the units for memory also change and this isn't very obvious. I think it's because the logic which appends MB or GB to the memory amount (https://github.com/facebookincubator/submitit/blob/main/submitit/slurm/slurm.py#L525) only gets called when also converting the parameter names via AutoExecutor.

I think that the correct behavior is either to always convert the memory in the same way, or at least warn somewhere if no units are provided but are expected. Happy to PR a fix if there is agreement on how to proceed!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant