You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I recently switched from using AutoExecutor to subclassing from SlurmExecutor directly to use a custom entrypoint, and had to modify executor parameters. It's a bit awkward that the parameters have different names (e.g. mem vs mem_gb) but this is relatively easy to catch since using the wrong name causes an error which prints all known names.
However, what is more challenging is that the units for memory also change and this isn't very obvious. I think it's because the logic which appends MB or GB to the memory amount (https://github.com/facebookincubator/submitit/blob/main/submitit/slurm/slurm.py#L525) only gets called when also converting the parameter names via AutoExecutor.
I think that the correct behavior is either to always convert the memory in the same way, or at least warn somewhere if no units are provided but are expected. Happy to PR a fix if there is agreement on how to proceed!
The text was updated successfully, but these errors were encountered:
I recently switched from using
AutoExecutor
to subclassing fromSlurmExecutor
directly to use a custom entrypoint, and had to modify executor parameters. It's a bit awkward that the parameters have different names (e.g.mem
vsmem_gb
) but this is relatively easy to catch since using the wrong name causes an error which prints all known names.However, what is more challenging is that the units for memory also change and this isn't very obvious. I think it's because the logic which appends MB or GB to the memory amount (https://github.com/facebookincubator/submitit/blob/main/submitit/slurm/slurm.py#L525) only gets called when also converting the parameter names via
AutoExecutor
.I think that the correct behavior is either to always convert the memory in the same way, or at least warn somewhere if no units are provided but are expected. Happy to PR a fix if there is agreement on how to proceed!
The text was updated successfully, but these errors were encountered: