I’m not too familiar with GNU parallel or Sage, but this looks like it would work well as a Job Array, which Slurm supports.
Slurm thinks in terms of tasks. For you, each task would be an individual Sage process. Each task in the array runs the same script, but each can do something different in that script based on its task ID, such as running on a different parameter or set of parameters. In my experience, creating a single task for each parameter is a bad idea: it will bog down the scheduler and slow scheduling down for everyone. You also end up spending a lot of time on startup (starting your program, loading packages if needed, and so on), which adds up if you have a lot of tasks. It’s best to batch up your parameters so that you have a fixed number of tasks that you pick, say 4, each of which iterates through a number of parameters.
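To make the batching idea concrete, here is a minimal Python sketch of one way to divide a parameter list among a fixed number of tasks. The function name `params_for_task` and the parameter values are made up for illustration, and the round-robin split is just one reasonable choice, not necessarily what any particular cluster's examples use.

```python
def params_for_task(params, task_id, num_tasks):
    """Return the subset of params that one array task should process."""
    # Round-robin split: with 4 tasks, task 0 takes items 0, 4, 8, ...,
    # task 1 takes items 1, 5, 9, ..., and so on.
    return params[task_id::num_tasks]


all_params = list(range(10))  # hypothetical parameter values
num_tasks = 4                 # the fixed number of tasks you pick

# Show which parameters each task would iterate over.
batches = [params_for_task(all_params, t, num_tasks) for t in range(num_tasks)]
for task_id, batch in enumerate(batches):
    print(f"task {task_id}: {batch}")
```

Every parameter lands in exactly one batch, so the scheduler only sees 4 tasks no matter how long the parameter list gets.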
I have some examples of this in a GitHub repo for a cluster at MIT. Take a look at this Python Example; from the quick look I did, Sage seems to use a lot of the same syntax as Python. You’ll want to look at both the Python script and the submission script. The trick is to first convert your code to one big for loop that iterates over the parameters you want to use. Then it’s a matter of adding about three lines to your code and using a submission script like the one I have in the repo.
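As a rough sketch of what the "one big for loop plus about three lines" pattern can look like in Python (this is not the code from the repo): it assumes the job is submitted as an array with zero-based indices, for example `--array=0-3`, and reads the task ID and task count from the `SLURM_ARRAY_TASK_ID` and `SLURM_ARRAY_TASK_COUNT` environment variables that Slurm sets for array jobs. The `run_one` function and the parameter list are placeholders.

```python
import os


def run_one(param):
    # Placeholder for the real per-parameter computation (hypothetical).
    return param * param


def main(environ=os.environ):
    params = list(range(1, 11))  # hypothetical parameter list

    # The "three lines": find out which task this is, how many tasks
    # there are, and pick this task's share of the parameters.
    # The defaults let the script run outside Slurm as a single task.
    task_id = int(environ.get("SLURM_ARRAY_TASK_ID", "0"))
    num_tasks = int(environ.get("SLURM_ARRAY_TASK_COUNT", "1"))
    my_params = params[task_id::num_tasks]

    # The one big for loop, now over this task's parameters only.
    results = {}
    for p in my_params:
        results[p] = run_one(p)
    return results


if __name__ == "__main__":
    print(main())
```

If your array indices do not start at 0, subtract the starting index before slicing, or pass the task ID and task count in as command-line arguments from the submission script instead.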
Happy to answer any questions.