Dortmund Usage
- Get a LIDO account for yourself. Look here and here
- Send a mail to Sebastian Krey to be added to our lido-users mailing list. Bernd and Michel sometimes post relevant information there. If you have questions, MAIL TO THIS LIST and not to Bernd or other individuals.
- Understand how software and modules work on LIDO. Here is the list of installed stuff.
Most useful software must be loaded via module commands.
- module avail
  lists all modules available to the user.
- module list
  lists all modules currently activated for the user.
- module add module1 [module2 ...]
  activates module1 and any further modules listed.
- module remove module1 [module2 ...]
  deactivates module1 and any further modules listed.
- module purge
  deactivates all activated modules.
- To work with the queuing system, you have to load the torque and maui modules via
  module add torque maui
  On the slaves, modules are loaded automatically via batchjobs_lido.tmpl (see below), so you don't need to do this there. I have these lines in my .bashrc. You probably want to have them as well.
-
case "$(hostname)" in
    lidong[12])
        # Only on hosts named lidong1 or lidong2: load the toolchain.
        module add python/2.7.2
        module add torque maui
        module add subversion
        module add git
        module add binutils
        module add gotoblas/shared/64/1.26
        module add gcc/4.8.5
        module add R/3.2.2-gcc48-base
        # Shortcut to list only your own jobs.
        alias myjobs='qstat -u $USER'
        TERM="xterm-256color"
        ;;
esac
- Log into the LIDO head node lidong1.itmc.tu-dortmund.de via SSH.
- Install BatchJobs and BatchExperiments from CRAN.
- Understand what queues exist on LIDO, what resources exist and so on by reading this wiki page.
- Read and understand the documentation header of /home/groups/stattmpl/batchjobs_lido.tmpl to learn what job resources are available and how they work:
  less /home/groups/stattmpl/batchjobs_lido.tmpl
- Read the configuration documentation. Then create a valid config file in your home directory at ~/.BatchJobs.R. Here is a template:
cluster.functions = makeClusterFunctionsTorque("/home/groups/stattmpl/batchjobs_lido.tmpl")
mail.start = "first+last"
mail.done = "first+last"
mail.error = "all"
mail.from = "<[email protected]>"
mail.to = "<[email protected]>"
mail.control = list(smtpServer="mail.statistik.tu-dortmund.de")
default.resources = list(
  R = "R-3.2.2-gcc-4.8.5-base",
  modules = "",
  walltime = 3600L,
  memory = 2048L,
  # parcpus is mapped to the Torque resource 'nodes'; better use this name,
  # so you don't have to change anything when you use our SLURM cluster
  parcpus = 1L
)
staged.queries = TRUE
debug = FALSE
If you want to use event emails, the sender address does not matter and does not need to exist, but your receiver address must of course be valid. I think you need a @statistik mail address; otherwise, figure out which SMTP server to use.
You should probably update the R version in the default resources whenever LIDO installs a new R version. It should probably also match the R version you use on the master node.
- DO NOT change the first line in the config template above, and DO NOT COPY batchjobs_lido.tmpl to your local home dir or create your own. It is very likely that you do not understand enough details of the system to do this properly. Copying it will also prevent you from getting updates from Bernd and Michel.
- Run a simple batchMap example. For the first try, you should probably set
  debug = TRUE
  in the config, so you can better understand errors. If everything works, set debug back to FALSE.
- On the bash console, these commands are useful:
  * qstat
    displays all jobs.
  * qstat -u $USER
    displays your jobs (or define the myjobs alias in your .bashrc).
  * kill_all_jobs
    kills ALL of your jobs. It is a bash script by Sebastian Krey in /home/groups/stattmpl/bin.
  * show-queues
    displays an alternative status overview of the queues and your jobs. It is not perfect but mostly gets the job done. It is a Python script by Bernd in /home/groups/stattmpl/bin.
  * show-active-users
    displays an alternative status overview of what users are currently doing. It is not perfect but mostly gets the job done. It is an R/shell script by Bernd in /home/groups/stattmpl/bin.
  * You can use the scripts in the bin directory of the group 'stattmpl' (statistic templates) by adding this line to your .bashrc:
    PATH=$PATH:/home/groups/stattmpl/bin
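A minimal sketch of a guarded variant of the PATH line above, assuming you would rather not append the directory again every time .bashrc is sourced (for example in nested shells). The function name path_append is hypothetical and not part of the group's scripts; the directory is the one named above.

```shell
# Hypothetical helper: append a directory to PATH only if it is not listed yet.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already on PATH, nothing to do
        *) PATH="$PATH:$1" ;;       # otherwise append at the end
    esac
}

path_append /home/groups/stattmpl/bin
export PATH
```

Sourcing this repeatedly leaves a single entry for the directory, whereas the plain assignment adds one more copy each time.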
- You have to install and manage R packages yourself.
If you ever need to update the Rmpi package, do this:
wget http://cran.r-project.org/src/contrib/Rmpi_0.6-5.tar.gz
module add openmpi/ge/gcc4.8.x/64/1.6.4
R CMD INSTALL Rmpi_0.6-5.tar.gz --configure-args=--with-mpi=/sysdata/shared/sfw/openmpi/gcc4.8.x/64/1.6.4
Of course, you need to adjust the names and paths in the last command. Look up the MPI module name in batchjobs_lido.tmpl.
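The three Rmpi commands above can be parameterised so that the values to adjust sit in one place. This is only a sketch: the variable names and the function rmpi_install_commands are made up for illustration, the values are copied from the example above and may be outdated, and the function only prints the commands instead of running them.

```shell
# Values copied from the example above; adjust all three for your setup.
RMPI_VERSION="0.6-5"
MPI_MODULE="openmpi/ge/gcc4.8.x/64/1.6.4"      # look this up in batchjobs_lido.tmpl
MPI_PREFIX="/sysdata/shared/sfw/openmpi/gcc4.8.x/64/1.6.4"

# Hypothetical helper: print the install commands (dry run, nothing is executed).
rmpi_install_commands() {
    tarball="Rmpi_${RMPI_VERSION}.tar.gz"
    echo "wget http://cran.r-project.org/src/contrib/${tarball}"
    echo "module add ${MPI_MODULE}"
    echo "R CMD INSTALL ${tarball} --configure-args=--with-mpi=${MPI_PREFIX}"
}

rmpi_install_commands
```

Once the printed commands look right, run them one by one in your login shell, so that the module command takes effect in the same shell that runs R CMD INSTALL.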
- Read the additional documentation provided by Sebastian.
SLURM cluster (shell.statistik.tu-dortmund.de)
- Send a mail to Sebastian Krey to be added to our lido-users mailing list and to get access to the cluster. Bernd and Michel sometimes post relevant information there. If you have questions, MAIL TO THIS LIST and not to Bernd individually.
- Log into shell.statistik.tu-dortmund.de via SSH.
- Get an interactive job for a few hours by typing:
  interactive
- Read and understand the documentation header of dortmund_fk_statistik.tmpl to learn what job resources are available and how they work:
  less /opt/R/BatchJobs/dortmund_fk_statistik.tmpl
- Read the configuration documentation. Then create a valid config file in your home directory at ~/.BatchJobs.R. Here is a template:
cluster.functions = makeClusterFunctionsSLURM("/opt/R/BatchJobs/dortmund_fk_statistik.tmpl")
mail.start = "first+last"
mail.done = "first+last"
mail.error = "all"
mail.from = "<me@shell>"
mail.to = "<[email protected]>"
mail.control = list(smtpServer="mail.statistik.tu-dortmund.de")
default.resources = list(
  walltime = 3600L,
  memory = 512L,
  # parcpus is mapped to the SLURM resource 'ntasks'; better use this name,
  # so you don't have to change anything when you use our LIDO cluster
  parcpus = 1L,
  ncpus = 1L
)
staged.queries = TRUE
max.concurrent.jobs = 450
debug = FALSE
If you want to use event emails, the sender address does not matter and does not need to exist, but your receiver address must of course be valid. You need a @statistik or Unimail address. Alternatively, figure out which SMTP server and login data to use for a different mail provider.
- DO NOT change the first line in the config template above, and DO NOT COPY dortmund_fk_statistik.tmpl to your local home dir or create your own. It is very likely that you do not understand enough details of the system to do this properly. Copying it will also prevent you from getting updates from Sebastian.
- Run a simple batchMap example. For the first try, you should probably set
  debug = TRUE
  in the config, so you can better understand errors. If everything works, set debug back to FALSE.
- On the bash console, these commands are useful:
  - squeue
    displays all jobs.
  - squeue -u $USER
    displays your jobs.
  - kill_all_jobs
    kills ALL of your jobs (except for the interactive ones).
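When many jobs are queued, the raw squeue listing gets long. The following sketch condenses it into one line per job state. It assumes that squeue on this cluster accepts the -h (no header) and -o (output format) options with the %T state field; the function name count_states is hypothetical.

```shell
# Hypothetical helper: read one job state per line on stdin and
# print "STATE COUNT" for each distinct state, sorted by state name.
count_states() {
    sort | uniq -c | awk '{ print $2, $1 }'
}

# Only query the scheduler where squeue is actually installed.
if command -v squeue >/dev/null 2>&1; then
    squeue -u "$USER" -h -o "%T" | count_states
fi
```

Keeping the counting in a separate function that reads stdin means it can be tried out with any list of state names, without access to the cluster.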