Rocky8: Conda‐version

Sander W. van der Laan edited this page Aug 6, 2024 · 8 revisions

These instructions are for Rocky8 on a high-performance computing (HPC) cluster, such as the HPC at the UMC Utrecht. An HPC is usually centrally managed, and users do not normally have administrator rights. It is therefore advisable to install your own distribution of conda. Another advantage is that a virtual environment within a conda installation preserves the versions of the installed packages.

Step 1: get conda

We require conda so that we have full control over the installation of the libraries and packages that slideToolKit needs. slideToolKit works with python 3.7+; we prefer python 3.8+.

Step 2A: update conda

If conda is already installed, just make sure it is up to date and move on to Step 3.

conda update -n base conda
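
Whether Step 2A or Step 2B applies depends on whether a conda command is already available in your shell. A minimal sketch of that check; the function name have_conda is hypothetical and only used for illustration.

```shell
# Hypothetical helper: succeeds if a 'conda' command is available in this shell.
have_conda() {
    command -v conda >/dev/null 2>&1
}

if have_conda; then
    echo "conda found ($(conda --version)); update it and continue with Step 3"
else
    echo "conda not found; continue with Step 2B"
fi
```

The check only looks at the current shell, so run it in a fresh login session on the HPC.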

Step 2B: install conda

Versions and installation

There are multiple flavors of conda: anaconda, miniconda, and mamba. anaconda is the best known but bloated; miniconda is a much lighter version that will work for most cases. mamba is a C++ re-implementation of conda that finds and updates packages much faster, yet has all the features of the 'regular' anaconda or miniconda. That said, it's really up to you. Whichever you choose, the process described below is the same.

Choose your Conda

  • anaconda, this should be Anaconda3-2021.05-Linux-x86_64.
  • miniconda, this should be Miniconda3-latest-Linux-x86_64.
  • mamba, this should be Mambaforge-Linux-x86_64.

We prefer mamba.

wget https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh

Next execute the following code.

bash Mambaforge-Linux-x86_64.sh

And follow the instructions.

When prompted for the installation location, we choose a location that everyone in the group can use.

/hpc/local/$MY_DISTRO/$MY_GROUP/software/mambaforge3
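
The location above is built from two placeholders. A small sketch of how the full path can be composed before you start the installer; the helper install_prefix and the default values Rocky8 and gendep are assumptions for illustration, taken from the examples used on this page.

```shell
# Assumed example values; replace them with your own server and group names.
MY_DISTRO="${MY_DISTRO:-Rocky8}"
MY_GROUP="${MY_GROUP:-gendep}"

# Hypothetical helper: print the shared install location for a distro and group.
install_prefix() {
    printf '/hpc/local/%s/%s/software/mambaforge3\n' "$1" "$2"
}

PREFIX="$(install_prefix "$MY_DISTRO" "$MY_GROUP")"
echo "Enter this location when the installer asks for it: $PREFIX"
```

If you prefer a run without prompts, the conda installer scripts also accept -b (batch mode, which accepts the license for you) together with -p "$PREFIX"; the interactive route described here is the one we use.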

Wrap-up installation

If you don't want the conda base environment to be activated on startup, you can execute the following command. We advise you to initialize conda and let it load during startup/login.

conda config --set auto_activate_base false

If all is well, conda will have added something to your .bashrc that looks like this:

# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/hpc/local/$MY_DISTRO/$MY_GROUP/software/mambaforge3/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/hpc/local/$MY_DISTRO/$MY_GROUP/software/mambaforge3/etc/profile.d/conda.sh" ]; then
        . "/hpc/local/$MY_DISTRO/$MY_GROUP/software/mambaforge3/etc/profile.d/conda.sh"
    else
        export PATH="/hpc/local/$MY_DISTRO/$MY_GROUP/software/mambaforge3/bin:$PATH"
    fi
fi
unset __conda_setup

if [ -f "/hpc/local/$MY_DISTRO/$MY_GROUP/software/mambaforge3/etc/profile.d/mamba.sh" ]; then
    . "/hpc/local/$MY_DISTRO/$MY_GROUP/software/mambaforge3/etc/profile.d/mamba.sh"
fi
# <<< conda initialize <<<

In this code $MY_DISTRO is the name of your server, for example Rocky8 or CentOS7. $MY_GROUP is the name of your group on the server, for example gendep, or depcardio.
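
To see whether the block above was actually written, you can look for its first marker line. A minimal sketch; the function name has_conda_init is hypothetical.

```shell
# Hypothetical helper: succeeds if file $1 contains the opening marker
# line of the block that 'conda init' manages.
has_conda_init() {
    grep -q '^# >>> conda initialize >>>' "$1" 2>/dev/null
}

if has_conda_init "$HOME/.bashrc"; then
    echo "conda init block found in ~/.bashrc"
else
    echo "no conda init block in ~/.bashrc; 'conda init bash' should add one"
fi
```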

Don't forget to clean up afterwards.

rm -v Mambaforge-Linux-x86_64.sh

Restart your shell and check the installation

Restart your shell, or reload its configuration in the current session:

source $HOME/.bashrc
source $HOME/.bash_profile

You should now have your fresh conda installation in your PATH at startup. Check this by asking which pip is used.

which pip

This will return:

/hpc/local/$MY_DISTRO/$MY_GROUP/software/mambaforge3/bin/pip
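
The same check can be scripted by testing whether the pip that your shell finds lies inside the conda installation. A sketch under the assumption that conda was installed in the location used above; path_is_under is a hypothetical helper, and Rocky8 and gendep are example values.

```shell
# Hypothetical helper: succeeds if path $1 lies inside directory $2.
path_is_under() {
    case "$1" in
        "${2%/}"/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumed example values for the placeholders used on this page.
CONDA_ROOT="/hpc/local/${MY_DISTRO:-Rocky8}/${MY_GROUP:-gendep}/software/mambaforge3"
PIP_PATH="$(command -v pip || true)"

if path_is_under "$PIP_PATH" "$CONDA_ROOT"; then
    echo "pip comes from the conda installation: $PIP_PATH"
else
    echo "pip resolves to '${PIP_PATH:-nothing}', not to $CONDA_ROOT"
fi
```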

Step 3: install slideToolKit

Download and install the latest version of the slideToolKit from GitHub. First create and go to the git directory, then download the slideToolKit.

/hpc/local/$MY_DISTRO/$MY_GROUP/software: this is the folder where you are supposed to install software on your system.

mkdir -p /hpc/local/$MY_DISTRO/$MY_GROUP/software/ && cd /hpc/local/$MY_DISTRO/$MY_GROUP/software
if [ -d /hpc/local/$MY_DISTRO/$MY_GROUP/software/slideToolKit/.git ]; then \
		cd /hpc/local/$MY_DISTRO/$MY_GROUP/software/slideToolKit && git pull; \
	else \
		cd /hpc/local/$MY_DISTRO/$MY_GROUP/software/ && git clone https://github.com/swvanderlaan/slideToolKit.git; \
	fi

Add symbolic links in ~/bin/ so that the slideToolKit tools are available in your PATH. This makes the slideToolKit commands easier to access.

mkdir -p ~/bin/ && ln -s -f -v /hpc/local/$MY_DISTRO/$MY_GROUP/software/slideToolKit/slide* ~/bin/
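
After linking, it can be useful to confirm that the links resolve, for instance after the slideToolKit folder has been moved. A minimal sketch; check_links is a hypothetical helper that only inspects symbolic links whose names start with slide.

```shell
# Hypothetical helper: for every symbolic link named slide* in directory $1,
# report whether its target exists.
check_links() {
    local f
    for f in "$1"/slide*; do
        [ -L "$f" ] || continue
        if [ -e "$f" ]; then
            echo "ok      $f"
        else
            echo "broken  $f"
        fi
    done
}

check_links "$HOME/bin"
```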

Step 4: install a virtual environment

Next you should create a virtual environment within which we will install the required packages for slideToolKit and CellProfiler. There are two approaches.

Using a yml-file

We created several yml-files, each of which sets up a virtual environment and installs the required packages through conda and pip3. You can find the yml-files here: [PATHTO]/slideToolKit/conda_yml/.

Next enter the following.

mamba env create -f [PATHTO]/slideToolKit/conda_yml/mamba3_8_cp4.v1.yml
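
An environment file normally carries the name of the environment it creates in a top-level name: key. A sketch for reading that name before or after creating the environment, assuming the yml-file follows this usual layout; env_name_from_yml is a hypothetical helper.

```shell
# Hypothetical helper: print the value of the top-level 'name:' key
# of a conda environment file.
env_name_from_yml() {
    sed -n 's/^name:[[:space:]]*//p' "$1" | head -n 1
}

# [PATHTO] is the placeholder from the text above; replace it with the real location.
YML="[PATHTO]/slideToolKit/conda_yml/mamba3_8_cp4.v1.yml"
if [ -f "$YML" ]; then
    echo "This file creates the environment: $(env_name_from_yml "$YML")"
fi
```

The printed name is the one to use with mamba activate.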

Do it yourself

Alternatively you can do this yourself. First, you'll create an environment.

mamba create --name cp4 python=3.8

Next, you activate this environment - you 'step' inside of it as it were.

mamba activate cp4

Now, you can install everything in one go.

mamba install --channel bioconda pip numpy matplotlib pandas openjdk scikit-learn mahotas gtk2 Jinja2 inflect wxpython mysqlclient sentry-sdk centrosome gensim FuzzyTM xarray python-javabridge bftools cairo freetype gettext giflib imagemagick java-jdk jpeg wmctrl zbar tclap openslide-python && pip install cellprofiler arrow pathlib opencv-contrib-python 

The --channel bioconda flag indicates that some libraries/packages should be looked for in bioconda.
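
Once the installation has finished, a quick import check shows whether the key packages are usable from within the activated environment. A minimal sketch; check_imports is a hypothetical helper, and the module names cellprofiler, openslide and cv2 are the import names we assume for cellprofiler, openslide-python and opencv-contrib-python.

```shell
# Python interpreter to test; inside an activated environment this is 'python'.
PYTHON="${PYTHON:-python}"

# Hypothetical helper: try to import each module given as an argument and
# return non-zero if at least one import fails.
check_imports() {
    local mod
    local missing=0
    for mod in "$@"; do
        if "$PYTHON" -c "import $mod" >/dev/null 2>&1; then
            echo "ok       $mod"
        else
            echo "MISSING  $mod"
            missing=1
        fi
    done
    return "$missing"
}

check_imports cellprofiler openslide cv2 || echo "some packages are missing in this environment"
```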

Controlling the virtual environment

Activating

Activating (or switching between) your virtual environment(s) is easy.

mamba activate cp4

This modifies the PATH and shell variables to point your shell to the specific python set-up you (just) installed. You'll note that the command prompt now indicates which conda environment you are currently in by prepending (cp4).

Listing available environments

You can also list the available environments. I tend to forget what I installed, so it comes in handy for me 🙈.

mamba env list

This results in the following for example:

# conda environments:
#
base                  *  /hpc/local/$MY_DISTRO/$MY_GROUP/software/mambaforge3/
cp4                      /hpc/local/$MY_DISTRO/$MY_GROUP/software/mambaforge3/envs/cp4
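
In a script you may want to test for an environment instead of reading the list by eye. A sketch that parses a listing in the format shown above; env_in_list is a hypothetical helper that reads the listing from standard input.

```shell
# Hypothetical helper: succeeds if the environment named $1 occurs in the
# first column of an 'env list' listing read from standard input.
env_in_list() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

if command -v mamba >/dev/null 2>&1; then
    if mamba env list | env_in_list cp4; then
        echo "environment cp4 exists"
    else
        echo "environment cp4 not found; create it as described in Step 4"
    fi
fi
```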

Installing additional python packages

The big advantage of using conda is that you can create a virtual environment that contains a specific set-up of python packages for a specific purpose. Just select your virtual environment -n cp4 and the [package] you wish to install.

mamba install -n cp4 [package]

Don't forget to specify the virtual environment, because otherwise the package will be installed in whichever environment is currently active, which may be the root (base) python installation. You can use this to install any required packages that turn out to be missing.
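
To confirm that a package ended up in the intended environment, you can look it up in the package listing of that environment. A sketch assuming the usual column layout of the listing (name first, version second); pkg_version is a hypothetical helper, and numpy is just an example package.

```shell
# Hypothetical helper: print the version (second column) of package $1
# from a package listing read from standard input; prints nothing if absent.
pkg_version() {
    awk -v pkg="$1" '$1 == pkg { print $2; exit }'
}

if command -v mamba >/dev/null 2>&1; then
    version="$(mamba list -n cp4 2>/dev/null | pkg_version numpy)"
    echo "numpy in cp4: ${version:-not installed}"
fi
```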

Deactivating/exiting the virtual environment

Ending a virtual environment session, i.e. deactivating or exiting it, is easy. This resets the PATH and shell variables to the base settings of your shell.

mamba deactivate

Deleting a virtual environment

You may want to delete a specific conda environment. You can do this by entering the following.

mamba remove -n cp4 --all

Other useful commands

List all the conda environments available:

mamba info --envs

Create a new environment named envname.

mamba create --name envname

Remove environment and its dependencies.

mamba remove --name envname --all

Clone an existing environment.

mamba create --name clone_envname --clone envname

Wrapping up the virtual environment installation

Restart your shell, or reload its configuration in the current session:

source $HOME/.bashrc
source $HOME/.bash_profile

Step 5: test the environment

You should now be able to run the following script without problems. Make sure the virtual environment from Step 4 is activated first (for example, mamba activate cp4).

cd /hpc/local/$MY_DISTRO/$MY_GROUP/software/slideToolKit
python slideToolKitTest.py

This will calculate your age, just for fun. Most importantly, check whether you have the right versions of openslide, opencv and CellProfiler installed. These should be the following.

Printing the installed versions.
* Python version:  3.8.12 | packaged by conda-forge | (default, Oct 12 2021, 21:50:38)
[Clang 11.1.0 ]
* OpenSlide version:  1.1.2
* OpenSlide library version:  3.4.1
* CellProfiler version:  4.2.6
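
If you save the output of the test script to a file, the reported versions can be compared automatically. A minimal sketch; reported_version is a hypothetical helper, and the file name slideToolKitTest.log is an assumption (the test script does not create it, you would have to save the output yourself).

```shell
# Hypothetical helper: from output in the format shown above (read from
# standard input), print the value reported for the label given as $1.
reported_version() {
    sed -n "s/^\* $1:[[:space:]]*//p" | head -n 1
}

# Assumed file holding the saved output of slideToolKitTest.py.
LOG="slideToolKitTest.log"
if [ -f "$LOG" ]; then
    found="$(reported_version "CellProfiler version" < "$LOG")"
    if [ "$found" = "4.2.6" ]; then
        echo "CellProfiler version matches: $found"
    else
        echo "CellProfiler version is '$found', expected 4.2.6"
    fi
fi
```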

Inspired by

https://www.pyimagesearch.com/2019/01/30/macos-mojave-install-tensorflow-and-keras-for-deep-learning/
https://medium.com/swlh/how-to-setup-your-python-projects-1eb5108086b1
https://towardsdatascience.com/how-to-successfully-install-anaconda-on-a-mac-and-actually-get-it-to-work-53ce18025f97
https://gist.github.com/rxaviers/7360908
https://github.com/CellProfiler/CellProfiler/wiki/Conda-Installation
