Python use on HPC clusters

IMPORTANT INFO ABOUT ANACONDA on HPC:

 

What happened to the Anaconda3 software modules on HPC systems?

As of Feb. 1st 2025, RC/HPC no longer supplies Anaconda3 modules on the HPC clusters. You can still use your Conda-style environments through the drop-in replacement "miniforge3", which is installed on all HPC clusters and has its own software module.

Why?

The latest Terms of Service (TOS) from Anaconda, dated March 2024, state that academic organizations with more than 200 employees must have a license to use Anaconda for non-curriculum-based activity. In September of last year, Anaconda posted an update regarding use in academic settings for non-curriculum activities and stated that the license requirement is meant mainly for commercial users. However, there is still some uncertainty about what will require a license moving forward. Because of this uncertainty, we feel it is important to drop Anaconda support and move all users over to miniforge3 for conda environments.

Will I need to change any of my conda environments?  

Your conda environments will still work the same with miniforge3. However, since the new Terms of Service also affect Anaconda's non-free "defaults" and "anaconda" channels, we suggest re-creating conda environments using conda-forge or other free channels (like bioconda) if you are concerned about intellectual property or eventual commercialization of your research. Those who use conda environments for educational purposes in an accredited class can still use the non-free channels. Miniforge3 uses the conda-forge channel by default.

NOTE

This guide assumes that you understand how to use software environment modules and that you know which shell you are using. You can use the command echo $0 to check which shell you use (tcsh or bash), e.g.:

>> echo $0
-tcsh

This shows that I use a tcsh shell for my environment (the majority of HPC users get tcsh).

Python environments  

Python is an extremely popular programming language that is ubiquitous in research computing. Depending on your use case, you may only need to run a simple Python command, or you may need to build a complex Python environment with many separate, interdependent Python modules.

Miniforge3 is a software package that includes the conda and mamba Python installation tools. Miniforge3 is a drop-in replacement for Anaconda3, which is no longer supplied as a module on the HPC clusters.

To use Python on the HPC clusters, you have two choices:

  1. Use the Miniforge3 tools like conda or mamba to prepare a conda environment that you will use for calculations.
  2. Use Python virtual environments and forgo conda style environments.

Either approach will work on the HPC clusters.

Setting up a Conda Python environment 

To create a conda environment, log into an HPC front-end, such as bora. For tcsh users, conda can be added to your environment by loading the appropriate environment module:

module load miniforge3

FOR BASH users:

Unfortunately, conda does not activate properly through the module system within the bash environment. Therefore, after loading the miniforge3 environment module, one additional step is needed. You will need to enter:

eval "$(conda shell.bash hook)"

on the command line before you activate a conda environment.

NOTE: Unlike the conda documentation, we don't recommend running 

conda init <shell> 

on the HPC cluster since this can disturb your usual shell environment.

Once conda is properly loaded, you can use:

conda create -n <env name>

to create your environment.

The new environment can be activated using 

conda activate <env name>

Once in the environment, you can use conda install to install packages (see the conda documentation), or install pip and use pip to install packages (see the NOTE below).

To deactivate your current environment do:

conda deactivate

This will return you to your usual shell environment.

NOTE on using pip in conda environments: once inside a conda environment, you can use:

conda install pip 

to install the pip package manager into your conda environment. We recommend using:

python -m pip install <python module name>

instead of the usual

pip install <python module name>

since the former runs the pip that belongs to the python in your active conda environment, which keeps all pip-installed modules in your conda environment tree.
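To confirm where packages will land, you can ask the interpreter itself. The following is a minimal sketch, not an RC/HPC-provided tool; the function name describe_install_target is hypothetical. It reports which interpreter is running and the site-packages directory that python -m pip install would normally write to for that interpreter (assuming no options such as --user or --target are given).

```python
import sys
import sysconfig


def describe_install_target():
    """Return the running interpreter and its default site-packages directory."""
    return {
        "interpreter": sys.executable,
        "prefix": sys.prefix,
        # "purelib" is the default location for pure-Python packages.
        "site_packages": sysconfig.get_paths()["purelib"],
    }


if __name__ == "__main__":
    for key, value in describe_install_target().items():
        print(f"{key}: {value}")
```

If the printed paths sit inside your conda environment directory, pip installs made with python -m pip should stay inside that environment.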

Using Conda environments in your batch jobs 

For interactive batch jobs, the procedure is the same as above when working on a cluster front-end. 

For batch jobs that require a batch script, the way to invoke the Python environment depends on your default shell and on the shell used in the script.

For users with a tcsh default shell and using a tcsh based batch script (most users):

#!/bin/tcsh
#SBATCH --job-name=pythontest
#SBATCH -N 1 --ntasks-per-node 8
#SBATCH -t 1-0

module load miniforge3
conda activate <myenvironment>

python commands...

For users with a bash default shell and using a bash based batch script:

#!/bin/bash
#SBATCH --job-name=pythontest
#SBATCH -N 1 --ntasks-per-node 8
#SBATCH -t 1-0

module load miniforge3
eval "$(conda shell.bash hook)"
conda activate <myenvironment>

python commands...

 

For users with a tcsh default shell and using bash in their batch scripts: 

#!/bin/bash
#SBATCH --job-name=pythontest
#SBATCH -N 1 --ntasks-per-node 8
#SBATCH -t 1-0

source /usr/local/etc/sciclone.bashrc
module load miniforge3
eval "$(conda shell.bash hook)"
conda activate <myenvironment>

python commands...
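As a sanity check, the "python commands..." placeholder in any of the batch scripts above could begin by recording which environment the job actually picked up. The following is a sketch under stated assumptions: the function name environment_report is hypothetical, CONDA_DEFAULT_ENV and CONDA_PREFIX are variables that conda activate sets, and SLURM_JOB_ID is set by Slurm inside a job. Any of them may be absent, in which case the report shows "(not set)".

```python
import os
import sys

# Variables we expect conda and Slurm to set; absence is reported, not an error.
VARIABLES = ("CONDA_DEFAULT_ENV", "CONDA_PREFIX", "SLURM_JOB_ID")


def environment_report(environ=None):
    """Build a list of lines describing the interpreter and environment."""
    environ = os.environ if environ is None else environ
    lines = [f"python executable: {sys.executable}"]
    for name in VARIABLES:
        lines.append(f"{name}: {environ.get(name, '(not set)')}")
    return lines


if __name__ == "__main__":
    print("\n".join(environment_report()))
```

Saved to a file and run as the first python command of the job, this leaves a short record in the job output that shows whether the intended conda environment was active.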

 

Setting up a Python virtual environment:  venv

The alternative to Conda-style Python environments is to use Python's virtual environment (venv) approach. There is an environment module on all HPC clusters named python/3.12.7 that can be used for this. Note that all Linux systems come with a Python installed for system use. Please use the python/3.12.7 module rather than the system Python, since changes to the OS occur often and could break your workflow. To load the python/3.12.7 module:

module load python/3.12.7

Next, create a new virtual environment:

python -m venv testenv

Here, testenv is the name of the venv; Python will create a new directory with this name in your current working directory.
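To see what that directory contains, the same step can be reproduced with the standard-library venv module. This is only an illustrative sketch: the helper name create_demo_venv is hypothetical, the venv is built in a temporary directory so nothing is left behind, and with_pip=False is used to keep the demo fast (the python -m venv command installs pip by default).

```python
import tempfile
import venv
from pathlib import Path


def create_demo_venv(parent_dir, name="testenv"):
    """Create a venv called `name` under `parent_dir` and return its path."""
    env_dir = Path(parent_dir) / name
    # with_pip=False skips bootstrapping pip, unlike `python -m venv`.
    venv.create(env_dir, with_pip=False)
    return env_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = create_demo_venv(tmp)
        # List the top-level entries of the new environment directory.
        for entry in sorted(p.name for p in env_dir.iterdir()):
            print(entry)
```

The listing includes a pyvenv.cfg file, which marks the directory as a virtual environment, along with the directory that holds the activate scripts.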

To activate the new venv, enter:

source testenv/bin/activate.csh

for tcsh, or 

source testenv/bin/activate

for bash.

Once the Python venv is activated, the venv name is prepended to your prompt. From here, you can pip install any Python packages you need. You can activate this venv again whenever it is needed.
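Besides the prompt, you can check from within Python whether a venv is active. The snippet below is a minimal sketch with a hypothetical function name, in_virtual_environment. It relies on the fact that inside a venv sys.prefix points at the venv directory while sys.base_prefix points at the Python installation the venv was created from. Note that this check is specific to venv; a conda environment is not a venv, so it is expected to report False there.

```python
import sys


def in_virtual_environment():
    """True when the interpreter runs inside a venv (prefix differs from base)."""
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("inside venv:", in_virtual_environment())
    print("environment prefix:", sys.prefix)
```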

To deactivate a Python virtual environment, enter:

deactivate

This deactivates the venv and restores your previous shell environment.

Using Python venv in your batch jobs 

For interactive batch jobs, the procedure is the same as above when working on a cluster front-end. 

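For batch jobs that require a batch script, the procedure is the same as on the front-end: load the module, then source the activate script that matches the shell of your batch script (activate.csh for tcsh, activate for bash). For example, with a tcsh batch script: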

#!/bin/tcsh
#SBATCH --job-name=pythontest
#SBATCH -N 1 --ntasks-per-node 8
#SBATCH -t 1-0

module load python/3.12.7
source testenv/bin/activate.csh

python commands...