Python use on HPC clusters
NOTE
This page assumes that you understand software environment modules and know which shell you are using. You can use the command echo $0 to check which shell you use (tcsh or bash), e.g.:
>> echo $0
-tcsh
This shows that I use a tcsh shell for my environment (the vast majority of HPC users get tcsh).
Introduction: Python and Anaconda
Python is an extremely popular programming language that is ubiquitous in research computing. Depending on your use case, you may need to just run a simple Python command, or you may need to build a complex Python environment with many separate, interdependent Python modules.
Anaconda is a distribution of Python that comes with its own package manager, conda. The package manager can be used to obtain Python modules from multiple sources. Like standard Python, conda can also create environments for independent sets of Python modules. We recommend Anaconda for users who are getting started with Python or are not already familiar with Python virtual environments.
Setting up your environment
Before you begin computing on HPC, you must choose which subcluster you wish to use. For the purposes of this documentation, we will use the bora/hima subclusters (the bora and hima subclusters share the same front-end, namely bora).
To create an Anaconda environment for bora and/or hima, log into the bora front-end. For tcsh users, Anaconda can be added to your environment by loading an appropriate environment module:
module load anaconda3/2023.09
For bash users:
Unfortunately, Anaconda does not activate properly using the module system within the bash environment. Therefore, after loading the anaconda environment module, one additional step is needed. You will need to enter:
eval "$(conda shell.bash hook)"
on the command line before you activate an Anaconda environment.
NOTE: Unlike the Anaconda documentation, we don't recommend running
on the HPC cluster since this can disturb your regular shell environment.
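For bash users, the two setup steps above can be wrapped in a small shell function so they are always run together and in the right order. This is only a sketch: the function name load_conda is hypothetical, and it assumes the anaconda3/2023.09 module named above is available on the front-end you are logged into.

```shell
# Hypothetical helper for bash users: make "conda activate" usable in this shell.
load_conda() {
    # Step 1: load the Anaconda environment module (stop if that fails).
    module load anaconda3/2023.09 || return 1
    # Step 2: enable conda's shell integration for the current bash session only.
    eval "$(conda shell.bash hook)"
}
```

After defining the function in a bash session, running load_conda once should leave the shell ready for conda activate <env name>. tcsh users do not need it, since the module load command alone is enough for them.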
Creating the Anaconda environment
Once Anaconda is properly loaded, you can use:
conda create -n <env name>
to create your environment.
The new environment can be activated using:
conda activate <env name>
Once in the environment, you can use conda install to install packages (see the conda documentation), or install pip and use pip to install packages (see the NOTE below).
To deactivate your current environment, run:
conda deactivate
This will return you to your usual shell environment.
NOTE on using pip in Anaconda environments. Once inside an Anaconda environment, you can use:
conda install pip
to install the pip package manager into your Anaconda environment. We recommend using:
python -m pip install <python module name>
instead of the usual
pip install <python module name>
since the former runs pip with the Python interpreter of the active environment, which keeps all pip-installed modules in your Anaconda environment tree.
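The recommendation above only helps if the python command really is the one inside the active environment. A freshly created environment may not contain its own Python until a package such as pip has been installed into it. The sketch below is one way to check this from bash before installing anything. The function name python_in_env is hypothetical; CONDA_PREFIX is the variable that conda activate sets to the root directory of the active environment.

```shell
# Hypothetical helper: succeed only if "python" resolves inside the active conda environment.
python_in_env() {
    # CONDA_PREFIX is empty when no environment has been activated.
    if [ -z "$CONDA_PREFIX" ]; then
        echo "no conda environment is active" >&2
        return 1
    fi
    # Compare the python found first on PATH with the one in the environment tree.
    [ "$(command -v python)" = "$CONDA_PREFIX/bin/python" ]
}
```

For example, python_in_env && python -m pip install <python module name> would only run the install when the check passes.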
Using Python/Anaconda in your batch jobs
For interactive batch jobs, the procedure is the same as described above for working on a cluster front-end.
For batch jobs, put the module load and conda activate commands in your batch script before the line that runs your Python script.
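As an illustration of that ordering, the commands below write a minimal tcsh batch script to a file. This is a sketch, not a complete job script: the file name run_python.csh, the environment name myenv, and the script name my_script.py are all placeholders, and the scheduler directives your cluster requires are left out because this page does not cover them.

```shell
# Write a hypothetical tcsh batch script named run_python.csh.
cat > run_python.csh << 'EOF'
#!/bin/tcsh
# Scheduler directives for your cluster would go here (not covered on this page).

# Make conda available, then activate the environment (placeholder name).
module load anaconda3/2023.09
conda activate myenv

# Run the Python script (placeholder file name).
python my_script.py
EOF
```

A bash version of the script would start with #!/bin/bash and add the line eval "$(conda shell.bash hook)" between the module load and conda activate lines, as described for bash users above.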