Supported HPC systems
AI4HPC currently supports 6 HPC systems:
To define a new HPC system, add its configuration to the HPC_ini() class in setup.py and add a corresponding setup_<system_name>.sh script to the ./Scripts/Setup folder. Examples are given for JUWELS, JURECA, DEEP-EST, LUMI and CTE-AMD.
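As a minimal sketch of the naming convention above, the helper below builds the path where a system's setup script is expected and reports whether it exists. The function names `setup_script_path` and `has_setup_script` and the `AI4HPC_ROOT` variable are assumptions made for illustration; they are not part of AI4HPC.

```shell
# Hypothetical helper (not part of AI4HPC): print the path where the setup
# script for a given system is expected, following setup_<system_name>.sh.
setup_script_path() {
    # AI4HPC_ROOT is an assumed variable pointing at the repository root;
    # it falls back to the current directory when unset.
    printf '%s/Scripts/Setup/setup_%s.sh\n' "${AI4HPC_ROOT:-.}" "$1"
}

# Succeed only if the expected setup script is present as a regular file.
has_setup_script() {
    [ -f "$(setup_script_path "$1")" ]
}
```

For a hypothetical system name `mysystem`, `has_setup_script mysystem || echo "missing: $(setup_script_path mysystem)"` would flag a missing script before the installation is attempted.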
CTE-AMD preinstallation
Because CTE-AMD does not allow incoming or outgoing network connections, a workaround is to use a local system and mount BSC’s dedicated Data Transfer Machine (dt01.bsc.es or dt02.bsc.es). With this method, the compilation of AI4HPC requires first running the following commands on a local machine:
$ wget https://gitlab.jsc.fz-juelich.de/CoE-RAISE/FZJ/ai4hpc/ai4hpc/Scripts/CTE-AMD/{getDeps_amdlogin.sh,reqs_dep.txt,reqs_ind.txt}
$ sh getDeps_amdlogin.sh
and type in your CTE-AMD username when prompted. The script transfers AI4HPC and its dependencies to the user’s directory on CTE-AMD, in folders named ai4hpc and wheels_ai4hpc, respectively. From there, the user can move the ai4hpc folder to their working directory and continue with the standard installation steps.
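Before moving anything, it can help to confirm that the transfer produced both folders. The following is a minimal sketch of such a check, to be run on CTE-AMD; the function name `check_ai4hpc_transfer` is hypothetical, and defaulting to `$HOME` assumes that the "user's directory" is the home directory.

```shell
# Hypothetical post-transfer check (not part of AI4HPC): verify that both
# folders created by getDeps_amdlogin.sh exist under the given directory.
check_ai4hpc_transfer() {
    base=${1:-$HOME}   # assumed default: the user's home directory
    for dir in ai4hpc wheels_ai4hpc; do
        if [ ! -d "$base/$dir" ]; then
            echo "missing: $base/$dir" >&2
            return 1
        fi
    done
    echo "found ai4hpc and wheels_ai4hpc in $base"
}
```

Calling `check_ai4hpc_transfer` with no argument checks the home directory; pass another path if the folders were transferred elsewhere.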
Note that the current stage of CTE-AMD does not include the necessary Python version. To install it, please use the ./Scripts/CTE-AMD/installPython_amdlogin.sh script.
DeepSpeed installation in Leonardo
Since the login nodes on Leonardo do not feature GPUs and DeepSpeed requires the CUDA runtime, DeepSpeed has to be installed on one of the GPU compute nodes. The installation steps are provided in ./Scripts/Leonardo/.
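To avoid starting the installation on a login node by mistake, a guard like the one sketched below can be run first. It is only an illustration: the function names are hypothetical, and treating the presence of `nvidia-smi` on `PATH` as a sign of a GPU node is an assumption. The scripts in ./Scripts/Leonardo/ remain the reference for the actual installation.

```shell
# Hypothetical pre-install check (not part of AI4HPC or DeepSpeed):
# succeed only if a command with the given name is found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Report whether the current node appears to have GPU tooling.
# Assumption: nvidia-smi being available indicates a GPU compute node;
# a different command name can be passed as the first argument.
node_has_gpu_tools() {
    if have_cmd "${1:-nvidia-smi}"; then
        echo "GPU tooling found"
    else
        echo "no GPU tooling found; run on a GPU compute node" >&2
        return 1
    fi
}
```

Running `node_has_gpu_tools || exit 1` at the top of an installation session stops early when the node lacks the assumed tooling.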