Goal: compile definition files of containers for AI use cases. Also provide documentation on how to use them on our HPC systems.
We use Apptainer to build/run containers on our HPC systems. You will need a Linux system to run Apptainer natively on your machine, and it’s easiest to install if you have root access.
It is also easy to use or convert Docker images with Apptainer.
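As a minimal sketch of the Docker route: Apptainer accepts `docker://` URIs, so an image from Docker Hub can be pulled and converted to SIF in one step. The image `python:3.11-slim` and the output name `python_3.11.sif` are illustrative assumptions, not something this repository prescribes.

```shell
# Illustrative values (assumptions): any Docker Hub image works the same way.
IMAGE_URI="docker://python:3.11-slim"
SIF_FILE="python_3.11.sif"

if command -v apptainer >/dev/null 2>&1; then
    # Pull the Docker image and convert it to a SIF file in one step.
    apptainer pull "$SIF_FILE" "$IMAGE_URI"
    # Run a command inside the converted image.
    apptainer exec "$SIF_FILE" python --version
else
    echo "apptainer not found; skipping pull of $IMAGE_URI"
fi
```

The same `docker://` URI can also be passed to `apptainer build` as the build source if you prefer that command.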
For a nice introduction to Apptainer on our HPC systems, have a look at the awesome presentation by Michele. You can also browse our documentation.
Containers are built from a definition file with the apptainer build command.
In each folder of this repo you will find a definition file (.def) and a README.md that describes the exact build command.
A nice workflow for developing a Python library locally and deploying it on our HPC systems (with exactly the same environment) is to use the sandbox feature of Apptainer.
We are still investigating whether something similar is possible with Docker (please let us know if you find a way :) ).
In the root directory of your library (repository), create a definition file (*.def).
This definition file should describe the environment in which you want to develop and use your library.
You can leverage base environments, such as Docker images on Docker Hub or existing Apptainer images.
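A minimal sketch of such a definition file, written from the shell so it can be copied as one step. The base image `python:3.11-slim` and the `numpy` dependency are placeholders chosen for illustration; replace them with whatever your library actually needs.

```shell
# Write a small example definition file (hypothetical content, adapt to your project).
cat > my_container.def <<'EOF'
Bootstrap: docker
From: python:3.11-slim

%post
    # Install the dependencies your library needs (example package only).
    python -m pip install --no-cache-dir numpy

%environment
    export LC_ALL=C

%runscript
    exec python "$@"
EOF

echo "wrote my_container.def"
```

`Bootstrap: docker` together with `From:` selects a Docker Hub image as the base, `%post` runs at build time, and `%environment` is applied each time the container starts.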
Build the sandbox (container in a directory) instead of the default SIF format:
apptainer build --fakeroot --sandbox my_container my_container.def
Now add the library you are developing to the sandbox environment and install it in editable mode:
apptainer exec --writable my_container python -m pip install -e .
You should be able to point the interpreter of your IDE (VS Code, PyCharm, etc.) to the Python executable inside the sandbox folder.
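To locate that executable, a small helper like the one below can list candidates. The function name `find_sandbox_python` is made up for this sketch, and the exact path depends on the base image; for images derived from the official Python Docker image it is typically `usr/local/bin/python` inside the sandbox.

```shell
# List candidate Python executables inside a sandbox directory.
# Files and symlinks are both matched because python is often a symlink to python3.x.
find_sandbox_python() {
    find "$1" \( -type f -o -type l \) -path '*/bin/python*' 2>/dev/null | sort
}

# "my_container" is the sandbox directory built above.
if [ -d my_container ]; then
    find_sandbox_python my_container
fi
```

Use the absolute form of one of the printed paths in the interpreter settings of your IDE.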
While in principle you could build a SIF container directly from your sandbox, it is better to modify your definition file (*.def) so that it includes your library/package.
This way, your container is fully reproducible from the definition file alone.
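One way to do this, sketched under assumptions: copy the package sources into the image with a `%files` section and install them in `%post`. The `src/` layout with a `pyproject.toml`, the target path `/opt/my_library`, and the base image are hypothetical; adjust them to your repository.

```shell
# Write a definition file that bundles the library itself (hypothetical layout).
# %files copies paths from the host (relative to where the build is started) into the image;
# %post then performs a regular, non-editable install so the SIF is self-contained.
cat > my_container.def <<'EOF'
Bootstrap: docker
From: python:3.11-slim

%files
    pyproject.toml /opt/my_library/pyproject.toml
    src /opt/my_library/src

%post
    python -m pip install --no-cache-dir /opt/my_library
EOF

echo "wrote my_container.def"
```

Listing individual paths in `%files` rather than the whole directory keeps the sandbox folder and any existing SIF files out of the image.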
Once you have built the SIF container with the following command, you can copy it to our HPC systems and use it there:
apptainer build --fakeroot my_container.sif my_container.def
TODO:
- how to run the containers on our SLURM cluster
- mention important flags, such as --nv
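
As a starting point for the items above, a generic batch job could look like the following sketch. It is not specific to our cluster: the job name, the GPU request, and the time limit are placeholder assumptions, and partitions or accounts are left out on purpose. The `--nv` flag makes the host's NVIDIA driver and GPUs available inside the container.

```shell
# Write a generic batch script; resource values are placeholders, not site defaults.
cat > run_container.sbatch <<'EOF'
#!/bin/bash
#SBATCH --job-name=container-test
#SBATCH --gres=gpu:1
#SBATCH --time=00:10:00

# --nv exposes the host's NVIDIA driver and GPUs inside the container.
apptainer exec --nv my_container.sif python -c "print('hello from the container')"
EOF

# Submit only where SLURM is available.
if command -v sbatch >/dev/null 2>&1; then
    sbatch run_container.sbatch
else
    echo "sbatch not found; wrote run_container.sbatch only"
fi
```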