feat: Add Dockerfile and GitHub Actions based CI #12

Draft
wants to merge 3 commits into
base: main
1 change: 1 addition & 0 deletions .dockerignore
@@ -0,0 +1 @@
.github
139 changes: 139 additions & 0 deletions .github/workflows/docker-debian.yml
@@ -0,0 +1,139 @@
name: Docker Images Debian

on:
  push:
    branches:
      - main
    tags:
      - v*
  pull_request:
    branches:
      - main
  schedule:
    - cron: '23 1 * * 0'
  release:
    types: [published]
  workflow_dispatch:

jobs:
  docker:
    name: Build and publish Debian images
    runs-on: ubuntu-latest

    steps:
      - name: Checkout
        uses: actions/checkout@v2
        with:
          fetch-depth: 0
      - name: Prepare
        id: prep
        run: |
          DOCKER_IMAGE=madanalysis5/madanalysis5
          VERSION=latest
          MADANALYSIS_VERISON=v1.10.0
Contributor Author

The version number will need to get bumped during releases (might want to look into things like bump2version for the whole project to make this easier/automatic).

Member

Thanks @matthewfeickert, I'll definitely look into this; it makes total sense to automate the versioning.
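For readers following the tag logic in the `Prepare` step, the version derivation can be sketched as a standalone shell function. This is only an illustration that mirrors the branches of the workflow: the function names `derive_version` and `short_sha` are hypothetical, the sample ref and commit values are made up, and the sketch does not address automating the `v1.10.0` bump itself.

```bash
# Hypothetical helper mirroring the VERSION logic of the Prepare step.
# $1 = git ref (as in GITHUB_REF), $2 = pull request number (optional).
derive_version() {
  local ref="$1" pr_number="${2:-}"
  local version=latest
  case "$ref" in
    refs/tags/*)
      # Strip the "refs/tags/" prefix, e.g. refs/tags/v1.10.0 -> v1.10.0
      version="${ref#refs/tags/}"
      ;;
    refs/pull/*)
      version="pr-${pr_number}"
      ;;
  esac
  printf '%s\n' "$version"
}

# Short commit identifier used for the sha-* tags (first 8 characters).
short_sha() {
  printf 'sha-%.8s\n' "$1"
}

derive_version refs/tags/v1.10.0
derive_version refs/pull/12/merge 12
derive_version refs/heads/main
short_sha 0123456789abcdef0123456789abcdef01234567
```

Any ref that is neither a tag nor a pull request falls through to `latest`, which matches the default set at the top of the step.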

          REPO_NAME=${{github.repository}}
          REPO_NAME_LOWERCASE="${REPO_NAME,,}"
          if [[ $GITHUB_REF == refs/tags/* ]]; then
            VERSION=${GITHUB_REF#refs/tags/}
          elif [[ $GITHUB_REF == refs/pull/* ]]; then
            VERSION=pr-${{ github.event.number }}
          fi
          TAGS="${DOCKER_IMAGE}:${VERSION}"
          TAGS="$TAGS,${DOCKER_IMAGE}:latest,${DOCKER_IMAGE}:${MADANALYSIS_VERISON},${DOCKER_IMAGE}:sha-${GITHUB_SHA::8}"
          # Releases also have GITHUB_REFs that are tags, so reuse VERSION
          if [ "${{ github.event_name }}" = "release" ]; then
            TAGS="ghcr.io/${REPO_NAME_LOWERCASE}:latest,ghcr.io/${REPO_NAME_LOWERCASE}:latest-stable,ghcr.io/${REPO_NAME_LOWERCASE}:${MADANALYSIS_VERISON},ghcr.io/${REPO_NAME_LOWERCASE}:sha-${GITHUB_SHA::8}"
          fi
          echo ::set-output name=version::${VERSION}
          echo ::set-output name=tags::${TAGS}
          echo ::set-output name=created::$(date -u +'%Y-%m-%dT%H:%M:%SZ')
          echo ::set-output name=repo_name_lowercase::"${REPO_NAME_LOWERCASE}"
          echo ::set-output name=MADANALYSIS_VERISON::"${MADANALYSIS_VERISON}"

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v1

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v1

      # - name: Login to DockerHub
      #   if: github.event_name != 'pull_request'
      #   uses: docker/login-action@v1
      #   with:
      #     username: ${{ secrets.DOCKERHUB_USERNAME }}
      #     password: ${{ secrets.DOCKERHUB_TOKEN }}
Comment on lines +59 to +64
Contributor Author

Publishing to DockerHub requires establishing an org and additional authentication. I left this in in case this is of interest in the near future, but this can easily get removed if people are fine with using GitHub Container Registry (ghcr) for the time being.


      - name: Login to GitHub Container Registry
        if: github.event_name != 'pull_request'
        uses: docker/login-action@v1
        with:
          registry: ghcr.io
          username: ${{ github.repository_owner }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Test build
        id: docker_build_test
        uses: docker/build-push-action@v2
        with:
          context: .
          file: docker/Dockerfile
          tags: ${{ steps.prep.outputs.tags }}
          labels: |
            org.opencontainers.image.source=${{ github.event.repository.html_url }}
            org.opencontainers.image.created=${{ steps.prep.outputs.created }}
            org.opencontainers.image.revision=${{ github.sha }}
          load: true
          push: false

      - name: Image digest
        run: echo ${{ steps.docker_build_test.outputs.digest }}

      - name: List built images
        run: docker images

      - name: List ma5 users settings
        run: >-
          docker run --rm
          madanalysis5/madanalysis5:sha-${GITHUB_SHA::8}
          'cat $(find / -name "installation_options.dat")'

      - name: Run test program
        run: >-
          docker run --rm
          -v $PWD:$PWD
          madanalysis5/madanalysis5:sha-${GITHUB_SHA::8}
          'cat $(find /root/ -iname "differential_xsec_plot.ma5") | ma5 && tree differential_xsec_example'

      - name: Build and publish to registry
        # every PR will trigger a push event on main, so check the push event is actually coming from main
        if: github.event_name == 'push' && github.ref == 'refs/heads/main' && github.repository == 'MadAnalysis/madanalysis5'
        id: docker_build_latest
        uses: docker/build-push-action@v2
        with:
          context: .
          file: docker/Dockerfile
          tags: |
            ghcr.io/${{ steps.prep.outputs.repo_name_lowercase }}:latest
          labels: |
            org.opencontainers.image.source=${{ github.event.repository.html_url }}
            org.opencontainers.image.created=${{ steps.prep.outputs.created }}
            org.opencontainers.image.revision=${{ github.sha }}
          push: true

      - name: Build and publish to registry with release tag
        if: github.event_name == 'release' && github.event.action == 'published' && github.repository == 'MadAnalysis/madanalysis5'
        id: docker_build_release
        uses: docker/build-push-action@v2
        with:
          context: .
          file: docker/Dockerfile
          tags: |
            ghcr.io/${{ steps.prep.outputs.repo_name_lowercase }}:latest
            ghcr.io/${{ steps.prep.outputs.repo_name_lowercase }}:latest-stable
            ghcr.io/${{ steps.prep.outputs.repo_name_lowercase }}:${{ steps.prep.outputs.MADANALYSIS_VERISON }}
            ghcr.io/${{ steps.prep.outputs.repo_name_lowercase }}:sha-${GITHUB_SHA::8}
          labels: |
            org.opencontainers.image.source=${{ github.event.repository.html_url }}
            org.opencontainers.image.created=${{ steps.prep.outputs.created }}
            org.opencontainers.image.revision=${{ github.sha }}
          push: true
64 changes: 64 additions & 0 deletions docker/Dockerfile
@@ -0,0 +1,64 @@
ARG BUILDER_IMAGE=python:3.9-slim-bullseye
Contributor Author

This image doesn't include ROOT to make things smaller. If the dev team would prefer to have a base image with ROOT I can do that. Also, I'm assuming the dev team doesn't care if the base image is Debian or CentOS based, but if for some reason it does matter let me know.

Member

Hi @matthewfeickert, thanks for all this! Having ROOT would be great: it is generally the most problematic part for users, and our public analysis database mostly consists of Delphes-based analyses for the time being, so ROOT is quite essential. The base is not so important. ma5 has its own interface and the user doesn't need to know much about the shell, so we can handle the installation of the tools in the background. Also, you don't need to worry about the update notes; I can take care of that. You are right, it needs updating.

Contributor Author

> Having ROOT would be great: it is generally the most problematic part for users, and our public analysis database mostly consists of Delphes-based analyses for the time being, so ROOT is quite essential.

Okay cool. 👍 I'll do some revision and then rebase this PR with one of the ATLAS AMG lab images that contain ROOT (I maintain these to try to be decently small images, but you can be the judge on how big is "too big" for an image with madanalysis5).

FROM ${BUILDER_IMAGE} as builder

USER root
WORKDIR /

SHELL [ "/bin/bash", "-c" ]

COPY . /root/madanalysis5

# Set PATH to pickup virtualenv by default
ENV PATH=/usr/local/venv/bin:"${PATH}"
RUN apt-get -qq -y update && \
    apt-get -qq -y install --no-install-recommends \
        gcc \
        g++ \
        make \
        zlib1g \
        bash-completion \
        python3-dev \
        less \
        tree \
        wget \
        curl \
        gnuplot \
        git && \
    apt-get -y clean && \
    apt-get -y autoremove && \
    rm -rf /var/lib/apt/lists/* && \
    python -m venv /usr/local/venv && \
    . /usr/local/venv/bin/activate && \
    python -m pip --no-cache-dir install --upgrade pip setuptools wheel && \
    python -m pip --no-cache-dir install pip-tools && \
    python -m pip list && \
    python -m piptools compile \
        --generate-hashes \
        --output-file /root/madanalysis5/requirements.lock \
        /root/madanalysis5/docker/requirements.txt && \
    python -m pip --no-cache-dir install --upgrade --requirement /root/madanalysis5/requirements.lock && \
Comment on lines +35 to +39
Contributor Author

Using piptools here to create a hash-level lockfile (/root/madanalysis5/requirements.lock) so that the Python environment that is shipped is fully reproducible and explained (piptools compile will add comments to the generated lockfile explaining why things were added).
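The lockfile step described in this comment can be sketched outside the Dockerfile as follows. This is a minimal sketch, not the PR's code: the function name `lock_and_install` and the `DRY_RUN` switch are hypothetical, and the explicit `--require-hashes` flag on the install is an addition that the Dockerfile does not use (pip also switches to hash-checking mode on its own when the requirements file contains hashes).

```bash
# Hypothetical wrapper: compile a hash-pinned lockfile, then install from it.
# $1 = input requirements file, $2 = lockfile to write.
# With DRY_RUN set, the commands are printed instead of executed.
lock_and_install() {
  local requirements="$1" lockfile="$2"
  local run=""
  if [ -n "${DRY_RUN:-}" ]; then
    run="echo"
  fi
  $run python -m piptools compile \
    --generate-hashes \
    --output-file "$lockfile" \
    "$requirements" &&
  $run python -m pip --no-cache-dir install \
    --require-hashes \
    --requirement "$lockfile"
}

# Print the two commands for the paths used in the Dockerfile.
DRY_RUN=1 lock_and_install \
  /root/madanalysis5/docker/requirements.txt \
  /root/madanalysis5/requirements.lock
```

Dropping `DRY_RUN=1` would run the commands for real, which requires `pip-tools` to be installed in the active environment.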

    python -m pip list && \
    export PATH="$(find / -type d -iname madanalysis5)/bin:${PATH}" && \
    python -c 'import multiprocessing; print(multiprocessing.cpu_count())' | ma5 && \
Contributor Author

This works during the build, but when the image is run as a container later, ma5 will warn that

MA5: Checking the MadAnalysis 5 core library:
MA5:   => System configuration has changed since the last use. Need to rebuild the library.

I'm not sure if there is a way to set defaults that will be general enough that a Docker image user won't have to have this compile step happen every time. Thoughts?

Member

I'll look into this. It's part of the legacy code; it has been built to be sensitive to any change.

    printf '\nexport PATH=/usr/local/venv/bin:"${PATH}"\n' >> /root/.bashrc && \
    printf '\nexport PATH='"$(find / -type d -iname madanalysis5)/bin"':"${PATH}"\n' >> /root/.bashrc

# Enable tab completion by uncommenting it from /etc/bash.bashrc
# The relevant lines are those below the phrase "enable bash completion in interactive shells"
RUN export SED_RANGE="$(($(sed -n '\|enable bash completion in interactive shells|=' /etc/bash.bashrc)+1)),$(($(sed -n '\|enable bash completion in interactive shells|=' /etc/bash.bashrc)+7))" && \
    sed -i -e "${SED_RANGE}"' s/^#//' /etc/bash.bashrc && \
    unset SED_RANGE

# Use C.UTF-8 locale to avoid issues with ASCII encoding
ENV LC_ALL=C.UTF-8
ENV LANG=C.UTF-8

# Default user is root to avoid uid write permission problems with volumes
ENV HOME /root
WORKDIR ${HOME}/data

ENV PATH="${HOME}/.local/bin:${PATH}"
ENV PATH="/root/madanalysis5/bin:${PATH}"

ENTRYPOINT ["/bin/bash", "-l", "-c"]
CMD ["/bin/bash"]
4 changes: 4 additions & 0 deletions docker/requirements.txt
@@ -0,0 +1,4 @@
six>=1.16.0 # required by ma5
scipy>=1.7.0 # optional for reinterpretation
pyhf>=0.6.3 # optional for reinterpretation
matplotlib>=3.5.0 # optional for histogramming
Comment on lines +1 to +4
Contributor Author

Including all of these as they are relatively lightweight dependencies.

Member

This is perfect, thanks. I can add more as needed! Also, do you know if the latex and pdflatex compilers need to be installed separately? I guess if you managed to execute the example, they are already on the machine by default?
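One way to answer this empirically is to check, inside the container, whether the LaTeX tools are on the `PATH`. The snippet below is a sketch under assumptions: the helper name `missing_tools` is hypothetical, and the list of tools simply follows the two compilers named in the question. For context, the `apt-get install` list in the Dockerfile does not explicitly name a TeX package.

```bash
# Hypothetical helper: print each requested command that is not on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

missing="$(missing_tools latex pdflatex)"
if [ -z "$missing" ]; then
  echo "latex and pdflatex are available"
else
  # Unquoted on purpose so that several names print on one line.
  echo "not found on PATH:" $missing
fi
```

Pasting this into an interactive shell in the running image shows which of the two compilers, if any, would need to be installed separately.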

25 changes: 25 additions & 0 deletions examples/differential_xsec_plot.ma5
@@ -0,0 +1,25 @@
# differential cross-section plot
# c.f. http://madanalysis.irmp.ucl.ac.be/wiki/FAQNormalMode
Comment on lines +1 to +2
Contributor Author

The main motivation for adding this is so that the Docker image build in CI has something to test against, but it also seems like a simple example worth including in its own right.


# install samples
install samples

# load sample and set cross-section
import /root/madanalysis5/samples/zz.lhe.gz as my_sample
Contributor Author

This does have the disadvantage of being location specific, unless there is a way to import things starting from a relative path given by a shell variable.

Member

Unfortunately, at the moment this depends on where you execute ma5. Relative paths are possible: there is an anchor handle which can be set by

ma5> set main.currentdir = ...

By default this anchor is the execution location, and everything can be added relative to this position. Once it is changed, the relative paths must change as well. So, for instance, if you execute from the madanalysis5 folder,

$ cd madanalysis5
$ ./bin/ma5
ma5> install samples
ma5> import samples/zz.lhe.gz as my_sample

This should work, since ma5 attaches the currentdir anchor to the beginning of the path string by default.
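Building on the `currentdir` explanation, one way to avoid the hard-coded `/root/madanalysis5` prefix might be to start ma5 from the repository root and feed it commands that use only relative paths. The sketch below is an illustration that has not been tested against ma5: `MA5_ROOT` and the helper `relative_import_script` are hypothetical names, and piping commands to `ma5` on standard input simply follows the pattern used in the CI workflow.

```bash
# Hypothetical helper: emit ma5 commands that import a sample by relative path.
# $1 = sample path relative to the directory ma5 is started from.
relative_import_script() {
  local sample="$1"
  printf 'install samples\n'
  printf 'import %s as my_sample\n' "$sample"
}

# Assumed location of the checkout; override by exporting MA5_ROOT.
MA5_ROOT="${MA5_ROOT:-$PWD}"

# Only run ma5 if the launcher exists, so the sketch is safe to try anywhere.
if [ -x "$MA5_ROOT/bin/ma5" ]; then
  (cd "$MA5_ROOT" && relative_import_script samples/zz.lhe.gz | ./bin/ma5)
else
  # No launcher found: just show the commands that would be sent.
  relative_import_script samples/zz.lhe.gz
fi
```

Because the subshell changes into `MA5_ROOT` before launching, the default anchor and the relative sample path stay consistent regardless of where the checkout lives.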

set my_sample.xsection = 123

# plot transverse momentum for leading positron
plot PT(e+[1])

# Note that automatically this will create a selection
display selection[1]

# change the luminosity value to get the cross-section
set main.lumi = 1e-3 * 10

# Format the plot
set selection[1].titleY = "$\\frac{d\\sigma}{dp_{T}}\ {\\rm [pb/GeV]}$" # matplotlib formatting
# set selection[1].titleY = "#frac{d#sigma}{dp_{T}} [pb/GeV]" # ROOT formatting

# Run an analysis to make the plot
submit differential_xsec_example