-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add Dockerfile and GitHub Actions based CI #12
base: main
Are you sure you want to change the base?
feat: Add Dockerfile and GitHub Actions based CI #12
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jackaraz I left some comments on this PR to make it easier for review, but before I actually mark it as such and flag you all for one, can you give me advice on the question RE: madanalysis/UpdateNotes.txt
format?
@@ -0,0 +1,64 @@ | |||
ARG BUILDER_IMAGE=python:3.9-slim-bullseye |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This image doesn't include ROOT to make things smaller. If the dev team would prefer to have a base image with ROOT I can do that. Also, I'm assuming the dev team doesn't care if the base image is Debian or CentOS based, but if for some reason it does matter let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @matthewfeickert, thanks for all this! Having root would be great, that is generally the most problematic part for the users and our public analysis database mostly consist of Delphes based analyses for the time being so root is quite essential. The base is not so important ma5 has its own interface and the user doesn't need to know much about shell so we can handle the installation of the tools in the background. Also, you don't need to worry about the update notes I can take care of that, you are right it needs updating.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having root would be great, that is generally the most problematic part for the users and our public analysis database mostly consist of Delphes based analyses for the time being so root is quite essential.
Okay cool. 👍 I'll do some revision and then rebase this PR with one of the ATLAS AMG lab images that contain ROOT (I maintain these to try to be decently small images, but you can be the judge on how big is "too big" for an image with madanalysis5
).
six>=1.16.0 # required by ma5 | ||
scipy>=1.7.0 # optional for reinterpretation | ||
pyhf>=0.6.3 # optional for reinterpretation | ||
matplotlib>=3.5.0 # optional for histogramming |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Including all of these as they are relatively lightweight dependencies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is perfect thanks I can add more as needed! also do you know if the latex and pdflatex compilers should be downloaded separately? I guess if you managed to execute the example they are already in the machine by default?
python -m piptools compile \ | ||
--generate-hashes \ | ||
--output-file /root/madanalysis5/requirements.lock \ | ||
/root/madanalysis5/docker/requirements.txt && \ | ||
python -m pip --no-cache-dir install --upgrade --requirement /root/madanalysis5/requirements.lock && \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using piptools here to create a hash-level lockfile (/root/madanalysis5/requirements.lock
) so that the Python environment that is shipped is fully reproducible and explained (piptools compile
will add comments to the generated lockfile explaining why things were added).
# differential cross-section plot | ||
# c.f. http://madanalysis.irmp.ucl.ac.be/wiki/FAQNormalMode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main motivation for adding this is so that the Docker image build in CI has something to test against, but it seems to be a simple example to add also.
install samples | ||
|
||
# load sample and set cross-section | ||
import /root/madanalysis5/samples/zz.lhe.gz as my_sample |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does have the disadvantage of being location specific, unless there is a way to import things starting from a relative path given by a shell variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, this depends on where you execute ma5
at the moment. The relative path is possible there is an anchor handle which can be set by
ma5> set main.currentdir = ...
this command is by default the execution location and everything can be added relative to this position. Once it is changed the relative paths must change as well. So for instance you executed from madanalysis5 folder
$ cd madanalysis5
$ ./bin/ma5
ma5> install samples
ma5> import samples/zz.lhe.gz as my_sample
should work since it attaches currentdir
anchor at the beginning of the path string by default.
python -m pip --no-cache-dir install --upgrade --requirement /root/madanalysis5/requirements.lock && \ | ||
python -m pip list && \ | ||
export PATH="$(find / -type d -iname madanalysis5)/bin:${PATH}" && \ | ||
python -c 'import multiprocessing; print(multiprocessing.cpu_count())' | ma5 && \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This works during the build, but when the image is run as a container later, ma5
will warn that
MA5: Checking the MadAnalysis 5 core library:
MA5: => System configuration has changed since the last use. Need to rebuild the library.
I'm not sure if there is a way to set defaults that will be general enough that a Docker image user won't have to have this compile step happen every time. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll look into this. It's part of the legacy code; it has been built to be sensitive to any change.
run: | | ||
DOCKER_IMAGE=madanalysis5/madanalysis5 | ||
VERSION=latest | ||
MADANALYSIS_VERISON=v1.10.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The version number will need to get bumped during releases (might want to look into things like bump2version
for the whole project to make this easier/automatic).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @matthewfeickert I'll definitely look into this, it makes total sense to automatize the versioning.
# - name: Login to DockerHub | ||
# if: github.event_name != 'pull_request' | ||
# uses: docker/login-action@v1 | ||
# with: | ||
# username: ${{ secrets.DOCKERHUB_USERNAME }} | ||
# password: ${{ secrets.DOCKERHUB_TOKEN }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Publishing to DockerHub requires establishing an org and additional authentication. I left this in in case this is of interest in the near future, but this can easily get removed if people are fine with using GitHub Container Registry (ghcr) for the time being.
Build struture assumes running docker build from the top level of the repository. e.g. docker build --file docker/Dockerfile --tag madanalysis5/madanalysis5:debug-local . Add .dockerignore
Run tests builds on: * push events to main or tags * pull request events targetting main * on CRON schedule weekly at 01:23 UTC on Sundays * on releases * on demand with workflow dispatch Publish to the GitHub Container Registry by default, and publish with latest tag on pushed to main and publish with version tags on GitHub releases.
8a6a44b
to
5ff8b5a
Compare
Before submitting
Please complete the following checklist when submitting a PR:
If you've fixed a bug or added code that should be tested, add a
test sequence below for it to be checked by other developers.
If you do make documentation changes, make sure that the docs build and
render correctly by running
./doc/makedoc
.madanalysis/UpdateNotes.txt
file, summarizing thechange, and including a link back to the PR.
The format of
madanalysis5/madanalysis/UpdateNotes.txt
Lines 1 to 11 in 3fd09e7
is confusing (what are the numbers in the first column)? The file also hasn't been updated since MadAnalysis5 got ported to GitHub, so I'm not sure if there is a format the dev team would like to keep.
When all the above are checked, delete everything above the dashed
line and fill in the pull request template.
Context:
Resolves #2
Description of the Change:
Dockerfile
that builds from the top level of the repository with the build contextdocker build --file docker/Dockerfile --tag madanalysis5/madanalysis5:debug-local .
Add differential cross-section plot example (taken from http://madanalysis.irmp.ucl.ac.be/wiki/FAQNormalMode) as a test example to run after the build.
Add GitHub Actions workflow to build, test, and publish Docker images.
Benefits:
Dockerfiles and images are very powerful (build) testing and debugging tools and can be exceptional for giving users a sandbox area to play in and to use for demonstrations.
Possible Drawbacks:
PRs are like puppies. While I can help give guidance on how everything works, if this isn't something that the dev team is up for maintaining I can 100% appreciate and respect that.
Related GitHub Issues:
Issue #2.