Investigate docker image usage and production in Tezos Cluster #55

Open
tmcgilchrist opened this issue Jun 20, 2022 · 0 comments

Background

Each worker in the Tezos Cluster keeps a local Docker cache of the images it uses.
When this cache becomes too full, the worker pauses and runs docker system prune to free up space. Currently the prune takes 4 hours per worker, leaving that worker unavailable for the entire time.
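To put numbers on the behaviour described above, a minimal sketch of a threshold check plus a timed prune might look like this. The 80% threshold, the `/var/lib/docker` path, and the function names are assumptions for illustration; the actual worker logic may differ.

```python
import shutil
import subprocess
import time
from typing import Optional


def should_prune(used_bytes: int, total_bytes: int, threshold: float = 0.8) -> bool:
    """Return True when the used fraction of the disk reaches the threshold.

    The 0.8 default is an assumed value, not the cluster's real setting.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes >= threshold


def timed_prune(docker_root: str = "/var/lib/docker") -> Optional[float]:
    """Prune if the disk is too full and return the elapsed seconds.

    Returns None when no prune was needed. The docker_root default is the
    usual Docker data directory, assumed here.
    """
    usage = shutil.disk_usage(docker_root)
    if not should_prune(usage.used, usage.total):
        return None
    start = time.monotonic()
    subprocess.run(["docker", "system", "prune", "--force"], check=True)
    return time.monotonic() - start
```

Logging the value returned by `timed_prune()` on each worker would show whether the 4 hour figure is typical or an outlier.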

On top of losing one worker for 4 hours each time, the Octez pipeline seems to produce many large Docker images (10 GB or more) as both inputs and outputs. We need to understand and document why that is and whether all of them are necessary.
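As a starting point for that investigation, the sketch below lists the images on a worker that are at or above a size cutoff. It parses the output of `docker image ls` with a `--format` template; the 10 GB cutoff mirrors the figure above, and the function names are hypothetical.

```python
import re
import subprocess
from typing import List, Tuple

# Docker prints human-readable decimal sizes such as "532MB" or "10.5GB".
_UNITS = {"B": 1, "KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}


def parse_size(text: str) -> int:
    """Convert a Docker size string such as '10.5GB' to bytes."""
    match = re.fullmatch(r"([\d.]+)\s*([kKMGT]?B)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return int(float(number) * _UNITS[unit.upper()])


def large_images(listing: str, min_bytes: int = 10 * 10**9) -> List[Tuple[str, int]]:
    """Return (image, size_in_bytes) pairs at or above min_bytes, largest first.

    Expects one image per line in the form 'name<TAB>size'.
    """
    found = []
    for line in listing.splitlines():
        if not line.strip():
            continue
        name, size_text = line.rsplit("\t", 1)
        size = parse_size(size_text)
        if size >= min_bytes:
            found.append((name, size))
    return sorted(found, key=lambda item: item[1], reverse=True)


def list_images() -> str:
    """Ask the local Docker daemon for 'repository:tag<TAB>size' lines."""
    result = subprocess.run(
        ["docker", "image", "ls", "--format", "{{.Repository}}:{{.Tag}}\t{{.Size}}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Running `large_images(list_images())` on a worker gives a ranked list that can be compared against what the pipeline actually publishes.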

Solution

Some possible solutions to try (in rough order of suitability):

  • Run a nightly job to prune the Docker cache, when the cluster is less busy
  • Add an extra worker to allow for pruning time
  • Clean up Docker images produced by the pipeline, if they're not being published somewhere.
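The first option could be sketched roughly as follows: a small script, run from cron or a similar scheduler, that prunes only inside a quiet window and only removes items older than a cutoff. The window (01:00 to 05:00), the 24 hour cutoff, and the function names are assumptions for illustration, not settings taken from the cluster.

```python
import subprocess
from datetime import datetime, time
from typing import List


def within_quiet_hours(now: time, start: time = time(1, 0), end: time = time(5, 0)) -> bool:
    """Return True if `now` falls in [start, end), allowing windows that cross midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def prune_command(until_hours: int = 24) -> List[str]:
    """Build a `docker system prune` call that skips items newer than the cutoff."""
    if until_hours <= 0:
        raise ValueError("until_hours must be positive")
    return [
        "docker", "system", "prune",
        "--all", "--force",
        "--filter", f"until={until_hours}h",
    ]


def nightly_prune() -> bool:
    """Run the prune if the current local time is inside the quiet window."""
    if not within_quiet_hours(datetime.now().time()):
        return False
    subprocess.run(prune_command(), check=True)
    return True
```

Pruning a little every night, with an age filter, should keep each run shorter than one large prune triggered by a full disk, though that would need to be measured on a real worker.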