docs: document the resource requirements
Signed-off-by: Matej Focko <[email protected]>
mfocko committed Jul 22, 2024
1 parent c7ab10e commit 013e797
Showing 1 changed file with 93 additions and 0 deletions.
93 changes: 93 additions & 0 deletions docs/deployment/resource-requirements.md
@@ -0,0 +1,93 @@
---
title: Resource requirements
---

A typical Packit Service deployment consists of the following services, with
these resource requirements.

## CPU requirements

| Deployment | Requested (always assigned) | Limit |
| ---------------- | ---------------------------: | -----: |
| postgres | `30m` | `1` |
| redict | `10m` | `10m` |
| flower | `5m` | `50m` |
| nginx | `5m` | `10m` |
| pushgateway | `5m` | `10m` |
| tokman | `20m` (prod, `5m` otherwise) | `50m` |
| dashboard | `5m` | `50m` |
| fedmsg | `5m` | `50m` |
| beat | `5m` | `50m` |
| service | `10m` | `200m` |
| worker (generic) | `100m` | `400m` |
| worker (short) | `80m` | `400m` |
| worker (long) | `100m` | `600m` |

## Memory requirements

| Deployment | Requested (always assigned) | Limit |
| ---------------- | -------------------------------: | ---------------------------------: |
| postgres | `1Gi` (prod, `256Mi` otherwise) | `1536Mi` (prod, `512Mi` otherwise) |
| redict | `128Mi` | `256Mi` |
| flower | `80Mi` | `128Mi` |
| nginx | `8Mi` | `32Mi` |
| pushgateway | `16Mi` | `32Mi` |
| tokman | `100Mi` (prod, `88Mi` otherwise) | `160Mi` (prod, `128Mi` otherwise) |
| dashboard | `128Mi` | `256Mi` |
| fedmsg | `88Mi` | `128Mi` |
| beat | `160Mi` | `256Mi` |
| service | `320Mi` | `1Gi` (prod, `512Mi` otherwise) |
| worker (generic) | `384Mi` | `1024Mi` |
| worker (short) | `320Mi` | `640Mi` |
| worker (long) | `384Mi` | `1024Mi` |

## Reasoning for the requirements

**\[TODO\]**

## Currently allowed requirements / limits

| Resource | Allowed to request | Limit |
| -------- | -----------------: | ----: |
| CPU | `3` | `12` |
| Memory | `6Gi` | `8Gi` |

## Total for production

| Deployment | Memory request | Memory limit | CPU request | CPU limit |
| ---------------- | -------------: | -----------: | ----------: | ----------: |
| non-scalable[^1] | `2052Mi` | `3808Mi` | `100m` | `1280m` |
| 2× short worker | `640Mi` | `1280Mi` | `160m` | `800m` |
| 2× long worker | `768Mi` | `2048Mi` | `200m` | `1000m` |
| **Σ** | **`3460Mi`** | **`7136Mi`** | **`460m`** | **`3080m`** |
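
The totals above can be re-derived from the per-deployment tables. The following is a minimal sketch, not part of the deployment tooling: it assumes that the non-scalable group consists of every deployment except the workers, and the names `NON_SCALABLE`, `WORKERS`, `QUOTA` and `column_sums` are made up for illustration. It reproduces the memory columns and the CPU requests. With the per-deployment CPU limits as listed, however, the same sum gives `1480m` for the non-scalable group and `1200m` for 2× long worker, so the CPU limit column is worth double-checking against the deployed templates.

```python
# Production values copied from the tables above.
# Each tuple is (memory request, memory limit, CPU request, CPU limit);
# memory in Mi (1Gi = 1024Mi), CPU in millicores (1 CPU = 1000m).
NON_SCALABLE = {  # assumption: everything except the workers
    "postgres": (1024, 1536, 30, 1000),
    "redict": (128, 256, 10, 10),
    "flower": (80, 128, 5, 50),
    "nginx": (8, 32, 5, 10),
    "pushgateway": (16, 32, 5, 10),
    "tokman": (100, 160, 20, 50),
    "dashboard": (128, 256, 5, 50),
    "fedmsg": (88, 128, 5, 50),
    "beat": (160, 256, 5, 50),
    "service": (320, 1024, 10, 200),
}
WORKERS = {
    "short": (320, 640, 80, 400),
    "long": (384, 1024, 100, 600),
}
REPLICAS = {"short": 2, "long": 2}

# Currently allowed requests / limits, in the same order and units.
QUOTA = (6 * 1024, 8 * 1024, 3 * 1000, 12 * 1000)


def column_sums(rows):
    """Sum a collection of 4-tuples column by column."""
    return tuple(sum(column) for column in zip(*rows))


non_scalable = column_sums(NON_SCALABLE.values())
scaled_workers = {
    name: tuple(REPLICAS[name] * value for value in values)
    for name, values in WORKERS.items()
}
grand_total = column_sums([non_scalable, *scaled_workers.values()])
within_quota = all(used <= allowed for used, allowed in zip(grand_total, QUOTA))

print("non-scalable:", non_scalable)
print("workers:", scaled_workers)
print("total:", grand_total, "within quota:", within_quota)
```

Under these assumptions every column of the total stays within the currently allowed values, although the memory limit (`7136Mi` of `8Gi`) leaves little room.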

## Proposed changes

1. Revert to the pre-MP+ resources (they were higher for service, workers and
   postgres; lower values were used due to a hardcoded check in the templates).

   Pre-MP+ memory requirements/limits for production deployment:

   | Deployment | Requested | Limit |
   | ---------- | --------: | ----: |
   | postgres   |     `2Gi` | `4Gi` |
   | service    |   `320Mi` | `4Gi` |

1. Request a quota adjustment that leaves us some buffer (database migrations,
   higher load on the service, etc.) and also lets us **permanently** scale up
   the workers if we find the service to be more reliable that way.

   - Based on the calculations above, 2× the current memory quotas would be
     sufficient, but if we were to scale up the workers too (and account for
     possible adjustments, e.g., to Redict), we should probably go for 3×.
   - **\[TODO\]** Also check how such changes affect the CPU requests/limits.

1. Migrate tokman to a different toolchain. It is a small, self-contained app,
   so it should be easy to port to either Rust or Go, which should leave a
   smaller footprint.
   - **\[TODO\]** Also research the possibility of dropping it, since GitHub
     might have changed their policy about GitHub App tokens; this was
     discovered by Hunor, but we haven't tried dropping it…
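
The memory estimate behind the first two proposals can be sketched roughly as follows. This is an illustration under stated assumptions rather than an exact forecast: it takes the Σ row of the production totals as the starting point, applies only the postgres and service values from the pre-MP+ table (the pre-MP+ worker values are not listed, so workers are left unchanged), treats the service request as staying at `320Mi`, and compares the summed limits against the quota limit. The helper `smallest_multiplier` is hypothetical.

```python
GI = 1024  # Mi per Gi

# Σ row of the production totals and the current memory quota, in Mi.
current_total = {"request": 3460, "limit": 7136}
quota = {"request": 6 * GI, "limit": 8 * GI}

# (current, pre-MP+) memory values in Mi.
# Assumptions: the service request stays at 320Mi; workers are left out
# because their pre-MP+ values are not listed.
changes = {
    "postgres": {"request": (1 * GI, 2 * GI), "limit": (1536, 4 * GI)},
    "service": {"request": (320, 320), "limit": (1 * GI, 4 * GI)},
}

# Totals after the revert: add the difference for each changed deployment.
reverted = {
    kind: current_total[kind]
    + sum(new - old for old, new in (c[kind] for c in changes.values()))
    for kind in current_total
}


def smallest_multiplier(needed, allowed, candidates=(1, 2, 3)):
    """Return the smallest candidate multiple of the quota that fits, or None."""
    return next((m for m in candidates if needed <= m * allowed), None)


multiplier = {
    kind: smallest_multiplier(reverted[kind], quota[kind]) for kind in reverted
}
headroom_at_2x = 2 * quota["limit"] - reverted["limit"]

print("after revert (Mi):", reverted)
print("quota multiplier needed:", multiplier)
print("limit headroom at 2x quota (Mi):", headroom_at_2x)
```

Under these assumptions the reverted totals come to `4484Mi` requested and `12768Mi` in limits. The requests still fit the current quota, while the limits need 2× the current quota and would leave `3616Mi` of headroom for extra workers or other adjustments.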

[^1]:
    Includes the non-scalable deployments, i.e., those that each run just one
    pod, such as dashboard, redict and postgres.
