feat: add bench workflow for AWS #1330

tschneider-aneo · 2024-10-31T18:01:55Z

Motivation

This PR is part of the ArmoniK benchmarking automation project. The Bench application is the best suited for testing ArmoniK's raw orchestration performance.

Description

Use the Bench action from Armonik.Action.Deploy to benchmark ArmoniK with the Bench application: on localhost for each commit on main, and on AWS for each release.

The workflow can also be launched manually, with an option to choose whether the infrastructure is automatically destroyed once the session has finished. Keeping the infrastructure alive can be useful for post-mortem analysis.

It introduces a Python script that retrieves the throughput of a session and returns a JSON document containing its value.
It also introduces a reference infrastructure for ArmoniK benchmarks.
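
As a rough illustration of what such a script might compute (a minimal sketch, not the script added by this PR), assume throughput means completed tasks divided by the session's wall-clock duration, and that start and end times are available as ISO 8601 strings. The function names and the "throughput" JSON key are hypothetical.

```python
import json
from datetime import datetime


def session_throughput(n_tasks: int, started_at: str, ended_at: str) -> float:
    """Tasks completed per second over the session's wall-clock duration."""
    # Assumption: timestamps are ISO 8601 strings with an explicit UTC offset.
    start = datetime.fromisoformat(started_at)
    end = datetime.fromisoformat(ended_at)
    duration = (end - start).total_seconds()
    if duration <= 0:
        raise ValueError("session must end after it starts")
    return n_tasks / duration


def throughput_report(n_tasks: int, started_at: str, ended_at: str) -> str:
    """Serialize the throughput as a JSON document (hypothetical schema)."""
    value = session_throughput(n_tasks, started_at, ended_at)
    return json.dumps({"throughput": value})
```

How the real script obtains the task count and timestamps from ArmoniK is not covered here.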

Testing

This workflow was run on every push to the PR's branch.

Impact

This PR is a milestone for ArmoniK's benchmarking capability: ArmoniK (AK) will now be benchmarked on a regular basis.

Additional Information

Needs aneoconsulting/Armonik.Action.Deploy#24 to be merged.

Infrastructure destruction sometimes fails on AWS because of a deadlock between a security group and a subnet. The failure is unpredictable.

When this happens, the destruction must be completed manually: first destroy the security group and the network interface associated with it, then run the make recipes destroy and bootstrap-destroy with the prefix used by the GitHub workflow (currently benchmark).

Checklist

My code adheres to the coding and style guidelines of the project.
I have performed a self-review of my code.
I have commented my code, particularly in hard-to-understand areas.
I have made corresponding changes to the documentation.
I have thoroughly tested my modifications and added tests when necessary.
Tests pass locally and in the CI.
I have assessed the performance impact of my modifications.

tools/ci/bench-job-template.yml

+  ttlSecondsAfterFinished: 0
+  template:
+    spec:
+      containers:


tools/ci/bench-job-template.yml

+  template:
+    spec:
+      containers:
+        - name: bench-session


tools/ci/bench-job-template.yml

.github/workflows/bench-aws.yml

qdelamea-aneo · 2024-11-21T13:05:04Z

.github/workflows/bench-aws.yml

+          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
+          AWS_EC2_METADATA_DISABLED: true
+        run: |
+          aws s3 cp "$BENCH_RESULTS_PATH" "s3://test-armonik-bench-storage/benchclient_benchmark_${EVENT_NAME}_${TYPE}_${GHRUNID}.json"


Maybe you should use an environment variable here instead of hardcoding the bucket URI.

I don't get your point. Could you be a bit more explicit?

The bucket should be a configuration variable stored at the repository level, so that we can change it if needed without modifying the workflow.
See: https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/store-information-in-variables#defining-configuration-variables-for-multiple-workflows
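
To make the suggestion concrete in Python terms, here is a minimal sketch. It assumes the bucket name reaches the job through an environment variable, here called BENCH_BUCKET (a hypothetical name), which the workflow could populate from a repository configuration variable. The object key mirrors the one in the diff above.

```python
import os


def results_uri(event_name: str, bench_type: str, run_id: str, env=None) -> str:
    """Build the S3 destination without hardcoding the bucket name."""
    env = os.environ if env is None else env
    # BENCH_BUCKET is a hypothetical variable name, not one defined by the PR.
    bucket = env.get("BENCH_BUCKET")
    if not bucket:
        raise KeyError("BENCH_BUCKET is not set")
    key = f"benchclient_benchmark_{event_name}_{bench_type}_{run_id}.json"
    return f"s3://{bucket}/{key}"
```

With this shape, changing the bucket only requires updating the variable, not the workflow file.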

.github/workflows/bench-aws.yml

lemaitre-aneo · 2024-12-23T07:29:36Z

.github/workflows/bench-aws.yml

+        run: |
+          set -ex
+          if [ "$TRIGGER" == 'push' ]; then
+            echo '{"include":[{"type": "localhost", "ntasks":1000, "polling-limit": 300}, {"type": "aws", "ntasks":1200000, "polling-limit": 1000, "parameters-file-path": "tools/ci/bench-aws.tfvars"}]}' > matrix.json

"ntasks":1200000

Why such an oddly specific number?

I ran into trouble when resuming a session of 1.5M tasks with this infra size. I wanted to make sure we wouldn't hit this kind of problem in the CI, and 1.2M seemed to be fine.

I know it is odd not to use a round million tasks, but I thought 1M might make the benchmark run too short.
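
For reference, the matrix written by the shell line above could also be generated from a data structure rather than a hand-quoted JSON string. This is only a sketch of the 'push' case shown in the diff; the function names are hypothetical and the other triggers are not covered.

```python
import json


def push_matrix() -> dict:
    """Job matrix for the 'push' trigger, mirroring the values in the diff."""
    return {
        "include": [
            {"type": "localhost", "ntasks": 1000, "polling-limit": 300},
            {
                "type": "aws",
                # 1.2M tasks: below the 1.5M that caused session resume trouble.
                "ntasks": 1_200_000,
                "polling-limit": 1000,
                "parameters-file-path": "tools/ci/bench-aws.tfvars",
            },
        ]
    }


def write_matrix(path: str = "matrix.json") -> None:
    """Write the matrix to disk, as the workflow step does with echo."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(push_matrix(), handle)
```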

.github/workflows/bench-benchmark.yml

tools/ci/bench-aws.tfvars

qdelamea-aneo · 2025-01-06T13:55:05Z

.github/workflows/bench-benchmark.yml

+          DATE=$(date +"%Y-%m-%d")
+          aws s3 cp "$BENCH_RESULTS_PATH" "s3://armonik-bench-storage/${FILE_PREFIX}/${GHRUNID}_${DATE}/benchclient_benchmark_${EVENT_NAME}_${TYPE}.json"
+
+      - if: ${{ (github.event_name == 'workflow_dispatch' && inputs.destroy-on-session-end) || (github.event_name != 'workflow_dispatch' && always()) }}


Is the 'always()' really necessary?

always() ensures that, when the workflow isn't triggered manually, the AWS infrastructure is destroyed even if a previous step has failed.
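
The condition can be read as a small truth table. The sketch below evaluates the same boolean expression in Python, treating always() as the constant true it returns in GitHub Actions. The function name is hypothetical.

```python
def destroy_step_runs(event_name: str, destroy_on_session_end: bool) -> bool:
    """Evaluate the step's `if` expression with always() taken as True."""
    always = True  # always() returns true whatever the status of earlier steps
    manual = event_name == "workflow_dispatch"
    return (manual and destroy_on_session_end) or (not manual and always)


# Manual runs follow the input; every other trigger destroys unconditionally.
for event in ("workflow_dispatch", "push", "release"):
    for flag in (True, False):
        print(event, flag, destroy_step_runs(event, flag))
```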

qdelamea-aneo · 2025-01-06T13:59:44Z

tools/ci/bench-job-template.yml

+            - name: BenchOptions__PayloadSize
+              value: "1"
+            - name: BenchOptions__ResultSize
+              value: "1"


Is it possible to put zero here?

Never tested! Worth trying.

sonarqubecloud · 2025-01-08T13:01:30Z

Quality Gate passed

Issues
6 New issues
0 Accepted issues

Measures
1 Security Hotspot
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

tschneider-aneo force-pushed the ts/add-bench-aws branch from cdd4d48 to 6e32c34 Compare November 4, 2024 13:44

github-advanced-security bot found potential problems Nov 5, 2024

tschneider-aneo force-pushed the ts/add-bench-aws branch 4 times, most recently from ee84b2d to e5e9472 Compare November 6, 2024 14:25

tschneider-aneo force-pushed the ts/add-bench-aws branch 3 times, most recently from 701ae60 to e767835 Compare November 20, 2024 16:46

qdelamea-aneo requested changes Nov 21, 2024

tschneider-aneo mentioned this pull request Dec 19, 2024

feat: capacity to parametrize TF parameters file at infra deployment and destruction aneoconsulting/Armonik.Action.Deploy#24

Merged

tschneider-aneo force-pushed the ts/add-bench-aws branch from 54bb3d7 to 82b14d7 Compare December 20, 2024 09:30

lemaitre-aneo reviewed Dec 23, 2024

tschneider-aneo force-pushed the ts/add-bench-aws branch from e3e1c38 to 4296079 Compare December 23, 2024 11:29

tschneider-aneo marked this pull request as ready for review December 23, 2024 11:32

lemaitre-aneo reviewed Dec 27, 2024

.github/workflows/bench-benchmark.yml (outdated, resolved)

tools/ci/bench-aws.tfvars (outdated, resolved)

tools/ci/bench-aws.tfvars (outdated, resolved)

tschneider-aneo force-pushed the ts/add-bench-aws branch from e1a8b2d to 8ac0e20 Compare December 27, 2024 16:59

qdelamea-aneo previously approved these changes Jan 6, 2025

tschneider-aneo dismissed qdelamea-aneo’s stale review via 00fc251 January 8, 2025 09:13

tschneider-aneo force-pushed the ts/add-bench-aws branch from 8ac0e20 to 00fc251 Compare January 8, 2025 09:13

feat: Add bench workflow

c5e410a

tschneider-aneo force-pushed the ts/add-bench-aws branch from 00fc251 to c5e410a Compare January 8, 2025 13:01
