Commit 998d003

Authored Mar 5, 2025
fix(toil): simplify Argo deployment and debuggability (#1503)
1 parent d4b366f commit 998d003

18 files changed, +79 -125 lines changed
 

‎.github/workflows/PR.yaml

+1-1
@@ -106,7 +106,7 @@ jobs:

       - name: Deploy infra to dev cluster
         run: |
-          ENVIRONMENT=development TEST_MODE=true make install-argo clean-argo-config install-monitoring helm-deploy
+          ENVIRONMENT=development TEST_MODE=true make helm-deploy
           sleep 10 # wait for old pods to disappear so the svc port-forward doesn't connect to them
           kubectl -n infra port-forward svc/infra-server-service 8443:8443 > /dev/null 2>&1 &
           sleep 10

‎.github/workflows/deploy.yaml

+1-1
@@ -71,7 +71,7 @@ jobs:
           gcloud container clusters get-credentials infra-${{ inputs.environment }} \
             --project "${PROJECT}" \
             --region us-west2
-          ENVIRONMENT=${{ inputs.environment }} make install-argo clean-argo-config install-monitoring helm-deploy
+          ENVIRONMENT=${{ inputs.environment }} make helm-deploy

       - name: Notify infra channel about new version
         uses: slackapi/slack-github-action@v2.0.0

‎DEPLOYMENT.md

+10-11
@@ -94,17 +94,6 @@ correct tooling installed with:

 Use the `deploy` Github action to update development or production environments with a new release.

-### Argo Deployment
-
-To install Argo workflow server, run:
-
-`ENVIRONMENT=<development,production> make install-argo`
-
-NOTE: This is a separate step and not a dependant chart for example to avoid too frequent Argo deployments.
-
-Also note that if you plan to deploy `infra-server` to this cluster, you will need to remove the default Argo workflow controller ConfigMap with the `clean-argo-config` Make target.
-This is required until [ROX-20269](https://issues.redhat.com/browse/ROX-20269) is resolved.
-
 ### Manual deployment

 To render a copy of the charts (for inspection), run:
@@ -156,3 +145,13 @@ The infra server logs are captured automatically by GCP.
 - [Logs Explorer: Production](https://cloudlogging.app.goo.gl/KqgSyE2mSq83M5Xs9)

 Adding `jsonPayload."log-type"="audit"` to the query will filter for audit logs.
+
+## Inspecting live workflows
+
+You can view the UI of the Argo server by forwarding its port:
+
+```bash
+kubectl port-forward -n argo svc/infra-server-argo-workflows-server 2746
+```
+
+and access [http://localhost:2746](http://localhost:2746).

‎Makefile

+11-33
@@ -223,21 +223,27 @@ endif
 		exit 1; \
 	fi

+.PHONY: helm-dependency-update
+helm-dependency-update:
+	@helm dependency update chart/infra-server
+
+create-namespaces:
+	@kubectl create namespace argo >/dev/null 2>&1 || echo "namespace/argo already exists"; exit 0
+	@kubectl create namespace monitoring >/dev/null 2>&1 || echo "namespace/monitoring already exists"; exit 0
+
 ## Render template
 .PHONY: helm-template
-helm-template: pre-check
+helm-template: pre-check helm-dependency-update create-namespaces
 	@./scripts/deploy/helm.sh template $(VERSION) $(ENVIRONMENT) $(SECRET_VERSION)

 ## Deploy
 .PHONY: helm-deploy
-helm-deploy: pre-check
+helm-deploy: pre-check helm-dependency-update create-namespaces
 	@./scripts/deploy/helm.sh deploy $(VERSION) $(ENVIRONMENT) $(SECRET_VERSION)
-	# Pick up any eventual changes to the workflow controller configmap
-	@make bounce-argo-pods

 ## Diff
 .PHONY: helm-diff
-helm-diff: pre-check
+helm-diff: pre-check helm-dependency-update create-namespaces
 	@./scripts/deploy/helm.sh diff $(VERSION) $(ENVIRONMENT) $(SECRET_VERSION)

 ## Bounce pods
@@ -276,34 +282,6 @@ secrets-edit:
 secrets-revert:
 	@./scripts/deploy/secrets.sh revert $(ENVIRONMENT) $(SECRET_VERSION)

-##################
-## Dependencies ##
-##################
-.PHONY: install-argo
-install-argo: pre-check
-	helm repo add argo https://argoproj.github.io/argo-helm
-	helm upgrade \
-		argo-workflows \
-		argo/argo-workflows \
-		--version 0.16.9 \
-		--install \
-		--create-namespace \
-		--namespace argo
-
-.PHONY: clean-argo-config
-clean-argo-config: pre-check
-	kubectl delete configmap argo-workflows-workflow-controller-configmap -n argo || true
-
-.PHONY: install-monitoring
-install-monitoring: pre-check
-	helm dependency update chart/infra-monitoring
-	helm upgrade prometheus-stack chart/infra-monitoring \
-		--install \
-		--namespace monitoring \
-		--create-namespace \
-		--values chart/infra-monitoring/values.yaml \
-		--wait
-
 ###############
 ## Debugging ##
 ###############
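
Review note: the new `create-namespaces` recipe prints "already exists" and exits 0 for any `kubectl create namespace` failure, including authentication or connectivity errors. A minimal sketch of a stricter variant follows, written as a standalone shell function rather than a Makefile recipe. The names `ensure_namespace` and `KUBECTL` are hypothetical, introduced only for illustration, and are not part of this commit.

```shell
#!/bin/sh
# Hypothetical helper, not part of the commit: create a namespace only when
# it is missing, and let real errors (auth, connectivity) surface.
# KUBECTL is an assumed override hook so the function can be exercised
# without a cluster; it defaults to the real kubectl binary.
ensure_namespace() {
    ns="$1"
    if ${KUBECTL:-kubectl} get namespace "$ns" >/dev/null 2>&1; then
        echo "namespace/$ns already exists"
    else
        # A failure here propagates as a non-zero exit status.
        ${KUBECTL:-kubectl} create namespace "$ns"
    fi
}
```

Under these assumptions, a recipe could call `ensure_namespace argo && ensure_namespace monitoring` and would stop the deployment if the cluster is unreachable.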

‎README.md

+1-1
@@ -32,7 +32,7 @@ You may also point to a different infra server with the `--endpoint` flag.
 To debug the server, you need to fulfil the prerequisites first.

 1. Have an authenticated `gcloud` CLI and GNU `sed` installed.
-1. Have your `KUBECONFIG` point to a cluster where Argo Workflows and the ConfigMaps and Secrets for infra are deployed. This is most easily achieved by connecting to a PR cluster or deploying infra with `ENVIRONMENT=<DEVELOPMENT,PRODUCTION> make install-argo clean-argo-config helm-deploy` to a new or local cluster. This cluster will only be used to run workflows.
+1. Have your `KUBECONFIG` point to a cluster where Argo Workflows and the ConfigMaps and Secrets for infra are deployed. This is most easily achieved by connecting to a PR cluster or deploying infra with `ENVIRONMENT=<DEVELOPMENT,PRODUCTION> make helm-deploy` to a new or local cluster. This cluster will only be used to run workflows.
 1. Run `make prepare-local-server-debugging` to set the contents of the `configuration` directory and compile the UI + CLI (for downloads).

 Then, you can use the "Debug Server" launch configuration.

‎chart/infra-monitoring/.gitignore

-1
This file was deleted.

‎chart/infra-monitoring/.helmignore

-22
This file was deleted.

‎chart/infra-monitoring/Chart.yaml

-12
This file was deleted.

‎chart/infra-monitoring/requirements.lock

-6
This file was deleted.

‎chart/infra-server/.gitignore

+2
@@ -0,0 +1,2 @@
+# Ignore the dependency chart archives
+charts/

‎chart/infra-server/Chart.yaml

+7
@@ -10,3 +10,10 @@ annotations:
   acsDemoVersion: 4.6.2
   automationFlavorsVersion: 0.10.43
   ocpCredentialsMode: Passthrough
+dependencies:
+  - name: argo-workflows
+    version: "0.45.9"
+    repository: "https://argoproj.github.io/argo-helm"
+  - name: kube-prometheus
+    version: 11.1.1
+    repository: https://charts.bitnami.com/bitnami

‎chart/infra-server/argo-values.yaml

+26
@@ -0,0 +1,26 @@
+argo-workflows:
+  namespaceOverride: argo
+  server:
+    authModes:
+      - server
+
+  controller:
+    # Default values that will apply to all Workflows from this controller, unless overridden on the Workflow-level
+    workflowDefaults:
+      metadata:
+        annotations:
+          argo: workflows
+      spec:
+        ttlStrategy:
+          # Keep the workflow pods & logs available for 30 days
+          secondsAfterCompletion: 2592000
+          secondsAfterSuccess: 2592000
+          secondsAfterFailure: 2592000
+
+  artifactRepository:
+    archiveLogs: true
+    gcs:
+      bucket: rhacs-infra-artifacts
+      serviceAccountKeySecret:
+        name: gcs-credentials
+        key: credentials.json
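
Review note: the three `ttlStrategy` values are the 30-day retention from the comment, expressed in seconds. A quick shell check of that arithmetic (the variable names are illustrative only):

```shell
# 30 days expressed in seconds, matching the ttlStrategy values above.
days=30
ttl_seconds=$((days * 24 * 60 * 60))
echo "$ttl_seconds"   # prints 2592000
```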

‎chart/infra-monitoring/values.yaml ‎chart/infra-server/monitoring-values.yaml

+1
@@ -1,4 +1,5 @@
 kube-prometheus:
+  namespaceOverride: monitoring
   operator:
     resources:
       limits:

‎chart/infra-server/requirements.lock

+9
@@ -0,0 +1,9 @@
+dependencies:
+- name: argo-workflows
+  repository: https://argoproj.github.io/argo-helm
+  version: 0.45.9
+- name: kube-prometheus
+  repository: https://charts.bitnami.com/bitnami
+  version: 11.1.1
+digest: sha256:46322f064751933585c0985dad77996ed8ee216df0fed19ea9c40f7935f67ae7
+generated: "2025-03-05T10:30:52.962223+01:00"
@@ -1,45 +1,9 @@
 ---
-
-apiVersion: v1
-kind: ConfigMap
-
-metadata:
-  name: argo-workflows-workflow-controller-configmap
-  namespace: argo
-
-data:
-  config: |
-    artifactRepository:
-      archiveLogs: true
-      gcs:
-        bucket: rhacs-infra-artifacts
-        serviceAccountKeySecret:
-          name: gcs-credentials
-          key: credentials.json
-
-    # Default values that will apply to all Workflows from this controller, unless overridden on the Workflow-level
-    workflowDefaults:
-      metadata:
-        annotations:
-          argo: workflows
-      spec:
-        ttlStrategy:
-          # Keep the workflow pods & logs available for 30 days
-          secondsAfterCompletion: 2592000
-          secondsAfterSuccess: 2592000
-          secondsAfterFailure: 2592000
-
----
-
 apiVersion: v1
 kind: Secret
-
 metadata:
   name: gcs-credentials
   namespace: default
-
 data:
   credentials.json: |-
     {{ required ".Values.google_credentials_json is undefined" .Values.google_credentials_json }}
-
----

‎chart/infra-server/templates/monitoring/servicemonitors.yaml

+3
@@ -7,6 +7,9 @@ metadata:
 spec:
   endpoints:
     - port: metrics
+      scheme: https
+      tlsConfig:
+        insecureSkipVerify: true
   selector:
     matchLabels:
       app: workflow-controller

‎scripts/add-PR-comment-for-deploy-to-dev.sh

+1-1
@@ -51,7 +51,7 @@ bin/infractl -k -e localhost:8443 whoami
 :rocket: If you only modify configuration (chart/infra-server/configuration) or templates (chart/infra-server/{static,templates}), you can get a faster update with:

 \`\`\`
-make install-local
+make helm-deploy
 \`\`\`

 ### Logs

‎scripts/deploy/helm.sh

+6
@@ -34,6 +34,8 @@ template() {
         --create-namespace \
         --dry-run \
         --namespace "${RELEASE_NAMESPACE}" \
+        --values chart/infra-server/argo-values.yaml \
+        --values chart/infra-server/monitoring-values.yaml \
         --set tag="${TAG}" \
         --set environment="${ENVIRONMENT}" \
         --set testMode="${TEST_MODE}" \
@@ -57,6 +59,8 @@ deploy() {
         --timeout 5m \
         --wait \
         --namespace "${RELEASE_NAMESPACE}" \
+        --values chart/infra-server/argo-values.yaml \
+        --values chart/infra-server/monitoring-values.yaml \
         --set tag="${TAG}" \
         --set environment="${ENVIRONMENT}" \
         --set testMode="${TEST_MODE}" \
@@ -80,6 +84,8 @@ diff() {
         --create-namespace \
         --dry-run \
         --namespace "${RELEASE_NAMESPACE}" \
+        --values chart/infra-server/argo-values.yaml \
+        --values chart/infra-server/monitoring-values.yaml \
         --set tag="${TAG}" \
         --set environment="${ENVIRONMENT}" \
         --set testMode="${TEST_MODE}" \
