@djoshy djoshy commented Sep 17, 2025

- What I did

This PR dynamically updates the log levels of the MCO operands in response to changes to the operatorLogLevel field in the MachineConfiguration object. I used the loglevel helpers provided by library-go to dynamically change the operator pod's verbosity level. For the remaining operands, I added a new template variable that is rendered into the daemon, server and controller manifests.
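As a rough illustration of the template-variable approach described above, the following Go sketch renders a verbosity value into a container argument list with text/template. The renderConfig struct, its LogLevel field name, and the manifest fragment are simplified assumptions made for this example; they are not copied from the MCO source.

```go
package main

import (
	"fmt"
	"os"
	"strings"
	"text/template"
)

// renderConfig is a hypothetical stand-in for the MCO render configuration.
// Only the field needed for this illustration is included.
type renderConfig struct {
	LogLevel string
}

// argsTemplate is an assumed manifest fragment, not the actual MCO template.
const argsTemplate = `args:
- "start"
- "--v={{.LogLevel}}"
`

// renderArgs executes the template against cfg and returns the rendered text.
func renderArgs(cfg renderConfig) (string, error) {
	tmpl, err := template.New("args").Parse(argsTemplate)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, cfg); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := renderArgs(renderConfig{LogLevel: "4"})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Print(out)
}
```

Because the value is baked into the rendered manifest, a change to it alters the pod template, which would explain why the operand pods restart when the log level changes.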

- How to verify it

  • Bring up a cluster with this PR.
  • Edit the MachineConfiguration object with:
$ oc edit MachineConfiguration cluster
  • The spec.operatorLogLevel field defaults to Normal. Other possible values are Debug, Trace and TraceAll. Set it to any other value, and observe the operator logs. It should note the log level change like so:
I0917 19:44:48.200522       1 operator.go:489] Log level changed from 2 to 6
  • You should also see the MCC, MCD and MCS pods restart. Check the manifests via oc (assuming jq is installed). This step might take a moment, depending on the operator sync loop.
$ oc get daemonset machine-config-server -o json | jq '.spec.template.spec.containers[] | select(.name == "machine-config-server") | .args'
[
  "start",
  "--apiserver-url=https://api-int.djoshy-dev-101.gcp.devcluster.openshift.com:6443",
  "--payload-version=4.21.0-0.nightly-2025-09-15-042343",
  "--tls-cipher-suites=TLS_AES_128_GCM_SHA256,TLS_AES_256_GCM_SHA384,TLS_CHACHA20_POLY1305_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256",
  "--tls-min-version=VersionTLS12",
  "--v=4"
]


$ oc get daemonset machine-config-daemon -o json | jq '.spec.template.spec.containers[] | select(.name == "machine-config-daemon") | .args'
[
  "start",
  "--payload-version=4.21.0-0.nightly-2025-09-15-042343",
  "--v=4"
]


$ oc get deployment machine-config-controller -o json | jq '.spec.template.spec.containers[] | select(.name == "machine-config-controller") | .args'
[
  "start",
  "--resourcelock-namespace=openshift-machine-config-operator",
  "--v=4",
  "--payload-version=4.21.0-0.nightly-2025-09-15-042343"
]

Each of these manifests should have a verbosity level matching the one you set in the MachineConfiguration object. The example above uses Debug, which corresponds to 4.
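The relationship between operatorLogLevel values and the numeric --v flag can be sketched as a small lookup. The function below is a local illustration, not the library-go helper itself. The numbers are inferred from this thread: Debug is stated to be 4, and the operator log lines posted during verification suggest Normal is 2, Trace is 6 and TraceAll is 8, with unknown values such as an empty string falling back to 2.

```go
package main

import "fmt"

// logLevelToVerbosity is a hypothetical local helper that mirrors the mapping
// observed in this PR discussion. Unrecognized values, including the empty
// string, fall back to 2, the verbosity of Normal.
func logLevelToVerbosity(level string) int {
	switch level {
	case "Normal":
		return 2
	case "Debug":
		return 4
	case "Trace":
		return 6
	case "TraceAll":
		return 8
	default:
		return 2
	}
}

func main() {
	for _, level := range []string{"Normal", "Debug", "Trace", "TraceAll", ""} {
		fmt.Printf("%q -> --v=%d\n", level, logLevelToVerbosity(level))
	}
}
```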

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Sep 17, 2025

openshift-ci-robot commented Sep 17, 2025

@djoshy: This pull request references MCO-408 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 17, 2025

openshift-ci bot commented Sep 17, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

djoshy commented Sep 17, 2025

/test all

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 17, 2025

djoshy commented Sep 18, 2025

/test verify-deps

@djoshy djoshy marked this pull request as ready for review September 18, 2025 13:23
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 18, 2025

openshift-ci bot commented Sep 18, 2025

@djoshy: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                                    Commit   Details  Required  Rerun command
ci/prow/e2e-azure-ovn-upgrade-out-of-change  c2a2784  link     false     /test e2e-azure-ovn-upgrade-out-of-change
ci/prow/e2e-aws-mco-disruptive               c2a2784  link     false     /test e2e-aws-mco-disruptive
ci/prow/e2e-gcp-op-ocl                       c2a2784  link     false     /test e2e-gcp-op-ocl
ci/prow/bootstrap-unit                       c2a2784  link     false     /test bootstrap-unit
ci/prow/e2e-gcp-mco-disruptive               c2a2784  link     false     /test e2e-gcp-mco-disruptive

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@isabella-janssen

/retest-required

@isabella-janssen isabella-janssen left a comment

/lgtm

Changes look fair to me!

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Sep 23, 2025

@dkhater-redhat dkhater-redhat left a comment

looks good to me. thanks for this!

@pablintino pablintino left a comment

/lgtm
Agree with the change.

 }

-func getRenderConfig(tnamespace, kubeAPIServerServingCA string, ccSpec *mcfgv1.ControllerConfigSpec, imgs *ctrlcommon.RenderConfigImages, infra *configv1.Infrastructure, pointerConfigData []byte, apiServer *configv1.APIServer) *renderConfig {
+func getRenderConfig(tnamespace, kubeAPIServerServingCA string, ccSpec *mcfgv1.ControllerConfigSpec, imgs *ctrlcommon.RenderConfigImages, infra *configv1.Infrastructure, pointerConfigData []byte, apiServer *configv1.APIServer, logLevel string) *renderConfig {

nit: This function is starting to call for a builder pattern to construct renderConfig. Eight arguments in a function is a reasonable limit, but I wouldn't advocate allowing many more.
(Just my 2 cents; no action needed in this change.)

Contributor Author

Totally fair 👍
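
For reference, the builder pattern suggested in the review could look roughly like the sketch below. It is not part of this PR. The renderConfig struct is a trimmed-down, hypothetical stand-in, and the builder type, its method names, and the default log level of "2" are all assumptions made for illustration.

```go
package main

import "fmt"

// renderConfig is a simplified, hypothetical stand-in for the real struct.
type renderConfig struct {
	TargetNamespace        string
	KubeAPIServerServingCA string
	LogLevel               string
}

// renderConfigBuilder accumulates optional settings before building.
type renderConfigBuilder struct {
	cfg renderConfig
}

// newRenderConfigBuilder takes the one required value and applies an assumed
// default log level of "2".
func newRenderConfigBuilder(namespace string) *renderConfigBuilder {
	return &renderConfigBuilder{cfg: renderConfig{TargetNamespace: namespace, LogLevel: "2"}}
}

// WithServingCA sets the serving CA and returns the builder for chaining.
func (b *renderConfigBuilder) WithServingCA(ca string) *renderConfigBuilder {
	b.cfg.KubeAPIServerServingCA = ca
	return b
}

// WithLogLevel overrides the default log level and returns the builder.
func (b *renderConfigBuilder) WithLogLevel(level string) *renderConfigBuilder {
	b.cfg.LogLevel = level
	return b
}

// Build returns a copy so later builder calls do not mutate the result.
func (b *renderConfigBuilder) Build() *renderConfig {
	cfg := b.cfg
	return &cfg
}

func main() {
	cfg := newRenderConfigBuilder("openshift-machine-config-operator").
		WithLogLevel("4").
		Build()
	fmt.Printf("namespace=%s logLevel=%s\n", cfg.TargetNamespace, cfg.LogLevel)
}
```

With this shape, adding a new optional input means adding one method rather than extending every call site's argument list.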

openshift-ci bot commented Sep 23, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: djoshy, dkhater-redhat, isabella-janssen, pablintino

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [djoshy,dkhater-redhat,isabella-janssen,pablintino]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ptalgulk01

Pre-merge verified:
Verified using IPI based AWS cluster.

  • Edit the MachineConfiguration object and patch each of the following values into the spec.operatorLogLevel field: Debug, Trace and TraceAll
$ oc patch machineconfiguration cluster --type=merge -p '{"spec":{"operatorLogLevel":"Trace"}}'

$ oc get machineconfiguration
...
  spec:
    logLevel: Normal
    managementState: Managed
    operatorLogLevel: Trace
  • The MCC, MCS and MCD pods restarted:
$ oc get po
NAME                                                             READY   STATUS    RESTARTS        AGE
..
machine-config-controller-d559cbf65-bkswm                        2/2     Running   0               57s
machine-config-daemon-csgbr                                      2/2     Running   0               58s
machine-config-daemon-l9ccv                                      2/2     Running   0               60s
machine-config-daemon-qvgtt                                      2/2     Running   0               62s
machine-config-daemon-rw85b                                      2/2     Running   0               67s
machine-config-daemon-t6tzp                                      2/2     Running   0               64s
machine-config-daemon-tlcb7                                      2/2     Running   0               68s
machine-config-operator-59b9c7b8df-f6wwf                         2/2     Running   1 (3h47m ago)   3h55m
machine-config-server-5b4qt                                      1/1     Running   0               53s
machine-config-server-jknn5                                      1/1     Running   0               52s
machine-config-server-t7vtw                                      1/1     Running   0               51s
$ oc logs machine-config-operator-59b9c7b8df-f6wwf | grep -i level
Defaulted container "machine-config-operator" out of: machine-config-operator, kube-rbac-proxy
I0924 09:19:57.724751       1 operator.go:489] Log level changed from 2 to 6
  • The same log level value appears in the operand manifests:
$ oc get deployment machine-config-controller -o json | jq '.spec.template.spec.containers[] | select(.name == "machine-config-controller") | .args'
[
  "start",
  "--resourcelock-namespace=openshift-machine-config-operator",
  "--v=6",
  "--payload-version=4.21.0-0-2025-09-24-050658-test-ci-ln-15bhnsb-latest"
]

$ oc get daemonset machine-config-daemon -o json | jq '.spec.template.spec.containers[] | select(.name == "machine-config-daemon") | .args'
[
  "start",
  "--payload-version=4.21.0-0-2025-09-24-050658-test-ci-ln-15bhnsb-latest",
  "--v=6"
]

$ oc get daemonset machine-config-server -o json | jq '.spec.template.spec.containers[] | select(.name == "machine-config-server") | .args'
[
  "start",
  "--apiserver-url=https://api-int.david-2409a.qe.devcluster.openshift.com:6443",
  "--payload-version=4.21.0-0-2025-09-24-050658-test-ci-ln-15bhnsb-latest",
  "--tls-cipher-suites=TLS_AES_128_GCM_SHA256,TLS_AES_256_GCM_SHA384,TLS_CHACHA20_POLY1305_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256",
  "--tls-min-version=VersionTLS12",
  "--v=6"
]

My only question: when I patch an empty string into operatorLogLevel in the MachineConfiguration object, the log level moves to 2 (the default), but the field in the MachineConfiguration object is not set back to Normal.

  spec:
    logLevel: Normal
    managementState: Managed
    operatorLogLevel: ""
$ oc logs machine-config-operator-59b9c7b8df-f6wwf | grep -i level
Defaulted container "machine-config-operator" out of: machine-config-operator, kube-rbac-proxy
I0924 09:30:40.639900       1 operator.go:489] Log level changed from 8 to 2

Is this expected? Other than that, looks good to me.

djoshy commented Sep 30, 2025

Is this expected? Other than that, looks good to me.

This is expected; it looks like the helper defaults to 2 for unknown values (an empty string is one such case). I'm surprised an empty string is even allowed; doesn't the API default to Normal?

@ptalgulk01

I see, this is expected then. Thank you for clearing that up!

@ptalgulk01

/qe-approved

@ptalgulk01

/verified by @ptalgulk01

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Sep 30, 2025

@ptalgulk01: This PR has been marked as verified by @ptalgulk01.

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD 4c5e822 and 2 for PR HEAD c2a2784 in total

@openshift-merge-bot openshift-merge-bot bot merged commit adb087c into openshift:main Sep 30, 2025
17 of 22 checks passed
@djoshy djoshy deleted the add-op-log-level branch October 3, 2025 18:35