Bug DatadogAgent CRD status agent list bloated causing etcdserver request too large #1561

Open
ii64 opened this issue Dec 12, 2024 · 2 comments

Comments


ii64 commented Dec 12, 2024

We ran into an issue where Terraform was unable to delete a DatadogAgent object that had been in an errored state for a couple of hours. It turned out that the delete request going through the admission controller returned an error because etcdserver rejected the request as too large. Inspecting the DatadogAgent object showed a very large number of entries under .Status.AgentList.
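
For anyone who wants to check whether they are hitting the same thing, here is a rough sketch that counts the status entries and compares the object's approximate size with etcd's default request limit of 1.5 MiB (`--max-request-bytes`, which a cluster may override). The JSON path `status.agentList`, the per-entry field names, and the helper names are assumptions for illustration, not taken from the operator source.

```python
import json

# etcd's default --max-request-bytes is 1.5 MiB; a cluster may be configured differently.
ETCD_DEFAULT_MAX_REQUEST_BYTES = 1_572_864


def summarize_status_bloat(obj, limit=ETCD_DEFAULT_MAX_REQUEST_BYTES):
    """Count agent list entries and estimate the serialized size of a DatadogAgent dict."""
    # Assumption: the list printed as "Agent List" is stored at status.agentList.
    agent_list = obj.get("status", {}).get("agentList") or []
    # Compact JSON only approximates what the API server actually sends to etcd.
    approx_bytes = len(json.dumps(obj, separators=(",", ":")).encode("utf-8"))
    return {
        "agent_list_entries": len(agent_list),
        "approx_bytes": approx_bytes,
        "over_default_limit": approx_bytes > limit,
    }


def make_synthetic_agent(entries):
    """Build a fake object shaped like the describe output (field names are hypothetical)."""
    entry = {
        "available": 0,
        "current": 0,
        "desired": 0,
        "ready": 0,
        "state": "Failed",
        "status": "Failed",
        "upToDate": 0,
    }
    return {
        "apiVersion": "datadoghq.com/v2alpha1",
        "kind": "DatadogAgent",
        "status": {"agentList": [dict(entry) for _ in range(entries)]},
    }


if __name__ == "__main__":
    # Synthetic data only; the real object in this issue may differ in shape and size.
    print(summarize_status_bloat(make_synthetic_agent(20000)))
```

On a real cluster, the same function could be given the result of `json.load` on the output of `kubectl get datadogagents.datadoghq.com <name> -n <namespace> -o json`, assuming reads still succeed as `kubectl describe` did here.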

# wc -l a.log
112276 a.log

# kubectl delete -n dd-redacted-app-production datadogagents.datadoghq.com dd-redacted-app
Error from server: etcdserver: request is too large

# kubectl apply -f ./b.json
Warning: resource datadogagents/dd-redacted-app is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
Error from server: error when applying patch:
[... redacted ...]
to:
Resource: "datadoghq.com/v2alpha1, Resource=datadogagents", GroupVersionKind: "datadoghq.com/v2alpha1, Kind=DatadogAgent"
Name: "dd-redacted-app", Namespace: "dd-redacted-app-production"
for: "./b.json": error when patching "./b.json": etcdserver: request is too large
Name:         dd-redacted-app
Namespace:    dd-redacted-app-production
Labels:       <none>
Annotations:  <none>
API Version:  datadoghq.com/v2alpha1
Kind:         DatadogAgent
Metadata:
  Creation Timestamp:  2024-12-11T21:10:35Z
  Finalizers:
    finalizer.agent.datadoghq.com
  Generation:        1
  Resource Version:  1234472018
  UID:               85e7b088-f8d6-47e3-96f7-054dc7621c64
Spec:
  Global:
    Cluster Name:  redacted-app
    Credentials:
      API Secret:
        Key Name:     api-key
        Secret Name:  dd-redacted-app-secrets
    Site:             us5.datadoghq.com
Status:
  Agent:
    Available:   0
    Current:     0
    Desired:     0
    Ready:       0
    State:       Failed
    Status:      Failed (0/0/0)
    Up To Date:  0
  Agent List:
    Available:   0
    Current:     0
    Desired:     0
    Ready:       0
    State:       Failed
    Status:      Failed
    Up To Date:  0
    Available:   0
    Current:     0
    Desired:     0
    Ready:       0
    State:       Failed
    Status:      Failed
    Up To Date:  0
    Available:   0
    Current:     0
    Desired:     0
    Ready:       0
    State:       Failed
    Status:      Failed
    Up To Date:  0
    Available:   0
    Current:     0
    Desired:     0
    Ready:       0
    State:       Failed
    Status:      Failed
    Up To Date:  0
    Available:   0
    Current:     0
    Desired:     0
    Ready:       0
    State:       Failed
    Status:      Failed
    Up To Date:  0
    Available:   0
    Current:     0
    Desired:     0
    Ready:       0
    State:       Failed
    Status:      Failed
    Up To Date:  0
    Available:   0
    Current:     0
    Desired:     0
    Ready:       0
    State:       Failed
    Status:      Failed
    Up To Date:  0
    Available:   0
    Current:     0
    Desired:     0
    Ready:       0
    State:       Failed
    Status:      Failed
    Up To Date:  0
    Available:   0
    Current:     0
[... truncated ...]
Contributor

levan-m commented Dec 27, 2024

Hello @ii64,
Thanks for submitting the issue. Are you able to reproduce it?
Could you please share the DatadogAgent manifest, the Operator and chart versions, and your Helm overrides?

Author

ii64 commented Jan 4, 2025

We moved from the operator to a DaemonSet approach. The manifest is basically everything in the first comment, minus the gigantic agent status list.
Here are the operator chart version and override values:

resource "helm_release" "datadog_operator" {
  name = "datadog"
  chart = "datadog-operator"
  repository = "https://helm.datadoghq.com"
  version = "2.4.0"
  namespace = kubernetes_namespace.datadog.id

  values = [
    yamlencode({})
  ]
}
