DatadogDashboard CRD not working as expected #1556

Open
miguel-cardoso-mindera opened this issue Dec 11, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@miguel-cardoso-mindera

I'm trying to deploy dashboards as k8s manifests using the Datadog operator and the DatadogDashboard CRD. However, the operator pod logs errors and I end up with duplicate dashboards in Datadog.

As an example, I have a dashboard manually created in Datadog called WTX.

I copy the k8s manifest and create the DatadogDashboard custom resource:

apiVersion: datadoghq.com/v1alpha1
kind: DatadogDashboard
metadata:
  name: racingservicewtx
spec:
  title: Racing - Service WTX (cloned)
  description: "[[suggested_dashboards]]"
  widgets: '[{"id":4008919312082391,"definition":{"title":"Racing -
    Kubernetes (cloned)","show_title":true,"powerpack_id":"df9b2cac-b181-11ef-ba6f-da7ad0900002","template_variables":{"controlled_externally":[{"name":"env","prefix":"env","values":["qa"]},{"name":"product","prefix":"product","values":["tvg"]},{"name":"service","prefix":"kube_service","values":["*"]},{"name":"pod","prefix":"pod_name","values":["*"]}],"controlled_by_powerpack":[]},"type":"powerpack"},"layout":{"x":0,"y":0,"width":12,"height":3}},{"id":7597434119089317,"definition":{"title":"Racing
    - JVM
    metrics","show_title":true,"powerpack_id":"b730061c-ae60-11ef-b133-da7ad0900002","template_variables":{"controlled_externally":[{"name":"env","prefix":"env","values":["qa"]},{"name":"product","prefix":"product","values":["tvg"]},{"name":"service","prefix":"service","values":["*"]},{"name":"pod","prefix":"pod_name","values":["*"]}],"controlled_by_powerpack":[]},"type":"powerpack"},"layout":{"x":0,"y":3,"width":12,"height":7}},{"id":2160931084822284,"definition":{"title":"Racing
    - Micrometer
    HTTP","show_title":true,"powerpack_id":"29bda5c2-ae61-11ef-9a5b-da7ad0900002","template_variables":{"controlled_externally":[{"name":"env","prefix":"env","values":["qa"]},{"name":"product","prefix":"product","values":["tvg"]},{"name":"service","prefix":"service","values":["*"]},{"name":"pod","prefix":"pod_name","values":["*"]},{"name":"uri","prefix":"uri","values":["*"]}],"controlled_by_powerpack":[]},"type":"powerpack"},"layout":{"x":0,"y":10,"width":12,"height":5}},{"id":6929931258377365,"definition":{"title":"Business
    Model","show_title":true,"type":"group","layout_type":"ordered","widgets":[{"id":5243852364613936,"definition":{"title":"Consumer
    Delay
    p95","title_size":"16","title_align":"left","show_legend":true,"legend_layout":"auto","legend_columns":["avg","min","max","value","sum"],"type":"timeseries","requests":[{"formulas":[{"formula":"query1"}],"queries":[{"name":"query1","data_source":"metrics","query":"p95:racing.tvg_stream_consumer_delay_seconds{$env,$product,$service,$pod}
    by
    {topic}"}],"response_format":"timeseries","style":{"palette":"dog_classic","order_by":"values","line_type":"solid","line_width":"normal"},"display_type":"line"}]},"layout":{"x":0,"y":0,"width":4,"height":2}},{"id":2224824266258060,"definition":{"title":"Consumer
    Delay","title_size":"16","title_align":"left","show_legend":true,"legend_layout":"auto","legend_columns":["avg","min","max","value","sum"],"type":"timeseries","requests":[{"formulas":[{"formula":"query1
    /
    query2","number_format":{"unit":{"type":"canonical_unit","unit_name":"second"}}}],"queries":[{"name":"query1","data_source":"metrics","query":"sum:racing.tvg_stream_consumer_delay_seconds.sum{$env,$product,$service,$pod}
    by
    {topic}.as_rate()"},{"name":"query2","data_source":"metrics","query":"sum:racing.tvg_stream_consumer_delay_seconds.count{$env,$product,$service,$pod}
    by
    {topic}.as_rate()"}],"response_format":"timeseries","style":{"palette":"dog_classic","order_by":"values","line_type":"solid","line_width":"normal"},"display_type":"line"}]},"layout":{"x":4,"y":0,"width":4,"height":2}},{"id":6608659914707699,"definition":{"title":"Total
    wagers amount placed by
    tote","title_size":"16","title_align":"left","show_legend":true,"legend_layout":"auto","legend_columns":["avg","min","max","value","sum"],"type":"timeseries","requests":[{"formulas":[{"formula":"clamp_min(query1
    - query2,
    0)","number_format":{"unit":{"type":"canonical_unit","unit_name":"dollar"}}}],"queries":[{"name":"query1","data_source":"metrics","query":"sum:racing.WagersAmountByTote.count{$env,$product,$service,$pod}
    by
    {tote}.as_count()"},{"name":"query2","data_source":"metrics","query":"sum:racing.WagersAmountByTote.count{$env,$product,$service,$pod}
    by {tote}.as_count().rollup(sum,
    30).fill(0)"}],"response_format":"timeseries","style":{"palette":"warm","order_by":"values","order_reverse":false,"color_order":"monotonic","line_type":"solid","line_width":"normal"},"display_type":"bars"}]},"layout":{"x":8,"y":0,"width":4,"height":2}},{"id":6748881229470156,"definition":{"title":"DB
    Method Execution
    Times","title_size":"16","title_align":"left","show_legend":true,"legend_layout":"auto","legend_columns":["avg","min","max","value","sum"],"type":"timeseries","requests":[{"formulas":[{"formula":"query1
    /
    query2","number_format":{"unit":{"type":"canonical_unit","unit_name":"second"}}}],"queries":[{"name":"query1","data_source":"metrics","query":"sum:racing.tvg_database_calls_seconds.sum{$env,$product,$service,$pod}
    by
    {method}.as_rate()"},{"name":"query2","data_source":"metrics","query":"sum:racing.tvg_database_calls_seconds.count{$env,$product,$service,$pod}
    by
    {method}.as_rate()"}],"response_format":"timeseries","style":{"palette":"dog_classic","order_by":"values","line_type":"solid","line_width":"normal"},"display_type":"line"}]},"layout":{"x":0,"y":2,"width":4,"height":2}},{"id":7793125210108627,"definition":{"title":"Mappers
    Time","title_size":"16","title_align":"left","show_legend":true,"legend_layout":"auto","legend_columns":["avg","min","max","value","sum"],"type":"timeseries","requests":[{"formulas":[{"number_format":{"unit":{"type":"canonical_unit","unit_name":"second"}},"formula":"query1
    /
    query2"}],"queries":[{"name":"query1","data_source":"metrics","query":"sum:racing.tvg_stream_mapper_seconds.sum{$env,$product,$service,$pod}
    by
    {name}.as_rate()"},{"name":"query2","data_source":"metrics","query":"sum:racing.tvg_stream_mapper_seconds.count{$env,$product,$service,$pod}
    by
    {name}.as_rate()"}],"response_format":"timeseries","style":{"palette":"dog_classic","order_by":"values","line_type":"solid","line_width":"normal"},"display_type":"line"}]},"layout":{"x":4,"y":2,"width":4,"height":2}},{"id":3540650562238363,"definition":{"title":"Wagers
    amount by wager
    type","title_size":"16","title_align":"left","show_legend":true,"legend_layout":"auto","legend_columns":["avg","min","max","value","sum"],"type":"timeseries","requests":[{"formulas":[{"number_format":{"unit":{"type":"canonical_unit","unit_name":"dollar"}},"formula":"clamp_min(query1
    - query2,
    0)"}],"queries":[{"name":"query1","data_source":"metrics","query":"sum:racing.WagersAmountByWagerType.count{$env,$product,$service,$pod}
    by
    {wagertype}.as_count()"},{"name":"query2","data_source":"metrics","query":"sum:racing.WagersAmountByWagerType.count{$env,$product,$service,$pod}
    by {wagertype}.as_count().fill(0).rollup(sum,
    30)"}],"response_format":"timeseries","style":{"palette":"warm","order_by":"values","line_type":"solid","line_width":"normal"},"display_type":"bars"}]},"layout":{"x":8,"y":2,"width":4,"height":2}},{"id":6230023390746062,"definition":{"title":"Cancel
    wagers by
    tote","title_size":"16","title_align":"left","show_legend":true,"legend_layout":"auto","legend_columns":["avg","min","max","value","sum"],"type":"timeseries","requests":[{"formulas":[{"formula":"query1"}],"queries":[{"name":"query1","data_source":"metrics","query":"sum:racing.WagersCanceled.count{$env,$product,$service,$pod}
    by
    {tote}.as_rate()"}],"response_format":"timeseries","style":{"palette":"dog_classic","order_by":"values","line_type":"solid","line_width":"normal"},"display_type":"bars"}]},"layout":{"x":0,"y":4,"width":4,"height":2}},{"id":8753148913537012,"definition":{"title":"Amount
    wagered by
    Brand","title_size":"16","title_align":"left","requests":[{"formulas":[{"formula":"query1","number_format":{"unit":{"type":"canonical_unit","unit_name":"dollar"}}}],"queries":[{"name":"query1","data_source":"metrics","query":"sum:racing.WagersAmountByProductBrand.count{$env,$product,$service,$pod}
    by
    {brand}.as_count()","aggregator":"sum"}],"response_format":"scalar","sort":{"count":500,"order_by":[{"type":"formula","index":0,"order":"desc"}]},"style":{"palette":"datadog16"}}],"type":"sunburst","legend":{"type":"automatic"}},"layout":{"x":4,"y":4,"width":4,"height":2}},{"id":1568955280607725,"definition":{"title":"Wagers
    amount by
    context","title_size":"16","title_align":"left","show_legend":true,"legend_layout":"auto","legend_columns":["avg","min","max","value","sum"],"type":"timeseries","requests":[{"formulas":[{"formula":"clamp_min(query1
    - query2,
    0)","number_format":{"unit":{"type":"canonical_unit","unit_name":"dollar"}}}],"queries":[{"name":"query1","data_source":"metrics","query":"sum:racing.WagersAmountByProductBrand.count{$env,$product,$service,$pod}
    by
    {product,brand}.as_count()"},{"name":"query2","data_source":"metrics","query":"sum:racing.WagersAmountByProductBrand.count{$env,$product,$service,$pod}
    by {product,brand}.as_count().rollup(sum,
    30).fill(0)"}],"response_format":"timeseries","style":{"palette":"warm","order_by":"values","line_type":"solid","line_width":"normal"},"display_type":"bars"}]},"layout":{"x":8,"y":4,"width":4,"height":2}},{"id":434031825406107,"definition":{"title":"Amount
    wagered by
    Track","title_size":"16","title_align":"left","requests":[{"formulas":[{"formula":"query1","number_format":{"unit":{"type":"canonical_unit","unit_name":"dollar"}}}],"queries":[{"name":"query1","data_source":"metrics","query":"sum:racing.WagersAmountByTrackRace.count{$env,$product,$service,$pod}
    by
    {wagertrack}.as_count()","aggregator":"sum"}],"response_format":"scalar","sort":{"count":500,"order_by":[{"type":"formula","index":0,"order":"desc"}]},"style":{"palette":"datadog16"}}],"type":"sunburst","legend":{"type":"automatic"}},"layout":{"x":4,"y":6,"width":4,"height":2}},{"id":2439303677267511,"definition":{"title":"Wagers
    placed by
    track","title_size":"16","title_align":"left","show_legend":true,"legend_layout":"auto","legend_columns":["avg","min","max","value","sum"],"type":"timeseries","requests":[{"formulas":[{"number_format":{"unit":{"type":"canonical_unit","unit_name":"dollar"}},"formula":"clamp_min(query1
    - query2,
    0)"}],"queries":[{"name":"query1","data_source":"metrics","query":"sum:racing.WagersAmountByTrackRace.count{$env,$product,$service,$pod}
    by
    {wagertrack,race}.as_count()"},{"name":"query2","data_source":"metrics","query":"sum:racing.WagersAmountByTrackRace.count{$env,$product,$service,$pod}
    by {wagertrack,race}.as_count().rollup(sum,
    30).fill(0)"}],"response_format":"timeseries","style":{"palette":"dog_classic","order_by":"values","line_type":"solid","line_width":"normal"},"display_type":"bars"}]},"layout":{"x":8,"y":6,"width":4,"height":2}}]},"layout":{"x":0,"y":15,"width":12,"height":9,"is_column_break":true}}]'
  templateVariables:
    - name: env
      prefix: env
      availableValues:
        - qa
        - staging
        - production
      defaults:
        - qa
    - name: product
      prefix: product
      availableValues:
        - tvg
      defaults:
        - tvg
    - name: service
      prefix: kube_service
      availableValues:
        - service-wtx
      defaults:
        - service-wtx
    - name: pod
      prefix: pod_name
      availableValues: []
      defaults:
        - "*"
    - name: uri
      prefix: uri
      availableValues: []
      defaults:
        - "*"
  layoutType: ordered
  notifyList: []
  templateVariablePresets:
    - name: QA
      templateVariables:
        - name: product
          values:
            - tvg
        - name: application
          values:
            - wtx
        - name: env
          values:
            - qa
  reflowType: fixed
  tags:
    - team:racing

Then in Datadog, instead of only one cloned dashboard, there are two.

And the operator shows these logs:

{"level":"INFO","ts":"2024-12-11T12:10:40.847Z","logger":"setup","msg":"Version: v1.10.0"}                                                                                                                         
{"level":"INFO","ts":"2024-12-11T12:10:40.847Z","logger":"setup","msg":"Build time: 2024-11-08/16:43:46"}                                                                                                          
{"level":"INFO","ts":"2024-12-11T12:10:40.847Z","logger":"setup","msg":"Git Commit: 2bbda7adace27de3d397b3d76d87fbd49fa304e3"}                                                                                     
{"level":"INFO","ts":"2024-12-11T12:10:40.847Z","logger":"setup","msg":"Go Version: go1.22.7"}                                                                                                                     
{"level":"INFO","ts":"2024-12-11T12:10:40.847Z","logger":"setup","msg":"Go OS/Arch: linux/amd64"}                                                                                                                  
{"level":"INFO","ts":"2024-12-11T12:10:40.847Z","logger":"setup","msg":"CRD-specific namespaces environmental variable DD_MONITOR_WATCH_NAMESPACE not set, will be using common config"}                           
{"level":"INFO","ts":"2024-12-11T12:10:40.847Z","logger":"setup","msg":"DatadogMonitor Enabled","watching namespaces":["datadog"]}                                                                                 
{"level":"INFO","ts":"2024-12-11T12:10:40.847Z","logger":"setup","msg":"CRD-specific namespaces environmental variable DD_AGENT_WATCH_NAMESPACE not set, will be using common config"}                             
{"level":"INFO","ts":"2024-12-11T12:10:40.859Z","logger":"setup","msg":"configuring manager health check","maximumGoroutines":400}                                                                                 
{"level":"INFO","ts":"2024-12-11T12:10:41.910Z","logger":"klog","msg":"Waited for 1.046145252s due to client-side throttling, not priority and fairness, request: GET:https://10.223.0.1:443/apis/networking.k8s.io/v1?timeout=32s\n"}                                                                                                                                                                                               
{"level":"INFO","ts":"2024-12-11T12:10:42.662Z","logger":"setup","msg":"Feature disabled, not starting the controller","controller":"DatadogSLO"}                                                                  
{"level":"INFO","ts":"2024-12-11T12:10:42.662Z","logger":"setup","msg":"Feature disabled, not starting the controller","controller":"DatadogAgentProfile"}                                                         
{"level":"INFO","ts":"2024-12-11T12:10:42.662Z","logger":"setup","msg":"Feature disabled, not starting the controller","controller":"DatadogAgent"}                                                                
{"level":"INFO","ts":"2024-12-11T12:10:42.662Z","logger":"setup","msg":"starting manager"}                                                                                                                         
{"level":"INFO","ts":"2024-12-11T12:10:42.662Z","logger":"controller-runtime.metrics","msg":"Starting metrics server"}                                                                                             
{"level":"INFO","ts":"2024-12-11T12:10:42.662Z","msg":"starting server","kind":"health probe","addr":"[::]:8081"}                                                                                                  
{"level":"INFO","ts":"2024-12-11T12:10:42.662Z","logger":"controller-runtime.metrics","msg":"Serving metrics server","bindAddress":":8383","secure":false}                                                         
{"level":"INFO","ts":"2024-12-11T12:10:42.663Z","logger":"klog","msg":"attempting to acquire leader lease datadog/datadog-operator-lock...\n"}                                                                     
{"level":"INFO","ts":"2024-12-11T12:11:44.811Z","logger":"klog","msg":"successfully acquired lease datadog/datadog-operator-lock\n"}                                                                               
{"level":"DEBUG","ts":"2024-12-11T12:11:44.811Z","logger":"events","msg":"microservices-qa-new-datadog-operator-85f48db4f4-sjnsv_485f30ae-36b0-4e2f-a6cc-edc753d39bda became leader","type":"Normal","object":{"kind":"Lease","namespace":"datadog","name":"datadog-operator-lock","uid":"28658b4c-75fd-4f22-8cb2-d0d032b9e639","apiVersion":"coordination.k8s.io/v1","resourceVersion":"1016904954"},"reason":"LeaderElection"}     
{"level":"INFO","ts":"2024-12-11T12:11:44.811Z","msg":"Starting EventSource","controller":"datadogdashboard","controllerGroup":"datadoghq.com","controllerKind":"DatadogDashboard","source":"kind source: *v1alpha1.DatadogDashboard"}                                                                                                                                                                                               
{"level":"INFO","ts":"2024-12-11T12:11:44.812Z","msg":"Starting Controller","controller":"datadogdashboard","controllerGroup":"datadoghq.com","controllerKind":"DatadogDashboard"}                                 
{"level":"INFO","ts":"2024-12-11T12:11:44.811Z","msg":"Starting EventSource","controller":"datadogmonitor","controllerGroup":"datadoghq.com","controllerKind":"DatadogMonitor","source":"kind source: *v1alpha1.DatadogMonitor"}                                                                                                                                                                                                     
{"level":"INFO","ts":"2024-12-11T12:11:44.812Z","msg":"Starting Controller","controller":"datadogmonitor","controllerGroup":"datadoghq.com","controllerKind":"DatadogMonitor"}                                     
{"level":"INFO","ts":"2024-12-11T12:11:44.912Z","msg":"Starting workers","controller":"datadogmonitor","controllerGroup":"datadoghq.com","controllerKind":"DatadogMonitor","worker count":1}                       
{"level":"INFO","ts":"2024-12-11T12:11:44.912Z","msg":"Starting workers","controller":"datadogdashboard","controllerGroup":"datadoghq.com","controllerKind":"DatadogDashboard","worker count":1}                   
{"level":"INFO","ts":"2024-12-11T12:15:12.840Z","logger":"controllers.DatadogDashboard","msg":"Reconciling Datadog Dashboard","datadogdashboard":{"name":"wtx","namespace":"datadog"}}                             
{"level":"INFO","ts":"2024-12-11T12:15:12.840Z","logger":"controllers.DatadogDashboard","msg":"Adding Finalizer for the DatadogDashboard","datadogdashboard":{"name":"wtx","namespace":"datadog"}}                 
{"level":"INFO","ts":"2024-12-11T12:15:12.859Z","logger":"controllers.DatadogDashboard","msg":"Reconciling Datadog Dashboard","datadogdashboard":{"name":"wtx","namespace":"datadog"}}                             
{"level":"DEBUG","ts":"2024-12-11T12:15:12.859Z","logger":"controllers.DatadogDashboard","msg":"Dashboard ID is not set; creating Dashboard in Datadog","datadogdashboard":{"name":"wtx","namespace":"datadog"}}   
{"level":"INFO","ts":"2024-12-11T12:15:13.679Z","logger":"controllers.DatadogDashboard","msg":"created a new DatadogDashboard","datadogdashboard":{"name":"wtx","namespace":"datadog"},"dashboard ID":"95n-qpz-bah"}                                                                                                                                                                                                                 
{"level":"DEBUG","ts":"2024-12-11T12:15:13.680Z","logger":"events","msg":"datadog/wtx","type":"Normal","object":{"kind":"DatadogDashboard","namespace":"datadog","name":"wtx","uid":"e6cd96ac-3e7b-40e0-9c0c-52455d66bb4a","apiVersion":"datadoghq.com/v1alpha1","resourceVersion":"1016909282"},"reason":"Create DatadogDashboard"}                                                                                                 
{"level":"INFO","ts":"2024-12-11T12:15:13.695Z","logger":"controllers.DatadogDashboard","msg":"Reconciling Datadog Dashboard","datadogdashboard":{"name":"wtx","namespace":"datadog"}}                             
{"level":"DEBUG","ts":"2024-12-11T12:15:13.696Z","logger":"controllers.DatadogDashboard","msg":"Dashboard ID is not set; creating Dashboard in Datadog","datadogdashboard":{"name":"wtx","namespace":"datadog"}}   
{"level":"INFO","ts":"2024-12-11T12:15:14.263Z","logger":"controllers.DatadogDashboard","msg":"created a new DatadogDashboard","datadogdashboard":{"name":"wtx","namespace":"datadog"},"dashboard ID":"7k7-5nx-2c7"}                                                                                                                                                                                                                 
{"level":"DEBUG","ts":"2024-12-11T12:15:14.263Z","logger":"events","msg":"datadog/wtx","type":"Normal","object":{"kind":"DatadogDashboard","namespace":"datadog","name":"wtx","uid":"e6cd96ac-3e7b-40e0-9c0c-52455d66bb4a","apiVersion":"datadoghq.com/v1alpha1","resourceVersion":"1016909282"},"reason":"Create DatadogDashboard"}                                                                                                 
{"level":"ERROR","ts":"2024-12-11T12:15:14.276Z","logger":"controllers.DatadogDashboard","msg":"unable to update DatadogDashboard status due to update conflict","datadogdashboard":{"name":"wtx","namespace":"datadog"},"error":"Operation cannot be fulfilled on datadogdashboards.datadoghq.com \"wtx\": the object has been modified; please apply your changes to the latest version and try again"}                            
{"level":"INFO","ts":"2024-12-11T12:15:19.276Z","logger":"controllers.DatadogDashboard","msg":"Reconciling Datadog Dashboard","datadogdashboard":{"name":"wtx","namespace":"datadog"}}
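
The duplicate creation is visible in the logs above: two "created a new DatadogDashboard" entries for the same resource, each with a different dashboard ID. A minimal Python sketch for spotting this pattern, assuming the operator logs are available as JSON lines in the format shown above (the function name `created_dashboard_ids` is hypothetical and not part of the operator):

```python
import json
from collections import defaultdict


def created_dashboard_ids(log_lines):
    """Group dashboard IDs from creation log entries by namespace/name."""
    created = defaultdict(list)
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log line
        if not isinstance(entry, dict):
            continue
        if entry.get("msg") != "created a new DatadogDashboard":
            continue
        ref = entry.get("datadogdashboard", {})
        key = f'{ref.get("namespace")}/{ref.get("name")}'
        created[key].append(entry.get("dashboard ID"))
    return dict(created)


# Two creation entries copied from the operator logs above.
sample_logs = [
    '{"level":"INFO","ts":"2024-12-11T12:15:13.679Z","logger":"controllers.DatadogDashboard","msg":"created a new DatadogDashboard","datadogdashboard":{"name":"wtx","namespace":"datadog"},"dashboard ID":"95n-qpz-bah"}',
    '{"level":"INFO","ts":"2024-12-11T12:15:14.263Z","logger":"controllers.DatadogDashboard","msg":"created a new DatadogDashboard","datadogdashboard":{"name":"wtx","namespace":"datadog"},"dashboard ID":"7k7-5nx-2c7"}',
]

ids_by_resource = created_dashboard_ids(sample_logs)
# Any resource with more than one ID was created more than once.
duplicates = {k: v for k, v in ids_by_resource.items() if len(v) > 1}
print(duplicates)
```

Any resource listed in `duplicates` has more than one dashboard in Datadog, and the extra ones would need to be deleted by hand.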

I'm using version 1.10.0 of the operator, deployed with Helm using these values:

apiKeyExistingSecret: datadog-setup-secrets
appKeyExistingSecret: datadog-setup-secrets

replicaCount: 1
logLevel: "debug"
image:
  tag: 1.10.0

supportExtendedDaemonset: "false"
operatorMetricsEnabled: "true"

introspection:
  enabled: false

datadogAgentProfile:
  enabled: false

datadogAgent:
  enabled: false

datadogDashboard:
  enabled: true

datadogMonitor:
  enabled: true
  
datadogSLO:
  enabled: false

remoteConfiguration:
  enabled: false


installCRDs: true

datadogCRDs:
  crds:
    datadogAgents: false
    datadogMetrics: false
    datadogPodAutoscalers: false
    datadogMonitors: true
    datadogDashboards: true
    datadogSLOs: false

collectOperatorMetrics: false

clusterRole:
  allowReadAllResources: true
  allowCreatePodsExec: false
@tbavelier
Member

Hello @miguel-cardoso-mindera ,

Thank you for the report. We were able to reproduce internally and will be working on a fix/further identifying the root cause.

In the meantime, could you please try removing the empty list notifyList: [] from the YAML and let us know how it goes? A single dashboard should be created instead of two. I believe the same issue might occur if templateVariables is set to []. That is not the case for this dashboard, but it is something to keep in mind for other dashboards.

@tbavelier added the bug label Dec 11, 2024
@miguel-cardoso-mindera
Author

Hey @tbavelier thanks for taking a look!

I can confirm removing notifyList: [] from the DatadogDashboard CRD fixes the duplicate issue and the error logs disappear.
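
For other dashboards exported the same way, the workaround could be applied before the manifest reaches the cluster. A minimal Python sketch, assuming the manifest has already been loaded into a dict (for example with a YAML parser); the helper name `drop_empty_spec_lists` is hypothetical. It only removes empty lists directly under spec, such as notifyList: [] or templateVariables: [], and leaves nested empty lists like availableValues: [] untouched, since the thread only reports the top-level fields as problematic:

```python
import copy


def drop_empty_spec_lists(manifest):
    """Return a copy of the manifest without empty top-level lists in spec."""
    cleaned = copy.deepcopy(manifest)
    spec = cleaned.get("spec", {})
    # Collect keys first so the dict is not mutated while iterating.
    empty_keys = [k for k, v in spec.items() if isinstance(v, list) and not v]
    for key in empty_keys:
        del spec[key]
    return cleaned


# Trimmed-down version of the manifest from this issue.
manifest = {
    "apiVersion": "datadoghq.com/v1alpha1",
    "kind": "DatadogDashboard",
    "metadata": {"name": "racingservicewtx"},
    "spec": {
        "title": "Racing - Service WTX (cloned)",
        "layoutType": "ordered",
        "notifyList": [],
        "tags": ["team:racing"],
        "templateVariables": [
            {
                "name": "pod",
                "prefix": "pod_name",
                "availableValues": [],
                "defaults": ["*"],
            }
        ],
    },
}

cleaned = drop_empty_spec_lists(manifest)
print(sorted(cleaned["spec"].keys()))
```

This is a stopgap for the behavior reported here on operator v1.10.0, not a substitute for the fix the maintainers mention.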
