
Opencost grafana dashboard partly not working #2

Open
andriktr opened this issue Oct 20, 2023 · 48 comments

Comments

@andriktr

Hi,
I'm trying to use the following Grafana dashboard to view OpenCost data: https://github.com/opencost/opencost-helm-chart/blob/main/examples/dashboard/kube-prometheus-stack-opencost-dashboard.json. In my case it only partly works, and some panels throw the following errors:
[screenshots of the failing panels showing the errors]

Any thoughts?

Thanks in advance.

@dwbrown2

@mattray any chance you can take a look?

@mattray mattray transferred this issue from opencost/opencost Oct 24, 2023
@mattray
Contributor

mattray commented Oct 24, 2023

@andriktr I'll try to recreate and see what I find. I'm moving it over to the opencost-helm-chart repository since that's where it originated.

@andriktr
Author

@mattray Thanks a lot.

@sossickd

I am also seeing the dashboard only partly working. If it helps, the partial failure is on an AWS EKS cluster; I am not experiencing the issue on an Azure AKS cluster.

@sossickd

sossickd commented Nov 7, 2023

Scratch my last comment: the dashboard is now only partially working on an AKS cluster where it was previously working.

@sossickd

sossickd commented Nov 14, 2023

Hi @mattray, by the looks of things the project is really active, so I know how busy you must be, but have you been able to replicate this issue? The dashboard was super useful when it was working, so any progress on this would be appreciated.

@mattray
Contributor

mattray commented Nov 17, 2023

Sorry, I've been swamped on other projects. A couple of folks brought it up at KubeCon that they'd be interested in working on it, but I haven't heard from anyone else yet. I'm not a Grafana expert by any means, so if someone's interested and wants to take this feel free. I'll try to circle back to this soon.

@sossickd

sossickd commented Dec 8, 2023

@dwbrown2 did you resolve a similar issue here?

kubecost/cost-analyzer-helm-chart#303

Could the same logic be applied?

@dwbrown2

@sossickd I'd have to dig in to say for sure. I'm unfortunately tied up on other projects right now and would love extra help if others are able to review. Will do my best to circle back when free.

@sossickd

@dwbrown2, @mattray OK, found the issue: it was caused by running the opencost deployment with more than one replica.

Created a PR.

opencost/opencost-helm-chart#157

This adds a variable to filter on pod; I amended each panel that had the many-to-many error to filter on the pod label.

Not too sure if this is the best solution but fixed it in my case.
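The idea behind that PR can be sketched as constraining the right-hand side of the join to a single opencost pod via a dashboard variable (names here are illustrative; see the PR for the actual change):

```promql
# Illustrative: restrict the cost metric to one opencost pod via a
# dashboard variable (here called $pod), so the on(instance) join
# has a unique right-hand side.
node_ram_hourly_cost{pod=~"$pod"}
```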

@andriktr
Author

Hmm... that sounds strange, as I'm running opencost with a single replica and still have the same issues.

@sossickd

@andriktr can you copy and paste one of the errors from one of the broken panels into a code snippet, so I can see if it's the same issue I was experiencing?

@andriktr
Author

The errors I got are in the very first post of this issue.

@sossickd

@andriktr would you mind pasting it into a code snippet so I can copy it more easily?

@andriktr
Author

Sure:

Here is the error output for the Top 20 by Namespace panel:

Status: 500. Message: execution: found duplicate series for the match group {instance="10.162.208.9:9003"} on the right hand-side of the operation: [{arch="amd64", container="opencost", endpoint="http", exported_instance="aks-default-29533205-vmss00000k", instance="10.162.208.9:9003", instance_type="Standard_D4s_v3", job="opencost", namespace="opencost", node="aks-default-29533205-vmss00000k", pod="opencost-858f6d4597-rr644", provider_id="azure:///subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/aks-nodes-west-dev/providers/Microsoft.Compute/virtualMachineScaleSets/aks-default-29533205-vmss/virtualMachines/20", region="westeurope", service="opencost"}, {arch="amd64", container="opencost", endpoint="http", exported_instance="aks-default-29533205-vmss00000i", instance="10.162.208.9:9003", instance_type="Standard_D4s_v3", job="opencost", namespace="opencost", node="aks-default-29533205-vmss00000i", pod="opencost-858f6d4597-rr644", provider_id="azure:///subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/aks-nodes-west-dev/providers/Microsoft.Compute/virtualMachineScaleSets/aks-default-29533205-vmss/virtualMachines/18", region="westeurope", service="opencost"}];many-to-many matching not allowed: matching labels must be unique on one side

@sossickd

Can you open up Prometheus and enter this query:

topk( 30, 
  sum(sum(container_memory_allocation_bytes{namespace=~"$namespace"}) by (container,instance,pod) * on(instance) group_left() (
				node_ram_hourly_cost{pod=~".*opencost.*",pod=~"opencost-858f6d4597-rr644"} / 1024 / 1024 / 1024
				+ on(node,instance_type,pod) group_left()
					label_replace
					(
						kube_node_labels{}, "instance_type", "$1", "label_node_kubernetes_io_instance_type", "(.*)"
					) * 0
			)
  + 
  sum(container_cpu_allocation{namespace=~"$namespace"}) by (container,instance,pod) * on(instance) group_left() (
	  			node_cpu_hourly_cost{pod=~".*opencost.*",pod=~"opencost-858f6d4597-rr644"} + on(node,instance_type,pod) group_left()
		  			label_replace
		  			(
		  				kube_node_labels{}, "instance_type", "$1", "label_node_kubernetes_io_instance_type", "(.*)"
		  			) * 0
		  	)) by (container)
)

Do you get the many-to-many matching not allowed: matching labels must be unique on one side error?

@andriktr
Author

Actually, I receive no data for this particular query:
[screenshot: empty query result]

@sossickd

sossickd commented Dec 12, 2023

OK, looking at the error a bit further, it looks like your issue may be slightly different from mine.

[{arch="amd64", 
container="opencost", 
endpoint="http", 
exported_instance="aks-default-29533205-vmss00000k", 
instance="10.162.208.9:9003", 
instance_type="Standard_D4s_v3", 
job="opencost", 
namespace="opencost", 
node="aks-default-29533205-vmss00000k", 
pod="opencost-858f6d4597-rr644", 
provider_id="azure:///subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/aks-nodes-west-dev/providers/Microsoft.Compute/virtualMachineScaleSets/aks-default-29533205-vmss/virtualMachines/20", 
region="westeurope", 
service="opencost"}, 

{arch="amd64", 
container="opencost", 
endpoint="http", 
exported_instance="aks-default-29533205-vmss00000i",
instance="10.162.208.9:9003", 
instance_type="Standard_D4s_v3", 
job="opencost", 
namespace="opencost", 
node="aks-default-29533205-vmss00000i", 
pod="opencost-858f6d4597-rr644", 
provider_id="azure:///subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/aks-nodes-west-dev/providers/Microsoft.Compute/virtualMachineScaleSets/aks-default-29533205-vmss/virtualMachines/18", 
region="westeurope", 
service="opencost"}]

The query is returning two matches from what I can see. The instance label looks like it might be being renamed to exported_instance; that's not happening for me. What doesn't look right to me is that the instance IP address is the same in both outputs.

Has the node recently been destroyed?
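To check whether the cost metric itself carries duplicated series per instance, a query along these lines (illustrative, not from the dashboard) can help:

```promql
# More than one series sharing the same instance label breaks
# "on(instance) group_left" joins with many-to-many errors.
count by (instance) (node_ram_hourly_cost) > 1
```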

@sossickd

sossickd commented Dec 12, 2023

@andriktr can you run the following query in Prometheus and paste the result in a code snippet:

node_ram_hourly_cost

Also, can you show me the output of kubectl get nodes -o wide on the AKS cluster?

@andriktr
Author

Hi,
The node name for opencost can change when the opencost pod is rescheduled to another node for one reason or another, e.g. cluster maintenance, workload rebalancing with descheduler, cluster upgrades, autoscaling, etc.
[screenshot: kubectl get nodes -o wide output]

And here is the error from same cluster

Status: 500. Message: execution: found duplicate series for the match group {instance="10.162.208.9:9003"} on the right hand-side of the operation: [{arch="amd64", container="opencost", endpoint="http", exported_instance="aks-default-29533205-vmss00000k", instance="10.162.208.9:9003", instance_type="Standard_D4s_v3", job="opencost", namespace="opencost", node="aks-default-29533205-vmss00000k", pod="opencost-858f6d4597-rr644", provider_id="azure:///subscriptions/000000-0000-0000-0000-00000000000/resourceGroups/aks-nodes-west-dev/providers/Microsoft.Compute/virtualMachineScaleSets/aks-default-29533205-vmss/virtualMachines/20", region="westeurope", service="opencost"}, {arch="amd64", container="opencost", endpoint="http", exported_instance="aks-default-29533205-vmss00000i", instance="10.162.208.9:9003", instance_type="Standard_D4s_v3", job="opencost", namespace="opencost", node="aks-default-29533205-vmss00000i", pod="opencost-858f6d4597-rr644", provider_id="azure:///subscriptions/000000-0000-0000-0000-00000000000/resourceGroups/aks-nodes-west-dev/providers/Microsoft.Compute/virtualMachineScaleSets/aks-default-29533205-vmss/virtualMachines/18", region="westeurope", service="opencost"}];many-to-many matching not allowed: matching labels must be unique on one side

The mentioned query returns the following:
[screenshot: node_ram_hourly_cost query results]

@sossickd

sossickd commented Dec 13, 2023

@andriktr OK, this is different from what I am seeing. In my case the instance value matches the node value.

[screenshot: query result where instance matches node]

Are you using any relabelings of metrics? Can you determine what 10.162.208.9:9003 refers to? From what I can see, 10.162.208.9 isn't a node's IP address; is it the IP address of the opencost pod?

Also what version of opencost and helm chart are you using?

@andriktr
Author

Yes, 10.162.208.9:9003 is the opencost pod IP. The version is 1.107.0, but I saw the same in earlier versions as well.

@sossickd

sossickd commented Dec 13, 2023

@andriktr are you doing any relabelings of metrics? I'm trying to get my head around why you are getting an exported_instance label and why the instance label is being transformed to 10.162.208.9:9003.

If you are using Helm to deploy, could you share your values?

It would also be useful to share your kube-prometheus-stack Helm values.

@andriktr
Author

Hey, we do not do any relabelings of metrics. I have my own Helm chart, but in general it is more or less the same as the official one, plus some addons for AAD Pod Identity:

serviceAccount:
  create: true
  annotations: {}
  # eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/eksctl-opencost
  automountServiceAccountToken: true

annotations: {}
  #azure.workload.identity/inject-proxy-sidecar: "true"

service:
  annotations: {}
  labels: {}
  type: ClusterIP

opencost:
  exporter:
    # The GCP Pricing API requires a key. This is supplied just for evaluation.
    # cloudProviderApiKey: 'asdfasdfasdf'
    # Default cluster ID to use if cluster_id is not set in Prometheus metrics.
    defaultClusterId: "aks-experimental"
    image:
      registry: redacted
      repository: kubecost-cost-model
      tag: prod-1.107.0
    resources:
      requests:
        cpu: '10m'
        memory: '55M'
      limits:
        cpu: '999m'
        memory: '1G'
    extraEnv:
      {}
      # FOO: BAR

  metrics:
    serviceMonitor:
      enabled: true
      additionalLabels:
        release: 'kube-prometheus-stack'
      ## The label to use to retrieve the job name from.
      ## jobLabel: "app.kubernetes.io/name"
      namespace: 'kube-prometheus-stack'
      namespaceSelector: {}
      ## Default: scrape .Release.Namespace only
      ## To scrape all, use the following:
      ## namespaceSelector:
      ##   any: true
      scrapeInterval: 30s
      # honorLabels: true
      targetLabels: []
      relabelings: []
      metricRelabelings: []

  prometheus:
    # username:
    # password:
    external:
      enabled: false
      url: 'https://mimir-dev-push.infra.alto.com/prometheus'
    internal:
      enabled: true
      serviceName: kube-prometheus-stack-prometheus
      namespaceName: kube-prometheus-stack
      port: 9090

  ui:
    enabled: true
    image:
      registry: redacted
      repository: opencost-ui
      tag: prod-1.107.0
    resources:
      requests:
        cpu: '10m'
        memory: '55M'
      limits:
        cpu: '999m'
        memory: '1G'

  tolerations: []

# Baltic IF Custom Values
customAzureConfig: 
  enabled: true
  azureTenantId: "redacted"
  azureSubscriptionId: "redacted"

azurePodIdentity:
  enabled: true
  azureIdentity:
    name: opencost-identity
    resourceID: "redacted"
    clientID: "redacted"
  azureIdentityBinding:
    name: opencost-identity-binding
    selector: opencost-identity

azureWorkloadIdentity:
  enabled: false
  clientID: ""

ingress:
  enabled: true
  annotations: {}
  labels: {}
  ingress-class: internal-nginx
  hosts:
    - host: opencost-experimental.eu
      paths:
      - path: /
        pathType: ImplementationSpecific
        serviceName: opencost
        servicePort: 9090
  tls:
  - hosts:
    - opencost-experimental.eu
    secretName: ""

@Dkaykay

Dkaykay commented Feb 13, 2024

Hey,
are there any updates on this issue? I am also seeing the "duplicate" issue in e.g. Top 20 by Namespace:

Status: 500. Message: execution: found duplicate series for the match group {instance_type="c5n.large", node="ip-10-250-0-163.eu-central-1.compute.internal"} on the right hand-side of the operation: [{cluster="cluster01-dmi", clusterType="dmi", instance="kube-state-metrics.monitoring.svc:8080", instance_type="c5n.large", job="kube-state-metrics", k8sType="aws", label_beta_kubernetes_io_arch="amd64", label_beta_kubernetes_io_instance_type="c5n.large", label_beta_kubernetes_io_os="linux", label_cluster_update_csi_sap_com_anytime="1", label_failure_domain_beta_kubernetes_io_region="eu-central-1", label_failure_domain_beta_kubernetes_io_zone="eu-central-1c", label_hana_cloud_workload_class_edge="1", label_kubernetes_io_arch="amd64", label_kubernetes_io_hostname="ip-10-250-0-163.eu-central-1.compute.internal", label_kubernetes_io_os="linux", label_networking_gardener_cloud_node_local_dns_enabled="false", label_node_kubernetes_io_instance_type="c5n.large", label_node_kubernetes_io_role="node", label_topology_ebs_csi_aws_com_zone="eu-central-1c", label_topology_kubernetes_io_region="eu-central-1", label_topology_kubernetes_io_zone="eu-central-1c", label_worker_garden_sapcloud_io_group="edge", label_worker_gardener_cloud_cri_name="containerd", label_worker_gardener_cloud_kubernetes_version="1.26.11", label_worker_gardener_cloud_pool="edge", landscape="cluster01", node="ip-10-250-0-163.eu-central-1.compute.internal", project="hc-dev", prometheus="monitoring/prometheus", region="eu-central-1"}, {cluster="cluster01-dmi", clusterType="dmi", container="opencost", endpoint="http", instance="100.96.10.107:9003", instance_type="c5n.large", job="opencost", k8sType="aws", label_beta_kubernetes_io_arch="amd64", label_beta_kubernetes_io_instance_type="c5n.large", label_beta_kubernetes_io_os="linux", label_cluster_update_csi_sap_com_anytime="1", label_failure_domain_beta_kubernetes_io_region="eu-central-1", label_failure_domain_beta_kubernetes_io_zone="eu-central-1c", 
label_hana_cloud_workload_class_edge="1", label_kubernetes_io_arch="amd64", label_kubernetes_io_hostname="ip-10-250-0-163.eu-central-1.compute.internal", label_kubernetes_io_os="linux", label_networking_gardener_cloud_node_local_dns_enabled="false", label_node_kubernetes_io_instance_type="c5n.large", label_node_kubernetes_io_role="node", label_topology_ebs_csi_aws_com_zone="eu-central-1c", label_topology_kubernetes_io_region="eu-central-1", label_topology_kubernetes_io_zone="eu-central-1c", label_worker_garden_sapcloud_io_group="edge", label_worker_gardener_cloud_cri_name="containerd", label_worker_gardener_cloud_kubernetes_version="1.26.11", label_worker_gardener_cloud_pool="edge", landscape="cluster01", namespace="opencost", node="ip-10-250-0-163.eu-central-1.compute.internal", pod="opencost-5f7549c7f8-8kxtf", project="hc-dev", prometheus="monitoring/prometheus", region="eu-central-1", service="opencost"}];many-to-many matching not allowed: matching labels must be unique on one side

I tested with 2 and with 1 replicas. Same result.

Any hints on how to solve / work around?

Thanks!

@andriktr
Author

Most probably the main reason here is that opencost duplicates the kube-state-metrics metrics (it uses the same names) for its own metrics.

[screenshot: duplicated metric names in the metric browser]

To check, you can simply search for the kube_node_info metric in the Grafana explorer; you will probably see it doubled, with additional instances related to opencost.
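One way to confirm the duplication (an illustrative query) is to count which scrape jobs expose the same KSM metric name:

```promql
# If both kube-state-metrics and opencost emit kube_node_info,
# this returns a separate count per scrape job.
count by (job) (kube_node_info)
```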

P.S. I have tried to adjust the settings mentioned in https://www.opencost.io/docs/installation/helm#example-configuration

...
opencost:
  exporter:
    extraEnv:
      EMIT_KSM_V1_METRICS: "false"
      EMIT_KSM_V1_METRICS_ONLY: "true"

however, for some reason this did not work, so I ended up uninstalling opencost and switching to the aks-cost-analysis addon, which is actually also based on opencost :)

@mattray mattray transferred this issue from opencost/opencost-helm-chart Apr 1, 2024
@dlahn

dlahn commented Apr 18, 2024

Does anyone have any update here? We are facing a similar issue.

@asdfgugus
Contributor

Does anyone have any update here? We are facing a similar issue.

I was able to fix the issue with these settings:

opencost:
  metrics:
    kubeStateMetrics:
      emitKsmV1Metrics: false
      emitKsmV1MetricsOnly: true

I deployed the changes and waited an hour. Afterwards, I changed the time range on the dashboard to 15 min to see if the changes worked. Note that the dashboard uses a large fixed time range for some visualizations; those take some time until they show the correct data, but some of them should already work.

@AjayTripathy

Is there a clear description of which dashboards don't work somewhere? Would love any community support on this.

@asdfgugus
Contributor

@dlahn there are two different issues here:

  • duplicate metric series
  • labels with prefix "exported_"

For the first issue, I mentioned the fix above.
For the second issue, you need to check your scrape job. In my case, I use vmagent and needed to honor the labels from opencost. Otherwise, labels like namespace get renamed to exported_namespace.
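With a Prometheus Operator ServiceMonitor (as in the chart values pasted earlier in this thread, where honorLabels appears commented out), this corresponds to enabling honorLabels, roughly:

```yaml
opencost:
  metrics:
    serviceMonitor:
      enabled: true
      # Keep opencost's own label values instead of renaming them
      # to exported_namespace, exported_instance, etc.
      honorLabels: true
```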

It seems to me that we need to document this somewhere.
@mattray any suggestions?

@mattray
Contributor

mattray commented Apr 22, 2024

We can put configuration/workaround notes in the README.

@asdfgugus
Contributor

We can put configuration/work-arounds notes in the README

Sounds good! I will add it to the draft PR.

@dlahn

dlahn commented Apr 24, 2024

@asdfgugus We have had this change for quite some time, but we are still running into this issue.

        - name: EMIT_KSM_V1_METRICS
          value: 'false'
        - name: EMIT_KSM_V1_METRICS_ONLY
          value: 'true'

We have also made sure to drop the exported_ labels. The instance is unique across all of these metrics.
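Dropping the exported_ labels at scrape time can be sketched in a Prometheus metric_relabel_configs block like this (illustrative, not necessarily the exact config used here):

```yaml
metric_relabel_configs:
  # labeldrop removes any label whose NAME matches the regex.
  - action: labeldrop
    regex: "exported_(instance|namespace|pod|service)"
```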

@dlahn

dlahn commented Apr 24, 2024

@asdfgugus Further to the above...

Should the instance be unique to the opencost pod? At the moment we are using k8s-monitoring-helm, and it sets the instance to be the same for the opencost scrape.

However, looking at the Top 20 Namespaces part of the dashboard as an example:

sum(container_memory_allocation_bytes) by (namespace,instance) * on(instance) group_left() (
				node_ram_hourly_cost{} / 1024 / 1024 / 1024 * 730
				+ on(node,instance_type) group_left()
					label_replace
					(
						kube_node_labels{}, "instance_type", "$1", "label_node_kubernetes_io_instance_type", "(.*)"
					) * 0
			)

The 2nd part, where it looks up node_ram_hourly_cost, is going to return multiple results because there are multiple nodes. Am I missing something here? Maybe we are very confused.

@dlahn

dlahn commented Apr 25, 2024

Just an update here: our issue was that the instance label was being rewritten by k8s-monitoring-helm, so it was the same across all of the opencost metrics being scraped. This is why we also ended up with exported_instance labels. We rewrote these back to the correct value for instance, and now our dashboard is working!
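For reference, rewriting the label back at scrape time can be sketched like this (an illustrative Prometheus-style relabeling, not the exact k8s-monitoring-helm configuration):

```yaml
metric_relabel_configs:
  # Copy the original value back into instance...
  - source_labels: [exported_instance]
    regex: "(.+)"
    target_label: instance
  # ...then drop the duplicated label.
  - action: labeldrop
    regex: "exported_instance"
```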

@asdfgugus
Contributor

asdfgugus commented Apr 29, 2024

Just an update here, our issue was that the instance label was being re-written by k8s-monitoring-helm, so they were all the same across all of the opencost metrics being scraped. This is why we also ended up with exported_instance labels. We re-wrote these back to the correct value for instance, and now our dashboard is working!

Thanks for sharing your solution! As we collaborated on debugging via Slack, I'd like to expand on it. It is crucial to honor the labels, as I mentioned earlier. By honoring the labels, I mean ensuring that the scrape job does not append the exported_ prefix: essentially, retaining the original source labels (e.g. instance) instead of dropping them, and avoiding the addition of renamed labels (e.g. exported_instance).

@dlahn do you re-write them when scraping or querying?

@dlahn

dlahn commented May 2, 2024

@asdfgugus I am rewriting them on the scrape side.

@dlahn

dlahn commented May 6, 2024

For anyone using k8s-monitoring-helm who may run into this issue, a fix has been made in the chart to add the honor_labels for the instance to make this work. grafana/k8s-monitoring-helm#514

@dholeshu

dholeshu commented Jun 18, 2024

I'm still seeing the duplicate issue, even with

        - name: EMIT_KSM_V1_METRICS
          value: 'false'
        - name: EMIT_KSM_V1_METRICS_ONLY
          value: 'true'

Would it be possible to disable certain duplicated metrics using a value in values.yaml?

    disabledMetrics:
      - <metric-to-be-disabled>
      - <metric-to-be-disabled> 

This was also mentioned in opencost/opencost#1571

How to identify the duplicated metrics?

@Momotoculteur

Hello guys,

Same issue for me: I can't use your dashboard because of duplicate metrics, even with your two KSM v1 env vars.

@asdfgugus
Contributor

@Momotoculteur, could you please check which metrics are affected?
Furthermore, could you provide more information about your setup?

  • Do you use a dedicated metrics store for OpenCost?
  • If not, which other services are storing metrics in the same store?
  • Do you use OpenCost on multiple clusters with one shared metrics store?
  • ...

@asdfgugus
Contributor

@dholeshu

Would it be possible to disable certain metrics that are duplicated using value in values.yaml?

    disabledMetrics:
      - <metric-to-be-disabled>
      - <metric-to-be-disabled> 

Yes, you can disable OpenCost metrics. If I remember correctly, the current dashboard only uses metrics produced by OpenCost, so I would first identify the duplicated metrics.
Do you deploy OpenCost multiple times?
Which metrics store are you using?

How to identify the duplicated metrics?

Edit the dashboard and check which queries are not working. You can also query the metrics store directly for these metrics: https://docs.kubecost.com/v/1.0x/architecture/user-metrics
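An illustrative query to surface cost series that appear more than once per node:

```promql
# Lists metric name / node combinations with duplicated series,
# which is what breaks the dashboard's group_left joins.
count by (__name__, node) ({__name__=~"node_(cpu|ram)_hourly_cost"}) > 1
```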

@Momotoculteur

Momotoculteur commented Jul 3, 2024

@Momotoculteur, could you please check which metrics are affected? Furthermore, could you provide more information about your setup?

  • Do you use a dedicated metrics store for OpenCost?
  • If not, which other services are storing metrics in the same store?
  • Do you use OpenCost on multiple clusters with one shared metrics store?
  • ...

Hello @asdfgugus thanks for your quick answer.

I have this setup:

  • Running on EKS with Kubernetes 1.28
  • kube-state-metrics at the latest version
  • OpenCost at the latest version

The metrics endpoints from KSM and OpenCost are scraped via Vector and sent to a Grafana Mimir TSDB backed by AWS S3 buckets.

I also scrape other services such as cAdvisor, metrics-server, custom Jenkins metrics for the Betclic company via a Prometheus push gateway, node-exporter, the nginx exporter, JFrog Artifactory, and other things, but I think those apps don't matter here.

Everything is deployed via Helm charts through ArgoCD.

I have tested 3 different dashboards, but got the same results:

  • KubeCost from the Grafana marketplace
  • OpenCost from your GitHub org, the basic one
  • OpenCost from your GitHub org, the detailed one

The error is: Status: 422. Message: execution: found duplicate series for the match group

Do you need extra information about which specific metrics cause the issue on specific dashboards?

I tried to set up OpenCost to expose no KSM metrics, as I already have KSM exposing the metrics OpenCost needs, like this:

kubecostMetrics:
  emitKsmV1Metrics: false
  emitKsmV1MetricsOnly: false

I have also tested this setup following some previous tips, but that doesn't fix my problem

kubecostMetrics:
  emitKsmV1Metrics: false
  emitKsmV1MetricsOnly: true

The last idea I have is to leave emitKsmV1MetricsOnly: true and comment out the corresponding metrics from my own KSM v2 exporter (https://docs.kubecost.com/architecture/ksm-metrics#ksm-metrics-emitted-by-kubecost) to avoid the duplicates, but currently that does not seem to work, as OpenCost needs some metrics in the v1 format.

@asdfgugus
Contributor

Thanks @Momotoculteur for the details.

Careful, this is the configuration for the official OpenCost Helm chart:

opencost:
  metrics:
    kubeStateMetrics:
      emitKsmV1Metrics: false
      emitKsmV1MetricsOnly: true

Btw. you can find the Helm chart on ArtifactHub: https://artifacthub.io/packages/helm/opencost/opencost

@Momotoculteur

Momotoculteur commented Jul 3, 2024

Thanks @Momotoculteur for the details.

Careful, this is the configuration for the official OpenCost Helm chart:

opencost:
  metrics:
    kubeStateMetrics:
      emitKsmV1Metrics: false
      emitKsmV1MetricsOnly: true

Btw. you can find the Helm chart on ArtifactHub: https://artifacthub.io/packages/helm/opencost/opencost

I have exactly this configuration; sorry, my copy/paste was wrong :)

Edit: I tried tonight to deactivate the KSM v2 metrics that are already emitted by OpenCost (described in the Kubecost documentation) in order to avoid the duplicates, but I still have the issue. I'm clearly lost now as to why I get this problem :(

@Momotoculteur

Momotoculteur commented Jul 9, 2024

The duplicate metrics causing this issue have these values:

[
    {__name__="node_cpu_hourly_cost", arch="amd64", instance="IP_XXX", instance_type="c6a.xlarge", node="IP_XXX", provider_id="aws:///eu-west-1", region="eu-west-1"
    },
    {__name__="node_cpu_hourly_cost", arch="amd64", instance="IP_XXX", instance_type="c6a.xlarge", node="IP_XXX", provider_id="aws:///eu-west-1", region="eu-west-1"
    }
]

I use Karpenter for node autoscaling and AWS spot instances.

EDIT: I made some tests to eliminate the duplicate items from the left join in the PromQL request. Using the min, max, or avg function over the right-hand side seems to work now.

Request from your dashboard:

sum by(namespace, container) (container_cpu_allocation * on (node) group_left node_cpu_hourly_cost)

Updated request:

sum by(namespace, container) (container_cpu_allocation * on (node) group_left avg(node_cpu_hourly_cost) by (node))
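The same deduplication can presumably be applied to the memory panel as well; a sketch (metric names as in the dashboard, written with the modern avg by syntax):

```promql
# Illustrative: averaging the per-node RAM cost collapses duplicate
# series (e.g. from rotated spot nodes) before the join.
sum by (namespace, container) (
  container_memory_allocation_bytes
  * on (node) group_left ()
  (avg by (node) (node_ram_hourly_cost) / 1024 / 1024 / 1024)
)
```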

@asdfgugus
Contributor

@Momotoculteur
I am glad you found a solution that works for you! We would love to see a contribution for that.
Do you always have duplicate metrics, or does this only happen in a specific case (e.g. when a rollout of OpenCost gets triggered)?

@Momotoculteur

@asdfgugus

I've only got 14 days of data because I installed the OpenCost Helm chart just recently, but I have no more problems now, even with a large time range in my dashboards.

10 participants