[kube-prometheus-stack] Allow Overriding or Disabling Default Grafana Dashboards #4920

Open
vladmalynych opened this issue Oct 17, 2024 · 0 comments
Labels: enhancement (New feature or request)

Comments

@vladmalynych

Is your feature request related to a problem?

Problem:

The Grafana dashboard defined in charts/kube-prometheus-stack/templates/grafana/dashboards-1.14/k8s-resources-pod.yaml includes a graph of memory consumption per pod. The query behind that graph is sourced from a different repository (https://github.com/prometheus-operator/kube-prometheus):

https://github.com/prometheus-operator/kube-prometheus/blob/main/manifests/grafana-dashboardDefinitions.yaml#L8300

                  "targets": [
                      {
                          "datasource": {
                              "type": "prometheus",
                              "uid": "${datasource}"
                          },
                          "expr": "sum(container_memory_working_set_bytes{job=\"kubelet\", metrics_path=\"/metrics/cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"\", image!=\"\"}) by (container)",
                          "legendFormat": "__auto"
                      },
                      {
                          "datasource": {
                              "type": "prometheus",
                              "uid": "${datasource}"
                          },
                          "expr": "sum(\n    kube_pod_container_resource_requests{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", resource=\"memory\"}\n)\n",
                          "legendFormat": "requests"
                      },
                      {
                          "datasource": {
                              "type": "prometheus",
                              "uid": "${datasource}"
                          },
                          "expr": "sum(\n    kube_pod_container_resource_limits{job=\"kube-state-metrics\", cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", resource=\"memory\"}\n)\n",
                          "legendFormat": "limits"
                      }
                  ],
                  "title": "Memory Usage (WSS)",
                  "type": "timeseries"
              },

When a pod is restarted, the current query combines memory usage data from both the old and new containers, leading to temporary spikes in displayed memory consumption. This can cause the dashboard to show memory usage exceeding the container's memory limit, even though actual usage remains within limits.

[Screenshot: 2024-09-17 at 14:20:49]
[Screenshot: 2024-09-17 at 14:22:55]
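
The double counting described above can be sketched outside Prometheus. The snippet below is only an illustration under stated assumptions: the sample values, the cgroup-style `id` strings, and the 1 GiB limit are invented, and `sum_by` is a hypothetical helper that mimics `sum(...) by (...)` for a single instant. It assumes, as this issue describes, that series for the old and the new container are both present for a short time after a restart.

```python
from collections import defaultdict

# Invented samples for one instant shortly after a restart. Both series
# carry the same `container` label but different `id` labels (assumption).
samples = [
    {"container": "app", "id": "/kubepods/pod-a/old", "value": 900 * 1024**2},
    {"container": "app", "id": "/kubepods/pod-a/new", "value": 300 * 1024**2},
]
LIMIT_BYTES = 1024**3  # assumed memory limit of 1 GiB


def sum_by(series, labels):
    """Mimic PromQL `sum(...) by (labels)` for a single instant."""
    totals = defaultdict(int)
    for sample in series:
        key = tuple(sample[label] for label in labels)
        totals[key] += sample["value"]
    return dict(totals)


# Grouping by container only merges the old and the new container.
by_container = sum_by(samples, ["container"])
# Grouping by container and id keeps them as separate series.
by_container_id = sum_by(samples, ["container", "id"])

for key, value in by_container.items():
    print(key, value, "exceeds limit" if value > LIMIT_BYTES else "within limit")
for key, value in by_container_id.items():
    print(key, value, "exceeds limit" if value > LIMIT_BYTES else "within limit")
```

With these invented numbers, the series grouped by container alone rises above the assumed limit, while each series grouped by container and id stays below it.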

Steps to Reproduce:

  • Trigger a pod restart (e.g. an OOM kill or an eviction).
  • Compare the graph whose expression groups only by container with a graph whose expression groups by both container and id:
"expr": "sum(container_memory_working_set_bytes{job=\"kubelet\", metrics_path=\"/metrics/cadvisor\", cluster=\"$cluster\", namespace=\"$namespace\", pod=\"$pod\", container!=\"\", image!=\"\"}) by (container, id)"

Describe the solution you'd like.

Possible solution:

Would it be possible to add functionality to the Helm chart that allows the default dashboards to be overridden or patched?
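
As a rough idea of what such a patch would need to do, the sketch below rewrites the grouping clause of one panel in a dashboard's JSON model. It is not an existing chart feature: `patch_panel_exprs` is a hypothetical helper, and the `dashboard` dictionary is a trimmed stand-in for the real dashboard with shortened label selectors.

```python
OLD_GROUPING = "by (container)"
NEW_GROUPING = "by (container, id)"


def patch_panel_exprs(dashboard, title):
    """Rewrite the grouping clause in targets of panels with the given title.

    Returns the number of expressions that were changed.
    """
    changed = 0
    stack = list(dashboard.get("panels", []))
    while stack:
        panel = stack.pop()
        # Row panels may nest further panels, so walk those as well.
        stack.extend(panel.get("panels", []))
        if panel.get("title") != title:
            continue
        for target in panel.get("targets", []):
            expr = target.get("expr", "")
            if expr.endswith(OLD_GROUPING):
                target["expr"] = expr[: -len(OLD_GROUPING)] + NEW_GROUPING
                changed += 1
    return changed


# Trimmed stand-in for the dashboard JSON (selectors shortened for brevity).
dashboard = {
    "panels": [
        {
            "title": "Memory Usage (WSS)",
            "type": "timeseries",
            "targets": [
                {
                    "expr": 'sum(container_memory_working_set_bytes{pod="$pod", container!="", image!=""}) by (container)',
                    "legendFormat": "__auto",
                },
                {
                    "expr": 'sum(\n    kube_pod_container_resource_limits{pod="$pod", resource="memory"}\n)\n',
                    "legendFormat": "limits",
                },
            ],
        }
    ]
}

patched = patch_panel_exprs(dashboard, "Memory Usage (WSS)")
print(patched, dashboard["panels"][0]["targets"][0]["expr"])
```

Only the usage query is touched here; the requests and limits queries do not end in the grouping clause and are left unchanged.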

Describe alternatives you've considered.

Alternative solutions:

Would it be possible to add functionality to the Helm chart to disable specific dashboards?

Additional context.

This issue has also been raised in prometheus-operator/kube-prometheus#2522

@vladmalynych added the enhancement label Oct 17, 2024
@vladmalynych changed the title from "Incorrect Container Memory Consumption Graph Behavior When Pod is Restarted" to "Allow Overriding or Disabling Default Grafana Dashboards in kube-prometheus-stack Helm" Oct 17, 2024
@zeritti changed the title from "Allow Overriding or Disabling Default Grafana Dashboards in kube-prometheus-stack Helm" to "[kube-prometheus-stack] Allow Overriding or Disabling Default Grafana Dashboards" Oct 17, 2024