Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bump kube-prometheus-stack to 45.5.0 #4017

Merged

Conversation

gdemonet
Copy link
Contributor

@gdemonet gdemonet commented Mar 7, 2023

Update the charts with:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
rm -rf charts/kube-prometheus-stack
helm fetch -d charts --untar prometheus-community/kube-prometheus-stack

Also bump a number of components:

  • Prometheus to 2.42.0
  • Thanos to 0.30.2
  • Grafana to 9.3.8
  • kiwigrid/k8s-sidecar to 1.22.3
  • kube-state-metrics to 2.8.0
  • node-exporter to 1.5.0
  • prometheus-operator to 0.63.0

The chart.sls was re-rendered with:

./doit.sh codegen:chart_kube-prometheus-stack

Since we bumped Thanos, we also re-render its own chart (without
updating, since it did not change since the last update) with:

./doit.sh codegen:chart_thanos

Important note: Alertmanager configuration was updated by hand,
and a note was added to try and remind maintainers of doing it in the
future. This should help making the "InfoInhibitor" alert more useful.

@gdemonet gdemonet requested a review from a team as a code owner March 7, 2023 11:30
@bert-e
Copy link
Contributor

bert-e commented Mar 7, 2023

Hello gdemonet,

My role is to assist you with the merge of this
pull request. Please type @bert-e help to get information
on this process, or consult the user documentation.

Status report is not available.

@bert-e
Copy link
Contributor

bert-e commented Mar 7, 2023

Waiting for approval

The following approvals are needed before I can proceed with the merge:

  • the author

  • one peer

Peer approvals must include at least 1 approval from the following list:

@gdemonet gdemonet changed the title Bump kube-prometheus and thanos Bump kube-prometheus-stack to 45.5.0 Mar 7, 2023
@gdemonet gdemonet added the kind:dependencies Pull requests that update a dependency file label Mar 7, 2023
@gdemonet gdemonet force-pushed the improvement/bump-kube-prometheus-and-thanos branch from eec4175 to 4e89c73 Compare March 7, 2023 11:35
@gdemonet gdemonet marked this pull request as draft March 7, 2023 11:36
@gdemonet gdemonet force-pushed the improvement/bump-kube-prometheus-and-thanos branch from 4e89c73 to e7a6ac9 Compare March 7, 2023 11:44
Copy link
Collaborator

@TeddyAndrieux TeddyAndrieux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really really hard to review

Just by looking at the changes on the salt side it looks good to me (except for the upgrade handling)

Comment on lines -15 to +20
repository: '__image__(alertmanager)'
registry: '__var__(repo.registry_endpoint)'
repository: '__image_no_reg__(alertmanager)'
Copy link
Collaborator

@TeddyAndrieux TeddyAndrieux Mar 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sad, but ok, yes it's needed

But cannot we set this registry only once in global ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right will do, found out too late about this one 😇

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Argh no I can't do that because the kube-state-metrics subchart doesn't use this value correctly:

image: "{{ .Values.global.imageRegistry | default .Values.image.repository }}:{{ .Values.image.tag | ...

😭

I'll submit a PR over there to have it fixed

charts/kube-prometheus-stack/README.md Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
@bert-e

This comment was marked as resolved.

Some charts now expect the image registry to be defined separately from
the repository, and enforces that these values are joined with a slash.
This causes issues with our macro "build_image_name", which builds the
whole path.
We add an option to this macro to omit the registry endpoint, and make
this value available to rendered charts using the charts/render.py
script header.
@gdemonet gdemonet force-pushed the improvement/bump-kube-prometheus-and-thanos branch from e7a6ac9 to 5b44514 Compare March 8, 2023 11:05
@bert-e
Copy link
Contributor

bert-e commented Mar 8, 2023

Waiting for approval

The following approvals are needed before I can proceed with the merge:

  • the author

  • one peer

Peer approvals must include at least 1 approval from the following list:

@gdemonet gdemonet force-pushed the improvement/bump-kube-prometheus-and-thanos branch from 5b44514 to 9680238 Compare March 8, 2023 13:58
Comment on lines +1081 to +1084
# Drop cgroup metrics with no pod.
- sourceLabels: [id, pod]
action: drop
regex: '.+;'
Copy link
Contributor Author

@gdemonet gdemonet Mar 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This causes the current prometheus-adapter resourceRules.(cpu|memory).nodeQuery to not hit anything, because we dropped the metrics for nodes (pod="", id="/").

It appears the goal is to now rely on more efficient node-exporter metrics (see kubernetes-sigs/prometheus-adapter#516 and prometheus-community/helm-charts#2827), but we're now hitting an issue with labels: our node-exporter metrics don't have a "node" label, and that's not good when trying to map to /metrics.k8s.io/v1beta1/nodes.

We will need to fix this label issue (TBH, would be much simpler to explore and query, e.g. from our UI), but I'm not sure how involved this will be. For now, I'm considering a temporary workaround by keeping these metrics around, with a follow-up ticket.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's the follow-up #4018

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And the workaround: 97ce7c6

@gdemonet gdemonet marked this pull request as ready for review March 8, 2023 21:11
@gdemonet
Copy link
Contributor Author

gdemonet commented Mar 8, 2023

/approve

@bert-e
Copy link
Contributor

bert-e commented Mar 8, 2023

Waiting for approval

The following approvals are needed before I can proceed with the merge:

  • the author

  • one peer

Peer approvals must include at least 1 approval from the following list:

The following options are set: approve

@gdemonet gdemonet force-pushed the improvement/bump-kube-prometheus-and-thanos branch from 734433c to b52b002 Compare March 9, 2023 09:12
Copy link
Collaborator

@TeddyAndrieux TeddyAndrieux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@bert-e
Copy link
Contributor

bert-e commented Mar 9, 2023

Waiting for approval

The following approvals are needed before I can proceed with the merge:

  • the author

  • one peer

Peer approvals must include at least 1 approval from the following list:

The following reviewers are expecting changes from the author, or must review again:

The following options are set: approve

Update the charts with:

```
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
rm -rf charts/kube-prometheus-stack
helm fetch -d charts --untar prometheus-community/kube-prometheus-stack
```

Also bump a number of components:
- Prometheus to 2.42.0
- Thanos to 0.30.2
- Grafana to 9.3.8
- kiwigrid/k8s-sidecar to 1.22.3
- kube-state-metrics to 2.8.0
- node-exporter to 1.5.0
- prometheus-operator to 0.63.0

The chart.sls was re-rendered with:

```
./doit.sh codegen:chart_kube-prometheus-stack
```

Since we bumped Thanos, we also re-render its own chart (without
updating, since it did not change since the last update) with:

```
./doit.sh codegen:chart_thanos
```

Important note: Alertmanager configuration was updated by hand,
and a note was added to try and remind maintainers of doing it in the
future. This should help making the "InfoInhibitor" alert more useful.
This changes the default configuration from kube-prometheus-stack since
we still use these metrics in prometheus-adapter.
Ideally, we would rather let prometheus-adapter consume node-exporter
metrics, but this requires #4018 to be fixed first.
Had a flaky on this (failed on single-node but multi-nodes succeeded),
let's wait a bit longer.
@gdemonet gdemonet force-pushed the improvement/bump-kube-prometheus-and-thanos branch from b52b002 to b6e5ba9 Compare March 9, 2023 10:47
@bert-e
Copy link
Contributor

bert-e commented Mar 9, 2023

Build failed

The build for commit did not succeed in branch improvement/bump-kube-prometheus-and-thanos.

The following options are set: approve

@bert-e
Copy link
Contributor

bert-e commented Mar 9, 2023

In the queue

The changeset has received all authorizations and has been added to the
relevant queue(s). The queue(s) will be merged in the target development
branch(es) as soon as builds have passed.

The changeset will be merged in:

  • ✔️ development/125.0

The following branches will NOT be impacted:

  • development/123.0
  • development/124.0
  • development/124.1
  • development/2.0
  • development/2.1
  • development/2.10
  • development/2.11
  • development/2.2
  • development/2.3
  • development/2.4
  • development/2.5
  • development/2.6
  • development/2.7
  • development/2.8
  • development/2.9

There is no action required on your side. You will be notified here once
the changeset has been merged. In the unlikely event that the changeset
fails permanently on the queue, a member of the admin team will
contact you to help resolve the matter.

IMPORTANT

Please do not attempt to modify this pull request.

  • Any commit you add on the source branch will trigger a new cycle after the
    current queue is merged.
  • Any commit you add on one of the integration branches will be lost.

If you need this pull request to be removed from the queue, please contact a
member of the admin team now.

The following options are set: approve

@bert-e
Copy link
Contributor

bert-e commented Mar 9, 2023

I have successfully merged the changeset of this pull request
into targetted development branches:

  • ✔️ development/125.0

The following branches have NOT changed:

  • development/123.0
  • development/124.0
  • development/124.1
  • development/2.0
  • development/2.1
  • development/2.10
  • development/2.11
  • development/2.2
  • development/2.3
  • development/2.4
  • development/2.5
  • development/2.6
  • development/2.7
  • development/2.8
  • development/2.9

Please check the status of the associated issue None.

Goodbye gdemonet.

@bert-e bert-e merged commit b6e5ba9 into development/125.0 Mar 9, 2023
@bert-e bert-e deleted the improvement/bump-kube-prometheus-and-thanos branch March 9, 2023 16:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:dependencies Pull requests that update a dependency file
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants