Losing the k8s-master leader causes duplicate prometheus manual-jobs
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| Charm Helpers | Fix Released | Undecided | Unassigned | |
| Kubernetes Control Plane Charm | Triaged | Medium | Unassigned | |
| Prometheus2 charm | Triaged | Wishlist | Unassigned | |
Bug Description
If the kubernetes-master leader unit is lost, the new leader provides an additional set of relation data to prometheus without the original leader's data being removed. This causes duplicate manual-jobs in prometheus, which in turn causes queries to produce duplicate results. The root cause appears to be how prometheus tries to prevent job names from clashing by including the request_id in the job name [1]: the new leader uses a new request_id, so its job is not recognized as a duplicate of the original.
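To make that concrete, here is a minimal sketch of the mechanism as I understand it; the function name, job name, and data layout are illustrative only, not taken from the charm code:

```python
import uuid

def render_job_name(base_name, request_id):
    # The pattern described above: fold the per-request UUID into the
    # job name so that different requests can never collide.
    return f"{base_name}-{request_id}"

# The old and the new leader publish the *same* job definition, but each
# request carries its own request_id...
old_leader_request = {"job_name": "kubernetes-pods", "request_id": str(uuid.uuid4())}
new_leader_request = {"job_name": "kubernetes-pods", "request_id": str(uuid.uuid4())}

scrape_jobs = {
    render_job_name(r["job_name"], r["request_id"]): r
    for r in (old_leader_request, new_leader_request)
}

# ...so prometheus ends up with two scrape configs for one logical job.
print(len(scrape_jobs))  # 2
```

For reference, the relevant applications in the deployment: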
kubernetes-master 1.17.12 active 3 kubernetes-master local 0 ubuntu
kubernetes-worker 1.17.12 active 4 kubernetes-worker local 0 ubuntu exposed
prometheus active 1 prometheus2 local 0 ubuntu
1) Deploy kubernetes-master in HA with 3 units and add the relation to prometheus as documented. I used hacluster; I haven't tested other methods.
2) Restart prometheus to get around bug #1891942 [2]:
juju ssh prometheus/0 "sudo systemctl restart snap.prometheus
3) grep 'job_name' from the prometheus config:
juju ssh prometheus/0 "sudo grep 'job_name' /var/snap/
Note that there is one instance of each of:
"job_name": "kube-state-
"job_name": "k8s-api-
"job_name": "kubernetes-
"job_name": "kube-state-
"job_name": "kubernetes-
Record the UUID for each.
4) Run this command and notice that only the kubernetes-master leader unit provides data:
for rid in $(juju run -u prometheus/0 'relation-ids manual-jobs'); do echo $rid; for runit in $(juju run -u prometheus/0 "relation-list -r $rid"); do echo $rid - $runit; juju run -u prometheus/0 "relation-get -r $rid - $runit"; done; done
5) Stop the jujud kubernetes-master service on the leader:
juju ssh kubernetes-master/1 "sudo systemctl stop jujud-unit-
6) Allow things to settle and a new leader to be elected.
7) Restart prometheus to get around bug #1891942 [2]:
juju ssh prometheus/0 "sudo systemctl restart snap.prometheus
8) grep 'job_name' from the prometheus config:
juju ssh prometheus/0 "sudo grep 'job_name' /var/snap/
Note that there are now two instances of each of:
"job_name": "kube-state-
"job_name": "k8s-api-
"job_name": "kubernetes-
"job_name": "kube-state-
"job_name": "kubernetes-
9) Run this command and notice that there are now two kubernetes-master units which provide data:
for rid in $(juju run -u prometheus/0 'relation-ids manual-jobs'); do echo $rid; for runit in $(juju run -u prometheus/0 "relation-list -r $rid"); do echo $rid - $runit; juju run -u prometheus/0 "relation-get -r $rid - $runit"; done; done
This causes a number of prometheus queries to return 2x results, since each job is now scraped twice (once per copy of the job).
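For anyone reproducing this, a hypothetical helper along these lines can confirm the duplication. It is not part of any charm; the config path is a placeholder (the real path is truncated in the grep commands above), and it assumes the request UUID appears as a suffix of job_name:

```python
import re
from collections import Counter

import yaml  # PyYAML

# Placeholder path: substitute the real prometheus config location.
CONFIG_PATH = "/path/to/prometheus.yml"

# Assumption: the request UUID is appended to the end of job_name.
UUID_SUFFIX = re.compile(
    r"-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)

with open(CONFIG_PATH) as f:
    config = yaml.safe_load(f)

# Count scrape jobs by their name with any trailing UUID stripped off;
# anything counted more than once is a duplicated job.
counts = Counter(
    UUID_SUFFIX.sub("", job["job_name"])
    for job in config.get("scrape_configs", [])
)
for name, copies in counts.items():
    if copies > 1:
        print(f"{name}: {copies} copies")
```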
[1] https:/
[2] https://bugs.launchpad.net/bugs/1891942
Changed in charm-prometheus2:
status: New → Triaged
importance: Undecided → Wishlist
tags: added: sts
Thanks for the detailed report and reproduction steps!
I think this will need to be fixed in the prometheus charm. The code that injects the UUID [1] is part of the to_json function that is invoked by the prometheus charm [2]. This is supposed to "ensure uniqueness", i.e. prevent multiple requests from clashing. A side effect is that when different units request the same job, the prometheus charm generates multiple copies of that job.
The request ID that's used is part of the request-response pattern from charms.reactive. It is stored in relation data, and relation data is unit-scoped. Updating that pattern to allow re-use of a request ID across units would be doable but challenging. I strongly believe that, instead, this is a case where the request ID is being misused by the receiving side. Perhaps the relation ID should be used instead, which seems to be unique per application?
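A rough sketch of that idea (not a proposed patch; the key format is made up for illustration) would be to derive the uniqueness suffix from the relation ID rather than the per-request UUID:

```python
def job_key(relation_id, job_name):
    # Hypothetical key format: use the relation ID, which does not change
    # when leadership moves to another unit of the requesting application.
    return f"{job_name}-{relation_id.replace(':', '-')}"

old_leader = job_key("manual-jobs:7", "kubernetes-pods")
new_leader = job_key("manual-jobs:7", "kubernetes-pods")

# Both requests collapse onto the same key, so the new leader's request
# would replace the old one instead of adding a duplicate scrape job.
print(old_leader == new_leader)  # True
```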
I am adding prometheus to this issue, but also leaving it open against kubernetes-master, since I would like another engineer from our team to look at this.
[1]: https://github.com/juju-solutions/interface-prometheus-manual/blob/3f775242c16d53243c993d7ba0c896169ad1639e/common.py#L36
[2]: https://git.launchpad.net/charm-prometheus2/tree/src/reactive/prometheus.py#n748