found duplicate series for the match group {ceph_daemon=\"osd.0\"} in multi-cluster Ceph deployments

Bug #1990248 reported by Chi Wai CHAN
Affects: Ceph Dashboard Charm
Status: Fix Committed
Importance: Undecided
Assigned to: Chi Wai CHAN

Bug Description

In a two-datacenter customer environment we have deployed two independent Ceph clusters, one in each datacenter, and we are trying to use Grafana to show the Ceph metrics.

In the "[juju-openstack] / [juju] OSD device details" dashboard we can see the following error

---
"found duplicate series for the match group {ceph_daemon=\"osd.0\"} on the right hand-side of the operation: [{ceph_daemon=\"osd.0\", dns_name=\"juju-08a910-24-lxd-1.domain\", group=\"promoagents-juju\", instance=\"172.24.12.136:9283\", job=\"remote-6958e464317040928aab296791a550e5\"}, {ceph_daemon=\"osd.0\", dns_name=\"juju-08a910-2-lxd-0..domain\", group=\"promoagents-juju\", instance=\"172.24.12.78:9283\", job=\"remote-2346cc18b30146f38b116257e4436213\"}];many-to-many matching not allowed: matching labels must be unique on one side"
---

Labels are not unique because OSD names are repeated in both clusters; in this case, osd.0.
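
As a minimal sketch of the failing shape (the metric names below are examples for illustration, not the exact dashboard query), the dashboard joins two vectors on ceph_daemon alone:

---
# Illustrative PromQL only; metric names are assumptions, not the exact dashboard query.
# With a single cluster, this one-to-one match on ceph_daemon works:
ceph_osd_op_w_latency_sum * on (ceph_daemon) group_left ceph_osd_metadata

# With two clusters scraped by the same Prometheus, both export a series with
# ceph_daemon="osd.0", so the right-hand side has two series for the same match
# group and the query fails with "many-to-many matching not allowed".
---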

A workaround could be to name the OSDs differently in the two clusters, say osd.{0..n} in one cluster and osd.{n+1..m} in the other. But it would be better to fix this by making the labels unique in some way.

These are the relations added:
---
# ceph-osd
ceph-osd-atm-zone1:juju-info prometheus-grok-exporter:juju-info juju-info subordinate
ceph-osd-dic-zone1:juju-info prometheus-grok-exporter:juju-info juju-info subordinate

# prometheus
prometheus-libvirt-exporter:dashboards grafana:dashboards grafana-dashboard regular
prometheus-openstack-exporter:dashboards grafana:dashboards grafana-dashboard regular

# grafana
ceph-dashboard-atm-zone1:grafana-dashboard grafana:dashboards grafana-dashboard regular
ceph-dashboard-dic-zone1:grafana-dashboard grafana:dashboards grafana-dashboard regular
prometheus-libvirt-exporter:dashboards grafana:dashboards grafana-dashboard regular
prometheus-openstack-exporter:dashboards grafana:dashboards grafana-dashboard regular
---

Copying from https://bugs.launchpad.net/charm-prometheus-grok-exporter/+bug/1976528; I think this is a better place for this bug.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-ceph-dashboard (master)
Changed in charm-ceph-dashboard:
status: New → In Progress
Chi Wai CHAN (raychan96)
Changed in charm-ceph-dashboard:
assignee: nobody → Chi Wai CHAN (raychan96)
Revision history for this message
Chi Wai CHAN (raychan96) wrote :

Hi, I am attaching some example screenshots to show that adding a "job" filter can help resolve "found duplicate series for the match group {ceph_daemon=\"osd.0\"}" in multi-cluster Ceph deployments.

In the attachment, I tried to mimic the conditions of a multi-cluster Ceph deployment by deploying two Ceph clusters with different application names, and the same error showed up. Adding a "job" filter resolves the ambiguity and properly displays the information for the two Ceph clusters independently.
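
A rough sketch of the approach (illustrative only; the actual fix edits the Grafana dashboards shipped by the charm, and "$job" here stands for a Grafana template variable):

---
# Illustrative PromQL; metric names and the $job variable are assumptions, not the
# exact patched query.
ceph_osd_op_w_latency_sum{job=~"$job"}
  * on (ceph_daemon, job) group_left
  ceph_osd_metadata{job=~"$job"}

# Each cluster's ceph-mgr exporter is scraped under its own job label (see the
# error above: job="remote-6958..." vs job="remote-2346..."), so matching on
# (ceph_daemon, job) is unique again.
---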

Revision history for this message
Chi Wai CHAN (raychan96) wrote :

More screenshots for the patch

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-ceph-dashboard (master)

Reviewed: https://review.opendev.org/c/openstack/charm-ceph-dashboard/+/858874
Committed: https://opendev.org/openstack/charm-ceph-dashboard/commit/0e604abccb9bfdb0bbf4464e9817975bac53b7dd
Submitter: "Zuul (22348)"
Branch: master

commit 0e604abccb9bfdb0bbf4464e9817975bac53b7dd
Author: Chi Wai, Chan <email address hidden>
Date: Fri Nov 4 14:04:48 2022 +0800

    Add job matcher. This allows query to distinguish between ceph clusters
    using job label.

    Closes-Bug: #1990248
    Change-Id: I8c14d6ab03fab3830d6da632b5dec1065d9068b2

Changed in charm-ceph-dashboard:
status: In Progress → Fix Committed