RBD mirror tests are flaky

Bug #1982043 reported by Luciano Lo Giudice
| Affects | Status | Importance | Assigned to | Milestone |
| --- | --- | --- | --- | --- |
| Kernel SRU Workflow | New | Undecided | Unassigned | |

Bug Description

Some RBD mirror tests are failing intermittently. The problem is in the ceph-mon charm rather than in the tests themselves, specifically in the `osd-relation-changed` hook: the call to `notify_rbd_mirrors()` sometimes fails with the following traceback:

```
2022-07-15 15:10:06 INFO unit.ceph-mon-b/1.juju-log server.go:319 osd:27: mon cluster in quorum and osds bootstrapped - providing rbd-mirror client with keys
2022-07-15 15:10:07 WARNING unit.ceph-mon-b/1.osd-relation-changed logger.go:60 Traceback (most recent call last):
2022-07-15 15:10:07 WARNING unit.ceph-mon-b/1.osd-relation-changed logger.go:60 File "/var/lib/juju/agents/unit-ceph-mon-b-1/charm/hooks/osd-relation-changed", line 1351, in <module>
2022-07-15 15:10:07 WARNING unit.ceph-mon-b/1.osd-relation-changed logger.go:60 hooks.execute(sys.argv)
2022-07-15 15:10:07 WARNING unit.ceph-mon-b/1.osd-relation-changed logger.go:60 File "/var/lib/juju/agents/unit-ceph-mon-b-1/charm/hooks/charmhelpers/core/hookenv.py", line 962, in execute
2022-07-15 15:10:07 WARNING unit.ceph-mon-b/1.osd-relation-changed logger.go:60 self._hooks[hook_name]()
2022-07-15 15:10:07 WARNING unit.ceph-mon-b/1.osd-relation-changed logger.go:60 File "/var/lib/juju/agents/unit-ceph-mon-b-1/charm/hooks/osd-relation-changed", line 886, in osd_relation
2022-07-15 15:10:07 WARNING unit.ceph-mon-b/1.osd-relation-changed logger.go:60 notify_rbd_mirrors()
2022-07-15 15:10:07 WARNING unit.ceph-mon-b/1.osd-relation-changed logger.go:60 File "/var/lib/juju/agents/unit-ceph-mon-b-1/charm/hooks/osd-relation-changed", line 596, in notify_rbd_mirrors
2022-07-15 15:10:07 WARNING unit.ceph-mon-b/1.osd-relation-changed logger.go:60 rbd_mirror_relation(relid=relid, unit=unit, recurse=False)
2022-07-15 15:10:07 WARNING unit.ceph-mon-b/1.osd-relation-changed logger.go:60 File "/var/lib/juju/agents/unit-ceph-mon-b-1/charm/hooks/osd-relation-changed", line 1019, in rbd_mirror_relation
2022-07-15 15:10:07 WARNING unit.ceph-mon-b/1.osd-relation-changed logger.go:60 'pools': json.dumps(ceph.list_pools_detail(), sort_keys=True),
2022-07-15 15:10:07 WARNING unit.ceph-mon-b/1.osd-relation-changed logger.go:60 File "/var/lib/juju/agents/unit-ceph-mon-b-1/charm/lib/charms_ceph/utils.py", line 3112, in list_pools_detail
2022-07-15 15:10:07 WARNING unit.ceph-mon-b/1.osd-relation-changed logger.go:60 'quota': get_pool_quota(pool),
2022-07-15 15:10:07 WARNING unit.ceph-mon-b/1.osd-relation-changed logger.go:60 File "/var/lib/juju/agents/unit-ceph-mon-b-1/charm/lib/charms_ceph/utils.py", line 3049, in get_pool_quota
2022-07-15 15:10:07 WARNING unit.ceph-mon-b/1.osd-relation-changed logger.go:60 output = subprocess.check_output(
2022-07-15 15:10:07 WARNING unit.ceph-mon-b/1.osd-relation-changed logger.go:60 File "/usr/lib/python3.8/subprocess.py", line 415, in check_output
2022-07-15 15:10:07 WARNING unit.ceph-mon-b/1.osd-relation-changed logger.go:60 return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
2022-07-15 15:10:07 WARNING unit.ceph-mon-b/1.osd-relation-changed logger.go:60 File "/usr/lib/python3.8/subprocess.py", line 516, in run
2022-07-15 15:10:07 WARNING unit.ceph-mon-b/1.osd-relation-changed logger.go:60 raise CalledProcessError(retcode, process.args,
2022-07-15 15:10:07 WARNING unit.ceph-mon-b/1.osd-relation-changed logger.go:60 subprocess.CalledProcessError: Command '['ceph', '--id', 'admin', 'osd', 'pool', 'get-quota', 'device_health_metrics']' returned non-zero exit status 2.
```

The failing command queries the quota of the `device_health_metrics` pool, which appears to match issues other people have reported with device health metrics and the corresponding Ceph module.
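
One possible mitigation, sketched here only as an illustration and not as the charm's actual fix, is to retry the quota query a few times and skip the pool if the command keeps failing, instead of letting `CalledProcessError` abort the hook. The command line is copied from the traceback; the function name `get_quota_output`, its parameters, and the injectable `run` argument are hypothetical and exist only to keep the sketch self-contained.

```python
import subprocess
import time


def get_quota_output(pool, run=subprocess.check_output, attempts=3, delay=1.0):
    """Return the raw `get-quota` output for `pool`, or None if every attempt fails.

    Hypothetical helper: `run` defaults to subprocess.check_output and can be
    replaced by a stub, so the retry logic can be exercised without a cluster.
    """
    # Command line taken from the traceback in this report.
    cmd = ['ceph', '--id', 'admin', 'osd', 'pool', 'get-quota', pool]
    for attempt in range(1, attempts + 1):
        try:
            return run(cmd, stderr=subprocess.STDOUT).decode('UTF-8')
        except subprocess.CalledProcessError:
            # Assumption: the failure is transient, so wait and try again.
            if attempt < attempts:
                time.sleep(delay)
    # Give up on this pool; the caller decides whether to skip or report it.
    return None
```

A caller building the pool list could then omit the quota for any pool where this returns `None`. Whether retrying is appropriate depends on why the command exits with status 2, which this report does not establish.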
