ceph-mon should report error when cluster is not in "Health_OK" state
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Ceph Monitor Charm | In Progress | Medium | Chris MacNaughton |
Bug Description
Hope this bug fits into ceph-mon space.
Currently, when the Ceph components are deployed, all ceph-related units report "ready" status.
If there is an issue with OSDs joining, PG replication, etc., no information is passed into the "juju status" output, and those using the ceph-client interface do not know that the cluster is in an unhealthy state, so they still try to connect, create pools, etc. This eventually leads to the PGs getting stuck, and the pools require re-creation.
ubuntu@ubuntu:~$ juju status|grep ceph
k8s-ceph-3 controller1 cloud1 2.4.3 unsupported 23:57:01Z
ceph-mon 10.2.10 active 3 ceph-mon jujucharms 27 ubuntu
ceph-osd 10.2.10 active 3 ceph-osd jujucharms 270 ubuntu
ceph-radosgw 10.2.10 active 1 ceph-radosgw jujucharms 260 ubuntu
ceph-mon/0 active idle 0/lxd/0 192.168.122.175 Unit is ready and clustered
ceph-mon/1 active idle 1/lxd/0 192.168.122.180 Unit is ready and clustered
ceph-mon/2* active idle 2/lxd/0 192.168.122.183 Unit is ready and clustered
ceph-osd/0 active idle 0 172.16.0.2 Unit is ready (2 OSD)
ceph-osd/1* active idle 1 172.16.0.3 Unit is ready (2 OSD)
ceph-osd/2 active idle 2 172.16.0.4 Unit is ready (2 OSD)
ceph-radosgw/0* active idle 0/lxd/1 192.168.122.186 80/tcp Unit is ready
ubuntu@ubuntu:~$ juju run --unit ceph-mon/0 'sudo ceph -s'
cluster 5370e132-
health HEALTH_ERR
148 pgs are stuck inactive for more than 300 seconds
148 pgs peering
148 pgs stuck inactive
148 pgs stuck unclean
200 requests are blocked > 32 sec
monmap e2: 3 mons at {juju-76a6af-
osdmap e68: 6 osds: 4 up, 4 in; 148 remapped pgs
flags sortbitwise,
pgmap v34649: 148 pgs, 19 pools, 1588 bytes data, 171 objects
169 MB used, 36650 MB / 36819 MB avail
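The behaviour requested here could be sketched roughly as follows. This is only an illustration, not the charm's actual code or the proposed fix: the function names `workload_status_from_health` and `read_ceph_health` are hypothetical, and mapping every non-`HEALTH_OK` state to a `blocked` workload status is an assumption about what the unit should report.

```python
import subprocess


def workload_status_from_health(health_output):
    """Map the text printed by `ceph health` to a (status, message) pair.

    Hypothetical helper: treating anything other than HEALTH_OK as
    "blocked" is an assumption made for this sketch.
    """
    text = health_output.strip()
    if not text:
        return ("blocked", "Unable to determine Ceph health")

    # The first token of the first line is the overall health state,
    # e.g. HEALTH_OK, HEALTH_WARN or HEALTH_ERR.
    first_line = text.splitlines()[0]
    parts = first_line.split(None, 1)
    state = parts[0]

    if state == "HEALTH_OK":
        return ("active", "Unit is ready and clustered")

    # Keep any summary text that follows the state so it shows up
    # in the status message.
    message = "Ceph health is {}".format(state)
    if len(parts) > 1:
        message += ": " + parts[1]
    return ("blocked", message)


def read_ceph_health():
    """Run `ceph health` on the monitor unit and return its output as text."""
    return subprocess.check_output(["ceph", "health"]).decode("utf-8")
```

On a ceph-mon unit, the pair returned by `workload_status_from_health(read_ceph_health())` could then be handed to Juju's `status-set` hook tool, so that a cluster in the state shown above would no longer appear as "Unit is ready and clustered".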
Changed in charm-ceph-mon:
assignee: nobody → Chris MacNaughton (chris.macnaughton)
status: New → In Progress
importance: Undecided → Medium
Fix proposed to branch: master
Review: https://review.openstack.org/629843