Pausing hacluster subordinate unit should remove nrpe checks on the principal
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
OpenStack Keystone Charm | New | Undecided | Gabriel Cocenza | | |
Bug Description
Bug 1880576 makes a hacluster unit not alert when it is paused (check_crm lists the unit in standby, but since it is paused, no alert needs to be triggered).
If, e.g., three keystone units are deployed with hacluster and nrpe as subordinate units, no alerts should fire on units where hacluster is paused.
Cases:
a) hacluster/1 is a subordinate of keystone/1, and hacluster/1 is paused
a.1) the check_haproxy nrpe check (set by the principal unit, keystone in this case) alerts because the haproxy.service systemd unit is stopped by Pacemaker
a.2) check_haproxy_
b) keystone/1 (the principal unit of the subordinate that was paused) is also paused:
b.1) the nrpe checks for apache2.service and memcached.service on keystone/1 will also alert
b.2) check_haproxy_
The mentioned issues are related to charm-keystone and not charm-hacluster, since the nrpe checks that alert are configured by charm-keystone.
The expected behaviour is that whenever a subordinate hacluster unit is paused, the principal is paused as well. Once the principal is paused, none of the checks configured by the principal unit should alert:
a) On the peer units, keystone/0 and keystone/2 should remove the keystone/1 IP from the backend configuration in haproxy.cfg (relying on a health check is not an option, because apache2 will effectively be down on the paused unit)
b) On the paused unit (keystone/1), the nrpe checks for apache2, memcached, haproxy and haproxy_servers should be removed (if the nrpe relation is present)
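To make the two expectations above concrete, here is a minimal Python sketch of the decision logic only. It is not taken from charm-keystone: the function names, the check shortnames and the example IPs are assumptions for illustration, and a real fix would have to hook into the charm's own pause/resume handlers and nrpe helper code.

```python
from typing import Dict, List, Set

# Check names as listed in this report; the shortnames actually
# registered by charm-keystone may differ (assumption).
PRINCIPAL_CHECKS = ["apache2", "memcached", "haproxy", "haproxy_servers"]


def checks_to_register(unit_paused: bool, nrpe_related: bool) -> List[str]:
    """Return the nrpe checks the principal should have registered.

    Hypothetical helper: a paused unit, or a unit without the nrpe
    relation, should have no principal-owned checks.
    """
    if unit_paused or not nrpe_related:
        return []
    return list(PRINCIPAL_CHECKS)


def haproxy_backends(peers: Dict[str, str], paused_units: Set[str]) -> Dict[str, str]:
    """Return the unit -> IP map to render into the haproxy.cfg backend.

    Hypothetical helper: paused peers are dropped so the remaining
    units do not report them as failed backend servers.
    """
    return {unit: ip for unit, ip in peers.items() if unit not in paused_units}


# Example with made-up addresses: keystone/1 is paused.
peers = {
    "keystone/0": "10.0.0.10",
    "keystone/1": "10.0.0.11",
    "keystone/2": "10.0.0.12",
}
print(checks_to_register(unit_paused=True, nrpe_related=True))
print(haproxy_backends(peers, paused_units={"keystone/1"}))
```

How the peers learn that keystone/1 is paused (for example, via a flag on the cluster relation) is left open here and would need to be decided in the charm.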
Changed in charm-keystone: | |
assignee: | nobody → Gabriel Angelo Sgarbi Cocenza (gabrielcocenza) |
This issue probably affects all principal charms that provide OpenStack API services.