Add checks for baremetal node health for ironic
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
charm-openstack-service-checks | New | Undecided | Unassigned | | |
Bug Description
OpenStack Ironic "baremetal nodes" should be monitored for nodes in the "Maintenance=True" state as well as provisioning_
For instance, all nodes should have provisioning state of one of the following:
active
available
manageable (this should probably trigger a warning state, as the machine is not consumable by cloud users)
cleaning
*wait* (such as clean wait, callback wait, etc)
If the provisioning state is "error", "clean failed", or "manageable", we should set an alertable state.
Also, if Maintenance = True, the machine is not available for cloud user consumption, so it should also set an alertable state.
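To make the proposed rules concrete, here is a rough sketch of how a check might map a node's provisioning state and maintenance flag to a Nagios-style severity. The function name `classify_node`, the exact severity chosen for each state, and the treatment of unlisted states as a warning are all assumptions for illustration, not something the charm currently implements.

```python
# Nagios-style return codes.
OK, WARNING, CRITICAL = 0, 1, 2

# State groupings taken from the list above; severities are an assumption.
HEALTHY_STATES = {"active", "available", "cleaning"}
WARNING_STATES = {"manageable"}
CRITICAL_STATES = {"error", "clean failed"}


def classify_node(provision_state, maintenance):
    """Return a severity for one node (hypothetical helper)."""
    state = provision_state.strip().lower()
    if state in CRITICAL_STATES:
        severity = CRITICAL
    elif state in WARNING_STATES:
        severity = WARNING
    elif state in HEALTHY_STATES or "wait" in state.split():
        # Any "*wait*" state (for example "clean wait") is treated as healthy.
        severity = OK
    else:
        # Assumption: a state outside the expected list deserves a warning.
        severity = WARNING
    if maintenance:
        # A node in maintenance is not consumable, so alert at least WARNING.
        severity = max(severity, WARNING)
    return severity
```

With this mapping, a node that is "available" but in maintenance would raise a warning, while a node in "clean failed" would be critical regardless of its maintenance flag.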
The command to query is "openstack baremetal node list". The checks should be added only if the OpenStack endpoint list includes a service with service_name=ironic or service_
It might be nice to have two checks: one for maintenance mode, which can be silenced, and one for provisioning_state, which would still alert on baremetal nodes that go into 'error' or 'clean failed'.
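A minimal sketch of the two-check idea follows. It assumes the node list is obtained as JSON (for example via "openstack baremetal node list -f json") and that the records carry "UUID", "Name", "Provisioning State" and "Maintenance" keys; those key names, the sample data, and both function names are assumptions for illustration.

```python
import json

OK, WARNING, CRITICAL = 0, 1, 2
FAILED_STATES = {"error", "clean failed"}

# Hypothetical sample of the JSON node list; real output may differ.
SAMPLE = """[
  {"UUID": "u1", "Name": "node-1", "Provisioning State": "active", "Maintenance": false},
  {"UUID": "u2", "Name": "node-2", "Provisioning State": "clean failed", "Maintenance": false},
  {"UUID": "u3", "Name": null, "Provisioning State": "available", "Maintenance": true}
]"""


def _label(node):
    # Fall back to the UUID when a node has no name.
    return node.get("Name") or node["UUID"]


def check_maintenance(nodes):
    """Check 1: alert on nodes in maintenance (can be silenced on its own)."""
    flagged = sorted(_label(n) for n in nodes if n["Maintenance"])
    if not flagged:
        return OK, "OK: no nodes in maintenance"
    return WARNING, "WARNING: {} node(s) in maintenance: {}".format(
        len(flagged), ", ".join(flagged))


def check_provisioning(nodes):
    """Check 2: alert on nodes whose provisioning state indicates failure."""
    failed = sorted(
        "{} ({})".format(_label(n), n["Provisioning State"])
        for n in nodes
        if n["Provisioning State"].lower() in FAILED_STATES
    )
    if not failed:
        return OK, "OK: no nodes in a failed provisioning state"
    return CRITICAL, "CRITICAL: {} node(s) in failed provisioning state: {}".format(
        len(failed), ", ".join(failed))


nodes = json.loads(SAMPLE)
maintenance_result = check_maintenance(nodes)
provisioning_result = check_provisioning(nodes)
```

Keeping the two results separate means an operator could acknowledge or silence the maintenance alert during planned work without hiding a node that later drops into 'error' or 'clean failed'.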