[RFE] Scan "maintenance" nodes to bring them back to normal

Bug #1554686 reported by xiaobin
This bug affects 1 person
Affects: Ironic
Status: Confirmed
Importance: Wishlist
Assigned to: Unassigned
Milestone: (none)

Bug Description

Ironic nodes can end up in maintenance for various reasons, for example when the BMC is temporarily offline or unreachable.
Once a node is in maintenance, an operator has to take it out of that state manually, which is unacceptable for large deployments.

The proposal: when running "_sync_power_states", do not exclude "maintenance" nodes; scan them at a longer interval instead, and if "do_sync_power_state" succeeds, clear the "maintenance" flag.

This way, "maintenance" nodes can leave "maintenance" without human intervention.
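
The proposed loop could look roughly like the sketch below. It is not Ironic code: the `Node` class, `rescan_maintenance_nodes`, and the `sync_power_state` callable are hypothetical stand-ins, and the sketch assumes the sync call raises an exception when the BMC is still unreachable. It would be scheduled on its own, longer period than the regular power sync.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Node:
    """Hypothetical stand-in for an Ironic node record."""
    uuid: str
    maintenance: bool = False
    maintenance_reason: str = ""


def rescan_maintenance_nodes(
    nodes: Iterable[Node],
    sync_power_state: Callable[[Node], None],
) -> List[str]:
    """Try a power sync on each maintenance node; clear the flag on success.

    Returns the UUIDs of the nodes taken out of maintenance.
    """
    recovered = []
    for node in nodes:
        if not node.maintenance:
            continue  # regular nodes are covered by the normal sync pass
        try:
            # Assumption: raises if the BMC is still unreachable.
            sync_power_state(node)
        except Exception:
            continue  # still failing: leave the node in maintenance
        node.maintenance = False
        node.maintenance_reason = ""
        recovered.append(node.uuid)
    return recovered
```

Note that this version clears maintenance on every node whose sync succeeds, including nodes an operator put into maintenance on purpose.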

UPD from dtantsur: we need a way (probably a new flag on a node) to distinguish between nodes that entered maintenance automatically and nodes that were moved into maintenance via the API.

Tags: needs-spec rfe
Revision history for this message
Haomeng,Wang (whaom) wrote :

Xiao Bin, good catch. I think this needs a spec, so I will prepare one and submit it for spec review first.

Changed in ironic:
assignee: nobody → Haomeng,Wang (whaom)
Haomeng,Wang (whaom) wrote :

It also looks difficult to tell which nodes were put into maintenance by the '_sync_power_states' call and which were set manually via the CLI/API, so we need to introduce a new field indicating that the maintenance status was set by the '_sync_power_states' call. Does that make sense?
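
A minimal sketch of such an indicator, for discussion only. The field name `maintenance_set_by`, its values, and the helper functions are all hypothetical and not part of Ironic; the point is only that automatic recovery is limited to nodes the power sync itself put into maintenance.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical values for the proposed indicator field.
SET_BY_API = "api"
SET_BY_POWER_SYNC = "power_sync"


@dataclass
class Node:
    """Hypothetical stand-in for an Ironic node record."""
    uuid: str
    maintenance: bool = False
    maintenance_set_by: Optional[str] = None  # hypothetical new field


def enter_maintenance(node: Node, set_by: str) -> None:
    """Put a node into maintenance and record who did it."""
    node.maintenance = True
    node.maintenance_set_by = set_by


def auto_recover(node: Node) -> bool:
    """Clear maintenance only if the power sync set it; report the outcome."""
    if not node.maintenance or node.maintenance_set_by != SET_BY_POWER_SYNC:
        return False  # set via CLI/API (or not in maintenance): do not touch
    node.maintenance = False
    node.maintenance_set_by = None
    return True
```

With this, a node an operator moved into maintenance through the API stays there until the operator takes it out.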

xiaobin (jxiaobin) wrote :

Yes, that makes sense, thanks!

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to ironic-specs (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/292190

OpenStack Infra (hudson-openstack) wrote : Change abandoned on ironic-specs (master)

Change abandoned by Haomeng,Wang (<email address hidden>) on branch: master
Review: https://review.openstack.org/292190

Dmitry Tantsur (divius) wrote :

At the Austin summit we agreed that this is a valuable proposal. As the current effort seems abandoned, anyone is free to pick it up. Thanks!

Changed in ironic:
status: New → Confirmed
assignee: Haomeng,Wang (whaom) → nobody
importance: Undecided → Wishlist
tags: added: rfe
description: updated
Ruby Loo (rloo)
tags: added: needs-spec
Kaifeng Wang (kaifeng) wrote :

Hi dtantsur and ruby, I am interested in this feature. I also noticed that it somewhat duplicates the RFE https://bugs.launchpad.net/ironic/+bug/1596107, which seems to provide a more detailed solution. Which one should I start working on?
