a failed power_state change wedges the power_state
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Ironic | Fix Released | Critical | Ruby Loo | |
Bug Description
PUT'ing an invalid power state string to the API will wedge the node's power state. I suspect that any exception raised by a power_driver during set_node_
There may be two separate issues at play here:
1) power_state and target_power_state being changed inappropriately for an invalid input
2) no way currently exists via the API to override a non-null target_power_state, even when the current power_state is ERROR.
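To illustrate issue 1, here is a minimal sketch of rejecting an invalid string before any node field is touched. It is not Ironic's actual code: the function name `request_power_state`, the `InvalidPowerState` exception, the dict-based node, and the `VALID_TARGETS` set are all hypothetical. Only the string "power off" is confirmed by this report; "power on" is an assumption.

```python
# Hypothetical set of accepted target states; only "power off" appears in
# this report, "power on" is assumed for illustration.
VALID_TARGETS = {"power on", "power off"}


class InvalidPowerState(ValueError):
    """Raised when the requested state is not a recognised target."""


def request_power_state(node: dict, new_state: str) -> None:
    """Record a requested power state on a node (modelled as a plain dict).

    Validation happens before any field is modified, so a bad string
    leaves both power_state and target_power_state exactly as they were.
    Handling of driver failures is deliberately left out of this sketch.
    """
    if new_state not in VALID_TARGETS:
        raise InvalidPowerState("unknown power state: %r" % new_state)
    node["target_power_state"] = new_state


node = {"power_state": "power off", "target_power_state": None}
try:
    request_power_state(node, "POWEROFF")  # the bad string from this report
except InvalidPowerState:
    pass  # node is unchanged, so it is not wedged
```

With this ordering, a rejected request cannot leave target_power_state set, which is the condition that blocks later requests in the logs below.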
Logs:
########### POST a bad string. POWEROFF should be "power off".
$ curl -X POST http://
% Total % Received % Xferd Average Speed Time Time Time Current
100 186 100 186 0 0 1426 0 --:--:-- --:--:-- --:--:-- 1441
{
"error_message" : "<html>\n <head>\n <title>404 Not Found</title>\n </head>\n <body>\n <h1>404 Not Found</h1>\n The resource could not be found.<br /><br />\n\n\n\n </body>\n</html>
"
}
########### ironic-conductor log file
2013-10-09 21:55:30,135.135 17888 ERROR ironic.
2013-10-09 21:55:30,135.135 17888 TRACE ironic.
2013-10-09 21:55:30,135.135 17888 TRACE ironic.
2013-10-09 21:55:30,135.135 17888 TRACE ironic.
2013-10-09 21:55:30,135.135 17888 TRACE ironic.
172, in dispatch
2013-10-09 21:55:30,135.135 17888 TRACE ironic.
2013-10-09 21:55:30,135.135 17888 TRACE ironic.
e_node_power_state
2013-10-09 21:55:30,135.135 17888 TRACE ironic.
2013-10-09 21:55:30,135.135 17888 TRACE ironic.
rapper
2013-10-09 21:55:30,135.135 17888 TRACE ironic.
2013-10-09 21:55:30,135.135 17888 TRACE ironic.
_power_state
2013-10-09 21:55:30,135.135 17888 TRACE ironic.
2013-10-09 21:55:30,135.135 17888 TRACE ironic.
############### relevant node record at this point
mysql> select * from nodes\G
*******
target_
############# error if I try to change the state again
$ curl -X PUT -d '' http://
% Total % Received % Xferd Average Speed Time Time Time Current
100 133 100 133 0 0 1506 0 --:--:-- --:--:-- --:--:-- 1528
{
"error_message" : "{\"debuginfo\": null, \"faultcode\": \"Client\", \"faultstring\": \"One power operation is already in process\"}"
}
Changed in ironic:
status: New → Triaged
importance: Undecided → High
Changed in ironic:
assignee: nobody → Devananda van der Veen (devananda)
Changed in ironic:
assignee: Devananda van der Veen (devananda) → nobody
Changed in ironic:
assignee: nobody → Ruby Loo (rloo)
Changed in ironic:
milestone: none → icehouse-1
status: Fix Committed → Fix Released
Changed in ironic:
milestone: icehouse-1 → 2014.1
We discussed this in IRC and agreed that there are two distinct problems here. I'm opening a separate bug for the "API doesn't validate requested state" issue.
To clarify this bug, the issue can be seen from two perspectives.
* power_state should represent the actual power state, so if a machine is OFF, power_state should read OFF, not ERROR, even when the last state change request failed
* there's no way at present to recover a node when target_power_state != NULL, but setting it to NULL when catching an error would eat the user's data, and they'd never know that their request failed.
It was suggested to redo the state representation in a way that allows us more flexibility. Something akin to this structure:
power_state: {
'current':
'updated_at':
'requested':
'requested_at':
'error':
'error_at':
}
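The structure above could be modelled roughly as follows. This is only a sketch under stated assumptions, not the design that was adopted: the class name `PowerState`, the methods `request`, `succeed` and `fail`, and the rule that a failed request may be superseded are all hypothetical. Only the six field names come from the proposal.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


def _now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class PowerState:
    # Field names follow the proposed structure; everything else is assumed.
    current: Optional[str] = None
    updated_at: Optional[datetime] = None
    requested: Optional[str] = None
    requested_at: Optional[datetime] = None
    error: Optional[str] = None
    error_at: Optional[datetime] = None

    def request(self, state: str) -> None:
        # Refuse only while an earlier request is still pending without an
        # error; a failed request can be replaced, so the node never wedges.
        if self.requested is not None and self.error is None:
            raise RuntimeError("One power operation is already in process")
        self.requested = state
        self.requested_at = _now()
        self.error = None
        self.error_at = None

    def succeed(self) -> None:
        # The pending request completed: it becomes the current state.
        self.current = self.requested
        self.updated_at = _now()
        self.requested = None
        self.requested_at = None

    def fail(self, message: str) -> None:
        # 'current' is left alone, and 'requested' is kept so the user can
        # still see which request failed and why.
        self.error = message
        self.error_at = _now()


state = PowerState(current="power off")
state.request("POWEROFF")          # the bad string from this report
state.fail("invalid power state")  # current stays "power off"
state.request("power on")          # recovery: the failed request is replaced
state.succeed()
```

Keeping the error separate from `current` addresses the first perspective (the actual state is never overwritten with ERROR), and keeping `requested` after a failure addresses the second (the user's request is not silently discarded).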