nova service-delete fails for services on non-child (top) cell
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Fix Released | Medium | Sean Dague |
Juno | Fix Released | Undecided | Unassigned |
Bug Description
Nova service-delete fails for services on non-child (top) cell.
How to reproduce:
$ nova --os-username admin service-list
+------
| Id | Binary | Host | Zone | Status | State | Updated_at | Disabled Reason |
+------
| region!child@1 | nova-conductor | region!child@ubuntu | internal | enabled | up | 2014-08-
| region!child@2 | nova-compute | region!child@ubuntu | nova | enabled | up | 2014-08-
| region!child@3 | nova-cells | region!child@ubuntu | internal | enabled | up | 2014-08-
| region!child@4 | nova-scheduler | region!child@ubuntu | internal | enabled | up | 2014-08-
| region@1 | nova-cells | region@ubuntu | internal | enabled | up | 2014-08-
| region@2 | nova-cert | region@ubuntu | internal | enabled | up | 2014-08-
| region@3 | nova-consoleauth | region@ubuntu | internal | enabled | up | 2014-08-
+------
Stop one of the services on the top cell (e.g. nova-cert).
$ nova --os-username admin service-list
+------
| Id | Binary | Host | Zone | Status | State | Updated_at | Disabled Reason |
+------
| region!child@1 | nova-conductor | region!child@ubuntu | internal | enabled | up | 2014-08-
| region!child@2 | nova-compute | region!child@ubuntu | nova | enabled | up | 2014-08-
| region!child@3 | nova-cells | region!child@ubuntu | internal | enabled | up | 2014-08-
| region!child@4 | nova-scheduler | region!child@ubuntu | internal | enabled | up | 2014-08-
| region@1 | nova-cells | region@ubuntu | internal | enabled | up | 2014-08-
| region@2 | nova-cert | region@ubuntu | internal | enabled | down | 2014-08-
| region@3 | nova-consoleauth | region@ubuntu | internal | enabled | up | 2014-08-
+------
Nova service-delete:
$ nova --os-username admin service-delete 'region@2'
Check the request id from nova-api.log:
2014-08-18 15:10:23.491 INFO nova.osapi_
Error log in n-cell-region service:
2014-08-18 15:10:23.464 ERROR nova.cells.
2014-08-18 15:10:23.464 TRACE nova.cells.
2014-08-18 15:10:23.464 TRACE nova.cells.
2014-08-18 15:10:23.464 TRACE nova.cells.
2014-08-18 15:10:23.464 TRACE nova.cells.
2014-08-18 15:10:23.464 TRACE nova.cells.
2014-08-18 15:10:23.464 TRACE nova.cells.
Appendix:
For services on a child cell, there is no issue.
$ nova --os-username admin service-list
+------
| Id | Binary | Host | Zone | Status | State | Updated_at | Disabled Reason |
+------
| region!child@1 | nova-conductor | region!child@ubuntu | internal | enabled | up | 2014-08-
| region!child@2 | nova-compute | region!child@ubuntu | nova | enabled | down | 2014-08-
| region!child@3 | nova-cells | region!child@ubuntu | internal | enabled | up | 2014-08-
| region!child@4 | nova-scheduler | region!child@ubuntu | internal | enabled | up | 2014-08-
| region@1 | nova-cells | region@ubuntu | internal | enabled | up | 2014-08-
| region@2 | nova-cert | region@ubuntu | internal | enabled | down | 2014-08-
| region@3 | nova-consoleauth | region@ubuntu | internal | enabled | up | 2014-08-
+------
Delete child cell service:
$ nova --os-username admin service-delete 'region!child@2'
$ nova --os-username admin service-list
+------
| Id | Binary | Host | Zone | Status | State | Updated_at | Disabled Reason |
+------
| region!child@1 | nova-conductor | region!child@ubuntu | internal | enabled | up | 2014-08-
| region!child@3 | nova-cells | region!child@ubuntu | internal | enabled | up | 2014-08-
| region!child@4 | nova-scheduler | region!child@ubuntu | internal | enabled | up | 2014-08-
| region@1 | nova-cells | region@ubuntu | internal | enabled | up | 2014-08-
| region@2 | nova-cert | region@ubuntu | internal | enabled | down | 2014-08-
| region@3 | nova-consoleauth | region@ubuntu | internal | enabled | up | 2014-08-
+------
Changed in nova:
assignee: nobody → Rajesh Tailor (rajesh-tailor)
Changed in nova:
assignee: RedBaron (dheeraj-gupta4) → Sean Dague (sdague)
Changed in nova:
milestone: none → kilo-2
status: Fix Committed → Fix Released
Changed in nova:
milestone: kilo-2 → 2015.1.0
Changed in nova:
importance: Undecided → Medium
To my mind, this bug occurs because both of the following cases
1. A service is to be deleted at the top level
2. A deletion has to be initiated at the top level (a nova service-delete command, for instance)
are handled by the same method in nova.compute.cells_api.HostAPI:
    def service_delete(self, context, service_id):
        """Deletes the specified service."""
        self.cells_rpcapi.service_delete(context, service_id)
The method does the correct thing when the command is initiated: it splits the ID into the cell name and the service ID and forwards the message to the cell concerned. On receiving the message, the target cell calls the corresponding method in nova.cells.messaging._TargetedMessageMethods:
    def service_delete(self, message, service_id):
        """Deletes the specified service."""
        self.host_api.service_delete(message.ctxt, service_id)
For child cells this is fine, since self.host_api points to nova.compute.HostAPI. For an API cell, however, self.host_api is nova.compute.cells_api.HostAPI, which initiated the message in the first place. The top-level cell thus has no way of knowing that the user wants to delete a service on _this_ cell, so it forwards the request again (as it did in the beginning).
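The difference between the two cells can be modelled with a small toy sketch. The three classes are simplified stand-ins written for illustration only; they are not nova's classes, and the names LocalHostAPI and ForwardingHostAPI are hypothetical. The sketch only shows that the same handler deletes locally in a child cell but forwards again in the top cell, depending on which host API object it holds.

```python
class LocalHostAPI:
    """Stand-in for nova.compute.HostAPI: acts on the cell's own services."""

    def __init__(self, services):
        self.services = services  # maps service id -> binary name

    def service_delete(self, context, service_id):
        del self.services[service_id]
        return "deleted locally"


class ForwardingHostAPI:
    """Stand-in for nova.compute.cells_api.HostAPI: never deletes, only forwards."""

    def __init__(self):
        self.forwarded = []  # records every id it tried to route onward

    def service_delete(self, context, service_id):
        self.forwarded.append(service_id)
        return "forwarded again"


class TargetedMessageMethods:
    """Stand-in for the handler that runs in the target cell."""

    def __init__(self, host_api):
        self.host_api = host_api

    def service_delete(self, message_ctxt, service_id):
        # Same code path in every cell; the behaviour depends on host_api.
        return self.host_api.service_delete(message_ctxt, service_id)


# Child cell: the handler holds a local API, so the service is removed.
child = TargetedMessageMethods(LocalHostAPI({2: "nova-compute"}))
child_result = child.service_delete(None, 2)

# Top (API) cell: the handler holds the forwarding API, so nothing is removed.
top = TargetedMessageMethods(ForwardingHostAPI())
top_result = top.service_delete(None, 2)

print(child_result)
print(top_result)
```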
As an example, if the service ID to delete is 'toplevel@4', i.e. in the API cell, the API cell splits out the cell and service using a cells.utils function and gets cell_name: toplevel, service_id: 4. It then forwards the message to itself. On receiving this message it gets only service_id: 4 and (again) tries to split it for forwarding. This time the split returns cell_name: None, and hence the error occurs while trying to route the message.
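The double split can be reproduced with a few lines of Python. The helper below is an illustrative stand-in, not the actual cells.utils function; the '@' separator is an assumption taken from the IDs shown in the service-list output (e.g. 'region@2').

```python
CELL_ITEM_SEP = "@"  # assumed separator, inferred from IDs such as 'region@2'


def split_cell_and_item(cell_and_item):
    """Hypothetical helper: split 'cell@item' into (cell_name, item).

    Returns (None, item) when the input carries no cell name.
    """
    parts = str(cell_and_item).rsplit(CELL_ITEM_SEP, 1)
    if len(parts) == 1:
        return None, parts[0]
    return parts[0], parts[1]


# First hop: the API cell receives the full ID from the user.
first_hop = split_cell_and_item("toplevel@4")

# Second hop: the API cell receives only the bare service id and splits again.
second_hop = split_cell_and_item(first_hop[1])

print(first_hop)
print(second_hop)
```

On the second hop the cell name comes back as None, which matches the routing failure described above. A child-cell ID such as 'region!child@2' is split only once by the top cell, so the child receives the bare ID and deletes the service locally.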
The same problem exists in the service_update method too.