Deletion of instances will be stuck forever if any deletion hangs in 'multipath -r'
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Confirmed | Low | Unassigned |
Bug Description
I created about 25 VMs from bootable volumes. After that finished,
I ran a script to delete all of them in a very short time.
What I saw was that all of the VMs stayed in 'deleting' status and were never deleted, even after waiting for hours.
From the ps command:
stack@ubuntu-
root 8205 0.0 0.0 504988 5560 ? SLl Apr22 0:01 /sbin/multipathd
root 115515 0.0 0.0 64968 2144 pts/3 S+ Apr22 0:00 sudo nova-rootwrap /etc/nova/
root 115516 0.0 0.0 42240 9488 pts/3 S+ Apr22 0:00 /usr/bin/python /usr/local/
root 115525 0.0 0.0 41792 2592 pts/3 S+ Apr22 0:00 /sbin/multipath -r
stack 151825 0.0 0.0 11744 936 pts/0 S+ 02:10 0:00 grep --color=auto multipath
Then I killed the 'multipath -r' commands, and all of the VMs went into ERROR status.
After digging into the Nova code, I found that Nova always tries to take a global file lock:
@utils.
def disconnect_
"""Detach the volume from instance_name."""
......
if self.use_multipath and multipath_device:
return self._disconnec
and then rescans iSCSI by running 'multipath -r':
def _disconnect_
In my case, 'multipath -r' hung for a very long time and did not exit for several hours.
In addition, this blocked the deletion of all VM instances on the same Nova node.
IMO, Nova should not wait forever on a blocking command; at the very least, a timeout is needed for commands such as 'multipath -r' and 'multipath -ll'.
Or is there any other solution for my case?
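To make the timeout suggestion concrete, here is a minimal sketch using only the Python standard library. It is not Nova's code: the names run_with_timeout and CommandTimedOut are hypothetical, and a sleeping child process stands in for a hung 'multipath -r'. In the report the command runs through sudo nova-rootwrap, so killing the direct child might not terminate the underlying multipath process; that case is not handled here.

```python
import subprocess
import sys


class CommandTimedOut(Exception):
    """Raised when a command does not finish within its time budget."""


def run_with_timeout(cmd, timeout):
    """Run cmd (a list of arguments) and return its stdout as text.

    subprocess.run() kills the child and raises TimeoutExpired when the
    timeout elapses; that is translated into CommandTimedOut here.
    """
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, check=True, timeout=timeout
        )
    except subprocess.TimeoutExpired as exc:
        raise CommandTimedOut(
            "%s did not finish within %s seconds" % (" ".join(cmd), timeout)
        ) from exc
    return result.stdout


if __name__ == "__main__":
    # Stand-in for a hung 'multipath -r': a child that sleeps for 30 s.
    hung = [sys.executable, "-c", "import time; time.sleep(30)"]
    try:
        run_with_timeout(hung, timeout=1)
    except CommandTimedOut as err:
        print("gave up:", err)
```

With something like this, the caller gets an exception after a bounded wait and can put the instance into an error state instead of waiting indefinitely.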
MY ENVIRONMENT:
Ubuntu Server 14:
multipath-tools
multipath enabled in Nova node
Thanks
Peter
information type: | Public → Public Security |
information type: | Public Security → Public |
tags: | added: volumes |
Changed in nova: | |
status: | New → Confirmed |
importance: | Undecided → Low |
tags: | added: multipath |
I am pushing a commit to oslo.concurrency to support a timeout (https://review.openstack.org/#/c/177030/).
Once it is merged and released, we can use it to put a timeout on the multipath command.
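As a rough illustration of why a timeout would also unblock the other deletions on the same node, the sketch below serializes two disconnects behind one lock. It is an assumption-laden stand-in, not the Nova or oslo.concurrency code: _node_lock is a plain threading.Lock in place of the global file lock, disconnect_volume is a hypothetical function, and the rescan commands are dummy child processes.

```python
import subprocess
import sys
import threading

# Hypothetical stand-in for the node-wide lock taken around disconnect.
_node_lock = threading.Lock()


def disconnect_volume(name, rescan_cmd, timeout, log):
    """Run rescan_cmd under the shared lock, giving up after timeout seconds."""
    with _node_lock:
        try:
            subprocess.run(rescan_cmd, check=True, timeout=timeout)
            log.append((name, "done"))
        except subprocess.TimeoutExpired:
            # The lock is released when the 'with' block exits, so the
            # next waiter can proceed even though this rescan gave up.
            log.append((name, "timed out"))


if __name__ == "__main__":
    hung = [sys.executable, "-c", "import time; time.sleep(30)"]
    quick = [sys.executable, "-c", "pass"]
    log = []
    workers = [
        threading.Thread(target=disconnect_volume, args=("vm-1", hung, 1, log)),
        threading.Thread(target=disconnect_volume, args=("vm-2", quick, 10, log)),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    print(log)
```

Whichever thread takes the lock first, both finish: the hung rescan is abandoned after its timeout and the other disconnect completes, rather than every deletion queuing behind one stuck command.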