IO stuck causes nova compute agent outage
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
OpenStack Compute (nova) | Confirmed | Undecided | Unassigned | | |
Bug Description
Description:
============
Due to an overload situation in our storage, one NFS mount got stuck.
All other mount points were accessible and working.
Deleting a VM on this hypervisor was not possible because nova-compute was unresponsive.
The agent was flagged as:
> nova-manage service list
nova-compute de4-2e-ff-0d-44-a4 nova enabled XXX 2017-05-16 11:49:00.577943
The nova-compute service scans all attached volume paths (ephemeral and cinder).
A single stale NFS mount blocks this scan and stalls the whole agent.
With an inactive agent, no operations are possible, not even VM deletion.
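To illustrate the failure mode, here is a minimal sketch of probing each volume path with a timeout so that one hung mount cannot block the whole scan. This is not nova's code and not the proposed fix; the function names `path_responds` and `responsive_paths` and the 5-second default are assumptions made up for this example.

```python
import os
import threading


def path_responds(path, timeout=5.0):
    """Return True if os.stat(path) succeeds within `timeout` seconds.

    Hypothetical helper, not part of nova.
    """
    result = {}

    def probe():
        try:
            os.stat(path)
            result["ok"] = True
        except OSError:
            # The path is missing or unreadable, but the call returned promptly.
            result["ok"] = False

    # A daemon thread does not keep the process alive if the stat call never returns.
    worker = threading.Thread(target=probe, daemon=True)
    worker.start()
    worker.join(timeout)
    # A thread that is still alive means the stat call is blocked,
    # for example on a stale NFS mount.
    return not worker.is_alive() and result.get("ok", False)


def responsive_paths(paths, timeout=5.0):
    """Return only the paths that answered in time (hypothetical helper)."""
    return [p for p in paths if path_responds(p, timeout)]
```

The probe runs in a separate thread because a call blocked on a hung NFS mount usually cannot be interrupted; the caller simply stops waiting and moves on to the next path, while the blocked thread stays behind until the mount recovers.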
Steps to reproduce:
===================
1.) Boot a VM
2.) Attach a volume
3.) Make the NFS backend inaccessible (e.g. using an iptables DROP rule)
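For step 3, one possible way to build the DROP rule is sketched below. The report does not say which rule was used, so the chain (`OUTPUT`), the choice to drop all traffic to the server, the helper name `nfs_block_rule`, and the example address `192.0.2.10` are all assumptions. The function only builds the argument list; it does not run anything.

```python
def nfs_block_rule(server_ip, remove=False):
    """Build an iptables command that drops outgoing traffic to the NFS server.

    Hypothetical helper: -A appends the rule, -D deletes it again.
    """
    action = "-D" if remove else "-A"
    return ["iptables", action, "OUTPUT", "-d", server_ip, "-j", "DROP"]
```

Running the resulting command as root on the hypervisor (for example via `subprocess.run(nfs_block_rule("192.0.2.10"), check=True)`) should make the mount hang; calling it again with `remove=True` restores access.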
summary: |
- NFS stale causes nova compute agent outage
+ IO stuck causes nova compute agent outage |
description: | updated |
Changed in nova: | |
status: | New → Confirmed |
Fix proposed to branch: master
Review: https://review.openstack.org/465653