vhost guest network randomly drops under stress (kvm)
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
The Ubuntu-power-systems project | Fix Released | High | Canonical Kernel Team |
linux (Ubuntu) | Fix Released | High | Joseph Salisbury |
Zesty | Fix Released | High | Joseph Salisbury |
Bug Description
== SRU Justification ==
A vhost performance patch was introduced in the 4.10 kernel upstream, and is currently included in the Zesty 4.10 kernel:
commit 809ecb9bca6a942
Author: Jason Wang <email address hidden>
Date: Mon Dec 12 14:46:49 2016 +0800
vhost: cache used event for better performance
--
However, I recently hit a functional issue linked to this patch that causes random guests to lose their network connection under stress. The issue is not architecture-specific and is more likely to be hit under high network stress (e.g. many uperf instances).
The patch author has now reverted this patch upstream:
https:/
which reads:
"
Revert "vhost: cache used event for better performance"
This reverts commit 809ecb9bca6a942. Since it
was reported to break vhost_net. We want to cache used event and use
it to check for notification. The assumption was that guest won't move
the event idx back, but this could happen in fact when 16 bit index
wraps around after 64K entries.
Signed-off-by: Jason Wang <email address hidden>
Acked-by: Michael S. Tsirkin <email address hidden>
Signed-off-by: David S. Miller <email address hidden>
"
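The wraparound problem described in the revert message can be illustrated with a small model. The sketch below is not the kernel code: `need_event` follows the arithmetic of `vring_need_event()` from the virtio specification, while `cached_shortcut_skips`, the ring size of 256, and the index values are assumptions chosen only to show how a stale cached event index can suppress a notification that the guest's current event index would require.

```python
MASK = 0xFFFF  # virtio ring indices are 16 bits wide and wrap at 65536


def need_event(event_idx: int, new_idx: int, old_idx: int) -> bool:
    """Return True if advancing the used index from old_idx to new_idx
    passes event_idx. Same arithmetic as vring_need_event() in the
    virtio specification, performed modulo 2**16."""
    return ((new_idx - event_idx - 1) & MASK) < ((new_idx - old_idx) & MASK)


def cached_shortcut_skips(cached_event: int, ring_size: int,
                          new_idx: int, old_idx: int) -> bool:
    """Hypothetical model of a cached-event shortcut (an assumption, not
    the kernel's implementation): skip the notification without
    re-reading the guest's event index when the cached value looks
    'ahead' of new_idx and the update did not cross it."""
    ahead = need_event(cached_event, (new_idx + ring_size) & MASK, new_idx)
    crossed = need_event(cached_event, new_idx, old_idx)
    return ahead and not crossed


# Assumed values for illustration only.
RING_SIZE = 256
CACHED_EVENT = 300   # value cached before the index wrapped
GUEST_EVENT = 46     # value the guest has written since then
OLD_IDX, NEW_IDX = 46, 47

if __name__ == "__main__":
    # The plain check handles wraparound correctly on its own.
    print("wrap handled:", need_event(65535, 1, 65534))
    # The fresh value says the guest must be notified ...
    print("fresh check notifies:", need_event(GUEST_EVENT, NEW_IDX, OLD_IDX))
    # ... but the stale cached value makes the shortcut skip it.
    print("cached shortcut skips:",
          cached_shortcut_skips(CACHED_EVENT, RING_SIZE, NEW_IDX, OLD_IDX))
```

In this model the guest waits for a notification that is never sent, which is consistent with the symptom of guests losing their network connection under load.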
I am requesting that this revert of the problematic patch be pulled into Ubuntu Zesty (and any other kernel at 4.10 or later).
---uname output---
Linux p82qvirt 4.10.0-32-generic #36~16.04.1-Ubuntu SMP Wed Aug 9 09:19:19 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux
Machine Type = 8247-22L
---Steps to Reproduce---
I can recreate the scenario with the following setup:
- On a 20-core host, start 20 single-core VMs.
- Attach all guests to a single Linux bridge using virtio.
- Start a uperf benchmark between each guest pair (10 pairs in total) with a high uperf nprocs value (32).
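The pairing in the steps above could be scripted roughly as follows. This is only a sketch: the guest names, the `profile.xml` file, and the convention that the uperf profile reads the `h` and `nprocs` environment variables are assumptions, not details from this report. It prints the commands for each guest rather than running them.

```python
def pair_guests(guests):
    """Pair consecutive guests: (g0, g1), (g2, g3), ..."""
    if len(guests) % 2:
        raise ValueError("need an even number of guests")
    return list(zip(guests[0::2], guests[1::2]))


# Hypothetical guest names; substitute the real hostnames or addresses.
GUESTS = [f"guest{i:02d}" for i in range(20)]
NPROCS = 32  # uperf nprocs value from the reproduction steps

pairs = pair_guests(GUESTS)

if __name__ == "__main__":
    for server, client in pairs:
        # 'uperf -s' runs the receiving side; 'uperf -m' runs a profile.
        print(f"{server}: uperf -s")
        # Assumes profile.xml references $h and $nprocs (an assumption).
        print(f"{client}: h={server} nprocs={NPROCS} uperf -m profile.xml")
```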
tags: added: architecture-ppc64le bugnameltc-157775 severity-high targetmilestone-inin16043
Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → linux (Ubuntu)
Changed in ubuntu-power-systems:
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
importance: Undecided → High
tags: added: kernel-da-key
Changed in linux (Ubuntu):
status: New → Triaged
importance: Undecided → High
Changed in ubuntu-power-systems:
status: New → Triaged
Changed in ubuntu-power-systems:
status: Triaged → In Progress
Changed in linux (Ubuntu Zesty):
importance: Undecided → High
status: New → In Progress
description: updated
Changed in linux (Ubuntu Zesty):
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Zesty):
status: In Progress → Fix Committed
Changed in ubuntu-power-systems:
status: In Progress → Fix Committed
Changed in ubuntu-power-systems:
status: Fix Committed → Fix Released
tags: added: targetmilestone-inin1704 removed: targetmilestone-inin16043
I built a Zesty test kernel with a revert of commit 809ecb9bca. The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1711251
Can you test this kernel and see if it resolves this bug?
Thanks in advance!