Intel i40e PF reset under load
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Confirmed | Undecided | Jay Vosburgh |
Xenial | Fix Released | Undecided | Unassigned |
Bug Description
SRU Justification:
Impact:
An Intel i40e network device under heavy traffic load with TSO
enabled will spontaneously reset itself and log errors similar to the
following:
Jun 14 14:09:51 hostname kernel: [4253913.851053] i40e 0000:05:00.1: TX driver issue detected, PF reset issued
Jun 14 14:09:53 hostname kernel: [4253915.476283] i40e 0000:05:00.1: TX driver issue detected, PF reset issued
Jun 14 14:09:54 hostname kernel: [4253917.411264] i40e 0000:05:00.1: TX driver issue detected, PF reset issued
Each occurrence fully resets the PF (physical function), which
interrupts traffic flow.
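To gauge how often a host is hitting this, the messages above can be counted per PCI device. The following is a minimal sketch, assuming the kernel log lines have the same shape as the three quoted in this report; the function name count_pf_resets is made up for illustration and is not part of any existing tool.

```python
import re
from collections import Counter

# Matches the i40e message quoted in this report and captures the PCI address.
PF_RESET_RE = re.compile(
    r"i40e (?P<pci>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]): "
    r"TX driver issue detected, PF reset issued"
)


def count_pf_resets(lines):
    """Return a Counter mapping PCI address to the number of PF reset messages."""
    counts = Counter()
    for line in lines:
        match = PF_RESET_RE.search(line)
        if match:
            counts[match.group("pci")] += 1
    return counts


if __name__ == "__main__":
    # The three log lines from the bug description.
    sample = [
        "Jun 14 14:09:51 hostname kernel: [4253913.851053] i40e 0000:05:00.1: TX driver issue detected, PF reset issued",
        "Jun 14 14:09:53 hostname kernel: [4253915.476283] i40e 0000:05:00.1: TX driver issue detected, PF reset issued",
        "Jun 14 14:09:54 hostname kernel: [4253917.411264] i40e 0000:05:00.1: TX driver issue detected, PF reset issued",
    ]
    for pci, n in count_pf_resets(sample).items():
        print(f"{pci}: {n} PF reset(s)")
```

In practice the lines would come from the system's kernel log rather than a hard-coded list.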
In this case, these errors arise from a bug in the i40e device
driver introduced by commit:
commit 584a837e26408c6
Author: Alexander Duyck <email address hidden>
Date: Wed Feb 17 11:02:50 2016 -0800
i40e/i40evf: Rewrite logic for 8 descriptor per packet check
This patch was added to the Xenial kernel beginning with version
4.4.0-8.23. This bug does not manifest on any other Ubuntu kernel series.
Fix:
This error is resolved upstream by:
commit 3f3f7cb875c0f62
Author: Alexander Duyck <email address hidden>
Date: Wed Mar 30 16:15:37 2016 -0700
i40e/i40evf: Limit TSO to 7 descriptors for payload instead of 8 per packet
This fix was never backported into the Xenial 4.4 kernel series.
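The difference between a limit of 8 and a limit of 7 can be illustrated with a simplified model. This is a sketch only: it is not the driver's code (the real check is written in C and structured differently), and the reading that one of the 8 descriptors is taken by the packet headers is an assumption based on the title of the fix commit. The function names and the payload_limit parameter are hypothetical.

```python
def max_buffers_per_segment(payload_buffers, mss):
    """Return the largest number of payload buffers touched by any one
    MSS-sized segment, given the buffer sizes in transmit order."""
    if mss <= 0:
        raise ValueError("mss must be positive")

    # Byte range [start, end) covered by each non-empty buffer.
    spans = []
    offset = 0
    for size in payload_buffers:
        if size > 0:
            spans.append((offset, offset + size))
            offset += size
    total = offset

    worst = 0
    seg_start = 0
    while seg_start < total:
        seg_end = min(seg_start + mss, total)
        # Count buffers that overlap this segment's byte range.
        touched = sum(1 for start, end in spans
                      if start < seg_end and end > seg_start)
        worst = max(worst, touched)
        seg_start = seg_end
    return worst


def needs_linearize(payload_buffers, mss, payload_limit=7):
    """True if some segment spans more payload buffers than the limit allows
    (assumption: such a packet must be copied into fewer buffers first)."""
    return max_buffers_per_segment(payload_buffers, mss) > payload_limit


if __name__ == "__main__":
    # Eight 200-byte buffers forming a single 1600-byte segment.
    frags = [200] * 8
    print(needs_linearize(frags, 1600, payload_limit=8))
    print(needs_linearize(frags, 1600, payload_limit=7))
```

Under this model, a segment whose payload is spread over exactly 8 buffers passes a limit of 8 but is caught by a limit of 7, which is the kind of packet the original check would have let through to the hardware.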
Testcase:
The issue occurs at a customer site using i40e-based Intel network
cards with SR-IOV enabled. Under heavy load, the card resets itself as
described above. The customer has tested the 3f3f7cb875c patch in their
environment and confirmed that it resolves the issue.
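Since the bug only appears with TSO enabled, it can help to confirm the offload state on the interface before attempting to reproduce. Below is a minimal sketch that parses the text printed by `ethtool -k <interface>`; the helper names are hypothetical, and the interface name passed to it is whatever the i40e port is called on the test system.

```python
import subprocess


def tso_enabled(ethtool_k_output):
    """Return True or False from the tcp-segmentation-offload line of
    `ethtool -k` output, or None if that line is not present."""
    for line in ethtool_k_output.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() == "tcp-segmentation-offload":
            words = value.split()
            # The value may carry a suffix such as "[fixed]"; only the
            # first word says whether the feature is on or off.
            return bool(words) and words[0] == "on"
    return None


def read_offload_features(interface):
    """Run `ethtool -k` for the given interface and return its output text.
    Requires ethtool to be installed on the host."""
    result = subprocess.run(
        ["ethtool", "-k", interface],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Example use on a test host (interface name is a placeholder):
#   print(tso_enabled(read_offload_features("eth0")))
```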
Changed in linux (Ubuntu): | |
assignee: | nobody → Jay Vosburgh (jvosburgh) |
Changed in linux (Ubuntu): | |
status: | Incomplete → Confirmed |
Changed in linux (Ubuntu Xenial): | |
status: | Confirmed → Fix Committed |
tags: | added: verification-done-xenial removed: verification-needed-xenial |
Changed in linux (Ubuntu Xenial): | |
milestone: | none → ubuntu-16.04.4 |
milestone: | ubuntu-16.04.4 → none |
This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:
apport-collect 1700834
and then change the status of the bug to 'Confirmed'.
If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.
This change has been made by an automated script, maintained by the Ubuntu Kernel Team.