Enable napi_tx for GCP/GKE kernels
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux-gcp (Ubuntu) | Invalid | Undecided | Unassigned |
Xenial | Fix Released | Undecided | Unassigned |
Bionic | Fix Released | Medium | Unassigned |
Disco | Fix Released | Medium | Unassigned |
linux-gke-4.15 (Ubuntu) | Invalid | Undecided | Unassigned |
Xenial | Invalid | Undecided | Unassigned |
Bionic | Fix Released | Undecided | Unassigned |
Disco | Invalid | Undecided | Unassigned |
linux-gke-5.0 (Ubuntu) | Invalid | Undecided | Unassigned |
Xenial | Invalid | Undecided | Unassigned |
Bionic | Fix Released | Medium | Unassigned |
Disco | Invalid | Undecided | Unassigned |

Bug Description
Background: napi_tx is a Linux kernel feature that makes the virtio driver call the skb destructor after the packets are actually “out” (i.e., at the TX completion interrupt), rather than immediately after the packets are enqueued. This provides socket backpressure and is critical for features such as TSQ (TCP Small Queues). Enabling napi_tx in cloud guests is an indispensable link in the chain of end-to-end backpressure from USPS all the way up to the guest applications. It would help reduce bufferbloat and packet drops, and/or avoid HoL (head-of-line) blocking, when the traffic from the VMs is rate limited (due to congestion/
The GCP networking engineering teams have asked us to include and enable napi_tx on the major guest OSes on the platform. They have done 6 months of performance and regression testing and are comfortable moving forward with this broadly.
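To make the backpressure argument in the background concrete, here is a toy model (not the kernel's actual accounting). It assumes a sender that tries to queue one packet per tick, a device that transmits one packet every `drain_every` ticks, and a socket budget of `sndbuf_pkts` packets; all names and numbers are hypothetical. When the socket charge is released at enqueue time, the sender never sees the budget fill up and the device queue keeps growing. When it is released at TX completion, the queue stays bounded by the budget.

```python
def simulate(ticks, sndbuf_pkts, drain_every, free_on_enqueue):
    """Return the largest device queue depth seen during the run.

    free_on_enqueue=True  models releasing the socket charge at enqueue.
    free_on_enqueue=False models releasing it at TX completion (napi_tx-like).
    """
    queue = 0      # packets waiting in the device TX queue
    charged = 0    # packets still charged against the socket budget
    max_queue = 0
    for t in range(1, ticks + 1):
        # The application tries to send one packet per tick, but only
        # if the socket budget is not exhausted.
        if charged < sndbuf_pkts:
            queue += 1
            if not free_on_enqueue:
                charged += 1
        # The device completes one packet every drain_every ticks.
        if t % drain_every == 0 and queue > 0:
            queue -= 1
            if not free_on_enqueue:
                charged -= 1
        max_queue = max(max_queue, queue)
    return max_queue


if __name__ == "__main__":
    print("charge freed at enqueue:   ", simulate(100, 8, 4, True))
    print("charge freed at completion:", simulate(100, 8, 4, False))
```

In this model the second run never exceeds the socket budget, which is the effect the background paragraph attributes to calling the destructor at TX completion.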
The main request is to change this module parameter:
+++ b/drivers/
@@ -26,7 +26,7 @@
static int napi_weight = NAPI_POLL_WEIGHT;
module_
-static bool csum = true, gso = true, napi_tx;
+static bool csum = true, gso = true, napi_tx = true;
This can be done either with the kernel change above or with a configuration change at module load time. Note that the module parameter also gives us a simple way to turn the feature back off in the unlikely case that it causes a regression on some workloads.
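As a rough way to check which setting is in effect on a running guest, the sketch below reads the module parameter from sysfs. It assumes the parameter is exported at `/sys/module/virtio_net/parameters/napi_tx` and reads back as `Y` or `N`, which is how kernel bool parameters are normally shown; the path and the helper names are assumptions, not something stated in this bug.

```python
from pathlib import Path

# Assumed sysfs location of the virtio_net napi_tx parameter.
PARAM_PATH = Path("/sys/module/virtio_net/parameters/napi_tx")

# For reference, the usual ways to set a module parameter at load time
# (generic modprobe / kernel command line syntax, file name hypothetical):
#   /etc/modprobe.d/virtio-net.conf:  options virtio_net napi_tx=1
#   kernel command line (built-in):   virtio_net.napi_tx=1


def parse_bool_param(text):
    """Convert the text of a bool module parameter to True/False."""
    value = text.strip()
    if value in ("Y", "y", "1"):
        return True
    if value in ("N", "n", "0"):
        return False
    raise ValueError(f"unexpected parameter value: {value!r}")


def napi_tx_enabled(path=PARAM_PATH):
    """Return True/False, or None if the parameter file does not exist."""
    try:
        return parse_bool_param(Path(path).read_text())
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    state = napi_tx_enabled()
    if state is None:
        print("napi_tx parameter not found (module missing or too old?)")
    else:
        print("napi_tx is", "enabled" if state else "disabled")
```

Setting the parameter to 0 by the same mechanisms would be the rollback path mentioned above.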
Besides the main switch, kernels need these other prerequisite patches:
The main feature, in 4.12-rc1:
1d11e732e7d50 virtio-net: use netif_tx_napi_add for tx napi
78a57b482aa53 virtio-net: on tx, only call napi_disable if tx napi is on
bdb12e0d2ffc8 virtio-net: keep tx interrupts disabled unless kick
7b0411ef4aa69 virtio-net: clean tx descriptors from rx napi
ea7735d97ba90 virtio-net: move free_old_xmit_skbs
b92f1e6751a6a virtio-net: transmit napi
e4e8452a4ab30 virtio-net: napi helper functions
Virtio-net queue affinity, in 4.19-rc1:
2ca653d607ce5 virtio_net: Stripe queue affinities across cores.
19e226e8cc5da virtio: Make vp_set_
9af18e56d43ca cpumask: make cpumask_next_wrap available without smp
A nice-to-have is ethtool support to test whether the feature is enabled, in 5.1-rc1:
133bbb18ab1a2 virtio-net: per-queue RPS config
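The version tags above suggest a quick way to estimate which patch groups a given kernel would need backported. The following sketch compares only the upstream base version in a release string (such as the output of `uname -r`) against the versions listed here; the function names and the example release strings are hypothetical, and a distro kernel may already carry some of these patches regardless of its base version.

```python
import re

# (first upstream release containing the group, description), taken
# from the lists above.
PREREQS = [
    ((4, 12), "transmit napi (main feature)"),
    ((4, 19), "virtio-net queue affinity"),
    ((5, 1), "ethtool support (nice to have)"),
]


def base_version(release):
    """Extract (major, minor) from a release string like '4.15.0-1040-gcp'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"cannot parse kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def needs_backport(release):
    """List the patch groups that postdate the kernel's upstream base."""
    version = base_version(release)
    return [name for first, name in PREREQS if version < first]


if __name__ == "__main__":
    for release in ("4.4.0-1000-gcp", "4.15.0-1040-gke", "5.0.0-1011-gcp"):
        print(release, "->", needs_backport(release))
```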
Changed in linux-gcp (Ubuntu Bionic):
importance: Undecided → Medium
status: New → In Progress
Changed in linux-gke-5.0 (Ubuntu Bionic):
importance: Undecided → Medium
status: New → In Progress
Changed in linux-gke-5.0 (Ubuntu Disco):
status: New → Invalid
Changed in linux-gcp (Ubuntu Disco):
importance: Undecided → Medium
status: New → In Progress
Changed in linux-gcp (Ubuntu Xenial):
status: New → In Progress
Changed in linux-gke-5.0 (Ubuntu Xenial):
status: New → Invalid
Changed in linux-gcp (Ubuntu Xenial):
status: In Progress → Fix Committed
Changed in linux-gke-4.15 (Ubuntu Bionic):
status: New → Fix Committed
Changed in linux-gke-4.15 (Ubuntu Xenial):
status: New → Invalid
Changed in linux-gke-4.15 (Ubuntu Disco):
status: New → Invalid
Changed in linux-gke-4.15 (Ubuntu):
status: New → Invalid
Changed in linux-gcp (Ubuntu Disco):
status: In Progress → Fix Committed
This may look confusing, but the bionic/gcp kernel (rolled to 5.0) is based on the disco/gcp kernel, so the patches only need to go into disco and will then flow back to bionic automatically.