QEMU - count cache flush Spectre v2 mitigation (CVE) (required for POWER9 DD2.3)
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
The Ubuntu-power-systems project | Fix Released | High | Unassigned |
linux (Ubuntu) | Fix Released | Undecided | Unassigned |
Bionic | Fix Released | Undecided | Unassigned |
Disco | Fix Released | High | Unassigned |
qemu (Ubuntu) | Fix Released | Undecided | Unassigned |
Xenial | Won't Fix | Undecided | Unassigned |
Bionic | Fix Released | High | Canonical Server |
Cosmic | Won't Fix | Undecided | Unassigned |
Disco | Fix Released | High | Canonical Server |
Eoan | Fix Released | Undecided | Unassigned |
Bug Description
[Impact]
* This belongs to the overall context of Spectre mitigations and, even
more, to the effort to minimize the related performance impact.
On ppc64el there is a new chip revision (DD 2.3) which provides
a facility that helps to better mitigate some of this.
* Backport the patches that make QEMU (if the feature is supported by
the HW) pass the capability to the guest, so that guests which
support the improved mitigation can use it.
[Test Case]
* Start guests with and without this capability
* Check if the capability is guest visible as intended
* Check if there are any issues on pre DD2.3 HW
* Test migrations (IBM outlined the intended paths that will work
below)
* The problem with the above (and also the reason I didn't add a list
of commands this time) is that it needs special HW (the mentioned DD2.3
revision of the chips), which isn't available to us right now.
Due to that, testing / verification of this on all releases is on IBM.
[Regression Potential]
* Adding new capabilities usually works fine, but there are three common
pitfalls, which are the regression potential here.
- (severe) the code announces a capability that isn't really
available. The guest tries to use it and crashes.
- (medium) several migration paths, especially from systems with the
new cap to older (un-updated) systems, will fail. But that applies
to any "from machine with feature to machine without that feature"
migration and isn't really a new regression. As outlined by IBM below,
they even tried to make it somewhat compatible (by being a new value in
an existing cap).
- (low) the guest will see new caps and/or facilities. A really odd
guest could stumble due to that (which would actually be a guest bug
then).
Overall, all of the above was considered by IBM when developing this
and should be ok. For archive-wide SRU considerations, this has NO
effect on non-ppc64el.
[Other Info]
* n/a
---
Power9 DD 2.3 CPUs running updated firmware will use a new Spectre v2 mitigation. The new mitigation improves performance of branch heavy workloads, but also requires kernel support in order to be fully secure.
Without the kernel support there is a risk of a Spectre v2 attack across a process context switch, though it has not been demonstrated in practice.
QEMU portion: the platform definition needs to account for this new mitigation action, so an attribute for it needs to be added.
Virtualisation support has two sides, KVM and QEMU. The patch list for each:
KVM:
2b57ecd0208f KVM: PPC: Book3S: Add count cache flush parameters to kvmppc_
This is part of LP1822870 already.
QEMU:
8ff43ee404 target/ppc/spapr: Add SPAPR_CAP_
399b2896d4 target/ppc/spapr: Add workaround option to SPAPR_CAP_IBS
The KVM side is upstream as of v5.1-rc1.
The QEMU side is upstream as of v4.0.0-rc0.
In terms of migration the state is as follows.
To tell the guest to use the count cache flush workaround we use the spapr cap cap-ibs (indirect branch speculation) with the value workaround. Previously the only valid values were broken, fixed-ibs (indirect branch serialisation) and fixed-ccd (count cache disabled). In addition, a new cap, cap-ccf-assist (count cache flush assist), specifies the availability of the hardware-assisted flush variant.
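As a rough illustration of how these two caps could end up on the QEMU command line, the sketch below assembles a -machine argument. It assumes the usual `-machine pseries,<property>=<value>` syntax and that cap-ccf-assist is an on/off property; the helper name `machine_arg` is made up for this example and is not part of QEMU.

```python
# Values of cap-ibs named in this bug; the helper itself is hypothetical.
CAP_IBS_VALUES = ("broken", "workaround", "fixed-ibs", "fixed-ccd")


def machine_arg(cap_ibs, ccf_assist=False, machine="pseries"):
    """Build the value for QEMU's -machine option (illustrative only)."""
    if cap_ibs not in CAP_IBS_VALUES:
        raise ValueError(f"unknown cap-ibs value: {cap_ibs!r}")
    assist = "on" if ccf_assist else "off"
    return f"{machine},cap-ibs={cap_ibs},cap-ccf-assist={assist}"


if __name__ == "__main__":
    # Software flush with the hardware-assisted variant advertised to the guest.
    print("-machine", machine_arg("workaround", ccf_assist=True))
```

Rejecting unknown values early mirrors the fact that only the listed cap-ibs values are meaningful here; whether a given host can actually honour the request is still decided by QEMU/KVM at start-up.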
Note that, the way spapr caps work, you can migrate to a host that supports a higher value, but not to one which doesn't support the current value (i.e. only supports lower values). For cap-ibs these are defined as:
0 - Broken
1 - Workaround
2 - fixed-ibs
3 - fixed-ccd
So the following migrations would be valid for example:
broken -> fixed-ccd, broken -> workaround, workaround -> fixed-ccd
While the following would be invalid:
fixed-ccd -> workaround, workaround -> broken, fixed-ccd -> broken
This is done to maintain at least the level of protection specified on the command line on migration.
Since the workaround must be communicated to the guest kernel at boot we cannot migrate a guest from a host with fixed-ccd to one with workaround since the guest wouldn't know to do the flush and so would be wholly unprotected.
This means that migrating a guest from 2.2 and before to 2.3 requires the guest either to have been booted with broken previously, or to be rebooted with workaround specified on the command line, which then allows the migration to a 2.3 system to succeed.
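The ordering rule above can be sketched as a small check. This is only a model of the rule as described in this bug (destination level must be at least the source level), not QEMU's actual code; the names `CAP_IBS_LEVEL` and `migration_allowed` are invented for the example.

```python
# Ordering of cap-ibs values as listed in this bug (higher = stronger).
CAP_IBS_LEVEL = {
    "broken": 0,
    "workaround": 1,
    "fixed-ibs": 2,
    "fixed-ccd": 3,
}


def migration_allowed(source, destination):
    """Return True if the destination level is at least the source level."""
    return CAP_IBS_LEVEL[destination] >= CAP_IBS_LEVEL[source]


if __name__ == "__main__":
    for src, dst in [("broken", "fixed-ccd"), ("fixed-ccd", "workaround")]:
        verdict = "valid" if migration_allowed(src, dst) else "invalid"
        print(f"{src} -> {dst}: {verdict}")
```

Applied to the examples listed above, the check accepts broken -> fixed-ccd, broken -> workaround and workaround -> fixed-ccd, and rejects the three reverse directions.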
== MICHAEL D. ROTH ==
I've tested a backport of count-cache-flush support consisting of the following patches applied (cleanly) on top of bionic's QEMU 2.11+dfsg-
target/ppc/spapr: Add SPAPR_CAP_
ppc/spapr-caps: Change migration macro to take full spapr-cap name
target/ppc/spapr: Add workaround option to SPAPR_CAP_IBS
target/ppc: Factor out the parsing in kvmppc_
The following tests were done using a DD 2.3 Witherspoon machine and the results seem to align with what's expected in the original summary:
== enablement tests (using 4.15.0-51-generic in both host and guests) ==
with cap-ibs=
mdroth@ubuntu:~$ dmesg | grep cache-flush
[ 0.000000] count-cache-flush: hardware assisted flush sequence enabled
with cap-ibs=
mdroth@ubuntu:~$ dmesg | grep cache-flush
[ 0.000000] count-cache-flush: full software flush sequence enabled.
with cap-ibs=broken
mdroth@ubuntu:~$ dmesg | grep cache-flush
[ 0.000000] count-cache-flush: software flush disabled.
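For anyone repeating the enablement checks on several guests, a small helper along these lines could classify the dmesg output. It is a sketch that only knows the three message texts quoted above; the function name and the mode labels are made up for this example.

```python
# Message fragments taken from the dmesg output quoted above.
_MODES = (
    ("hardware assisted flush sequence enabled", "hardware-assisted"),
    ("full software flush sequence enabled", "software"),
    ("software flush disabled", "disabled"),
)


def count_cache_flush_mode(dmesg_text):
    """Return the flush mode reported in dmesg text, or None if absent."""
    for line in dmesg_text.splitlines():
        if "count-cache-flush:" not in line:
            continue
        for fragment, mode in _MODES:
            if fragment in line:
                return mode
    return None
```

Feeding it the output of `dmesg` from inside the guest (captured however is convenient) gives one label per guest, which is easier to compare across cap settings than eyeballing the log.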
== migration tests (using 4.15.0-51-generic in both host and guests) ==
Note that pseries-
smc-
smc-
smc-
but SPAPR_CAP_FIXED_CCD is not available on the DD 2.3 system I tested on (no fw-count-
cross-migration: qemu 2.11+dfsg-
source: -M bionic-
target: -M bionic-
expected: warning
actual: warning
"cap-ibs lower level (0) in incoming stream than on destination (1))"
software ccf enabled after reboot? yes
target: -M bionic-
expected: warning
actual: warning
"
hardware ccf enabled after reboot? yes
target: -M bionic-
expected: success
actual: success
migration: 2.11+dfsg-
source: -M bionic-
target: -M bionic-
expected: success
actual: success
target: -M bionic-
expected: warning
actual: warning
"
hardware ccf enabled after reboot? yes
target: -M bionic-
expected: fail
actual: fail
"cap-ibs higher level (1) in incoming stream than on destination (0)"
source: -M bionic-
target: -M bionic-
expected: success
actual: success
target: -M bionic-
expected: fail
actual: fail, "cap-ccf-assist higher level (1) in incoming stream than on destination (0)"
target: cap-ibs=broken (expected: fail, actual: )
expected: fail
actual: fail
"cap-ibs higher level (1) in incoming stream than on destination (0)"
"
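The warning/fail pattern in the migration log above follows the wording of the quoted messages: "lower level" in the incoming stream went along with a warning, "higher level" with a failure. A minimal sketch for sorting such messages, assuming they keep exactly the wording quoted in this log (the function name is hypothetical):

```python
import re

# Pattern built from the messages quoted in the test log above.
_CAP_MSG = re.compile(
    r"(?P<cap>cap-[\w-]+) (?P<rel>lower|higher) level \((?P<src>\d+)\) "
    r"in incoming stream than on destination \((?P<dst>\d+)\)"
)


def classify_cap_message(message):
    """Return (cap, source_level, dest_level, outcome) or None if no match."""
    m = _CAP_MSG.search(message)
    if m is None:
        return None
    # In the log above, "lower" accompanied a warning and "higher" a failure.
    outcome = "warning" if m.group("rel") == "lower" else "fail"
    return m.group("cap"), int(m.group("src")), int(m.group("dst")), outcome
```

This only interprets text that is already in the log; it does not decide anything about the migration itself.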
Sorry, I forgot that I needed some fix-ups for the 4th/last patch, "target/ppc/spapr: Add SPAPR_CAP_
I've gone ahead and posted my git tree, which is based on top of the qemu_2.
https:/
Related branches
- Rafael David Tinoco (community): Approve
- Canonical Server packageset reviewers: Pending requested
- git-ubuntu developers: Pending requested
Diff: 629 lines (+589/-0), 6 files modified
debian/changelog (+7/-0)
debian/patches/series (+4/-0)
debian/patches/ubuntu/lp-1832622-0001-target-ppc-Factor-out-the-parsing-in-kvmppc_get_cpu_.patch (+101/-0)
debian/patches/ubuntu/lp-1832622-0002-target-ppc-spapr-Add-workaround-option-to-SPAPR_CAP_.patch (+159/-0)
debian/patches/ubuntu/lp-1832622-0003-ppc-spapr-caps-Change-migration-macro-to-take-full-s.patch (+79/-0)
debian/patches/ubuntu/lp-1832622-0004-target-ppc-spapr-Add-SPAPR_CAP_CCF_ASSIST.patch (+239/-0)
- Rafael David Tinoco (community): Approve
- Canonical Server: Pending requested
- git-ubuntu developers: Pending requested
Diff: 528 lines (+494/-0), 5 files modified
debian/changelog (+7/-0)
debian/patches/series (+3/-0)
debian/patches/ubuntu/lp-1832622-0001-target-ppc-Factor-out-the-parsing-in-kvmppc_get_cpu_.patch (+101/-0)
debian/patches/ubuntu/lp-1832622-0002-target-ppc-spapr-Add-workaround-option-to-SPAPR_CAP_.patch (+159/-0)
debian/patches/ubuntu/lp-1832622-0004-target-ppc-spapr-Add-SPAPR_CAP_CCF_ASSIST.patch (+224/-0)
- Rafael David Tinoco (community): Approve
- Canonical Server packageset reviewers: Pending requested
- git-ubuntu developers: Pending requested
Diff: 412 lines (+384/-0), 4 files modified
debian/changelog (+7/-0)
debian/patches/series (+2/-0)
debian/patches/ubuntu/lp-1832622-0002-target-ppc-spapr-Add-workaround-option-to-SPAPR_CAP_.patch (+159/-0)
debian/patches/ubuntu/lp-1832622-0004-target-ppc-spapr-Add-SPAPR_CAP_CCF_ASSIST.patch (+216/-0)
- Rafael David Tinoco (community): Approve
- Canonical Server: Pending requested
- Canonical Server packageset reviewers: Pending requested
- git-ubuntu developers: Pending requested
Diff: 412 lines (+384/-0), 4 files modified
debian/changelog (+7/-0)
debian/patches/series (+2/-0)
debian/patches/ubuntu/lp-1832622-0002-target-ppc-spapr-Add-workaround-option-to-SPAPR_CAP_.patch (+159/-0)
debian/patches/ubuntu/lp-1832622-0004-target-ppc-spapr-Add-SPAPR_CAP_CCF_ASSIST.patch (+216/-0)
CVE References
tags: | added: architecture-ppc64le bugnameltc-176932 severity-critical targetmilestone-inin18041 |
Changed in ubuntu: | |
assignee: | nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) |
affects: | ubuntu → qemu (Ubuntu) |
Changed in ubuntu-power-systems: | |
importance: | Undecided → Critical |
assignee: | nobody → Canonical Server Team (canonical-server) |
description: | updated |
Changed in ubuntu-power-systems: | |
status: | New → Triaged |
tags: | added: qemu-19.10 |
Changed in ubuntu-power-systems: | |
status: | Triaged → In Progress |
Changed in qemu (Ubuntu Eoan): | |
assignee: | Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) → Canonical Server Team (canonical-server) |
Changed in qemu (Ubuntu Disco): | |
assignee: | nobody → Canonical Server Team (canonical-server) |
Changed in qemu (Ubuntu Cosmic): | |
assignee: | nobody → Canonical Server Team (canonical-server) |
Changed in qemu (Ubuntu Bionic): | |
assignee: | nobody → Canonical Server Team (canonical-server) |
Changed in ubuntu-power-systems: | |
importance: | Critical → Medium |
assignee: | Canonical Server Team (canonical-server) → nobody |
Changed in ubuntu-power-systems: | |
status: | In Progress → Fix Committed |
Changed in linux (Ubuntu Disco): | |
status: | New → Confirmed |
importance: | Undecided → High |
no longer affects: | linux (Ubuntu Cosmic) |
no longer affects: | linux (Ubuntu Eoan) |
no longer affects: | linux (Ubuntu Xenial) |
Changed in linux (Ubuntu): | |
status: | New → Fix Released |
Changed in linux (Ubuntu Bionic): | |
status: | New → Fix Released |
Changed in ubuntu-power-systems: | |
status: | Fix Committed → Confirmed |
Changed in linux (Ubuntu Disco): | |
status: | Confirmed → In Progress |
Changed in linux (Ubuntu Disco): | |
status: | In Progress → Fix Committed |
Changed in ubuntu-power-systems: | |
status: | Confirmed → Fix Committed |
Changed in ubuntu-power-systems: | |
status: | Fix Committed → Fix Released |
I'm glad that the kernel patch is already integrated by bug 1822870 in >=Bionic - no dependency on the kernel here then.
The patches themselves look small and clean.
Thanks for identifying the extra dependencies to:
- 8fea7044 (>=3.0) target/ppc: Factor out the parsing in kvmppc_get_cpu_characteristics()
- 8c5909c4 (>=2.12) ppc/spapr-caps: Change migration macro to take full spapr-cap name
That overall makes the request to apply:
- 8c5909c4 (>=2.12) ppc/spapr-caps: Change migration macro to take full spapr-cap name
- 8fea7044 (>=3.0) target/ppc: Factor out the parsing in kvmppc_get_cpu_characteristics()
- 399b2896 (>=4.0) target/ppc/spapr: Add workaround option to SPAPR_CAP_IBS
- 8ff43ee4 (>=4.0) target/ppc/spapr: Add SPAPR_CAP_CCF_ASSIST
By reading the bug top down I ran into issues with patch #4, but then I read the rest and found that you already handled that. Taking the backport from the referenced git worked great, thanks Michael.
There was some minor noise bringing that to 2.12 and 3.0, but for 2.12 it worked rather straightforwardly, as expected. In qemu 3.0, though, we need something else for the fourth patch. Neither the upstream original (9 rejects) nor the backport you provided for 2.11 (10 rejects) applies.
Upstream is a bit closer; the lack of "large decr" in qemu 3.0 shows up as a context change a few times, but those were resolvable.
For SPAPR_CAP_CCF_ASSIST I followed your backport in leaving no holes in the cap numbering (the alternative would be to retain it as 0x9 but leave some numbers in between undefined, which would break when iterating).
TODO hw/ppc/spapr.h SPAPR_CAP_CCF_ASSIST for holes
check cosmic applied include/
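To illustrate why holes in the cap numbering are a problem when iterating, here is a toy model. The cap names, indices and the table layout are entirely hypothetical and only stand in for an array indexed by cap number whose users loop over every slot; this is not QEMU's data structure.

```python
# Hypothetical cap table: names and numbers are illustrative, not QEMU's.
def build_table(assignments, count):
    """Return a list indexed by cap number; unassigned slots stay None."""
    table = [None] * count
    for name, index in assignments.items():
        table[index] = name
    return table


def names_in_order(table):
    """Loop over every slot, assuming each one is defined."""
    return [name.upper() for name in table]  # fails on an undefined slot


# Dense numbering: the new cap takes the next free index.
dense = build_table({"cap-a": 0, "cap-b": 1, "cap-new": 2}, 3)
# Sparse numbering: the new cap keeps a higher index, leaving holes.
sparse = build_table({"cap-a": 0, "cap-b": 1, "cap-new": 4}, 5)
```

With the dense table `names_in_order(dense)` works, while `names_in_order(sparse)` raises an AttributeError on the first undefined slot, which is the kind of breakage that renumbering the backported cap avoids.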
IIRC Xenial has no P9 support and would probably be much harder to backport to, so barring further discussion this is a Won't Fix for Xenial.
Timing: we have a qemu SRU in the pipe that needs verification and release. Once done we will enqueue that one.
But until then we can still work on this.
I opened MPs for internal review with the backports for Bionic/Cosmic/Disco/Eoan (linked to the bug here) and a PPA [1].
If you want to test anything ahead of proposed please feel free to take a look at MPs and/or the PPA.
[1]: https://launchpad.net/~paelzer/+archive/ubuntu/bug-1832622-qemu-spectre-ppc