Signal 7 error when running GPFS tracing in cluster
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
The Ubuntu-power-systems project | Fix Released | Critical | Canonical Kernel Team |
linux (Ubuntu) | Fix Released | Critical | Joseph Salisbury |
Bionic | Fix Released | Critical | Joseph Salisbury |
Cosmic | Fix Released | Critical | Joseph Salisbury |
Bug Description
== SRU Justification ==
IBM is requesting these commits in bionic and cosmic. These commits
also rely on commit 7acf50e4efa6, which was SRU'd in bug 1792102.
Description of bug:
The GPFS mmfsd daemon maps a shared tracing buffer (allocated by a kernel
driver using vmalloc) and then writes trace records to it from user space
threads in parallel. When the SIGBUS occurs, the faulting virtual address
is within the mapped range, so the access does not overflow the buffer.
The root cause is that for PTEs created by a driver at mmap time (i.e., that
aren't created dynamically at fault time), it's not legitimate for ptep_set_
to make them invalid, even temporarily. A concurrent access while they are
invalid cannot be serviced by the page fault handler and will cause a SIGBUS.
== Fixes ==
bd0dbb73e013 ("powerpc/
f08d08f3db55 ("powerpc/mm/radix: Only need the Nest MMU workaround for R -> RW transition")
== Regression Potential ==
Low. Limited to powerpc.
== Test Case ==
A test kernel was built with these patches and tested by IBM.
IBM states the test kernel resolved the bug.
-- Problem Description --
The GPFS mmfsd daemon maps a shared tracing buffer (allocated by a kernel driver using vmalloc) and then writes trace records to it from user space threads in parallel. When the SIGBUS occurs, the faulting virtual address is within the mapped range, so the access does not overflow the buffer.
I worked with Benjamin Herrenschmidt on the GPFS tracing kernel driver code. He suggested a workaround in the driver code that bypasses the problem, and it works.
The workaround code change is shown below:
- rc = remap_pfn_
+ rc = remap_pfn_
As Benjamin mentioned, this is a Linux kernel bug and the change above is only a workaround. He will give the details about the kernel bug and explain why this workaround works.
The root cause is that for PTEs created by a driver at mmap time (ie, that aren't created dynamically at fault time), it's not legit for ptep_set_
Thankfully such PTEs shouldn't normally be the subject of a RO->RW privilege escalation.
What happens is that the GPFS driver creates the PTEs using remap_pfn_
PAGE_SHARED has _PAGE_ACCESSED (R) but not _PAGE_DIRTY (C) set.
Thus, on the first write, we try to set C and, while doing so, hit the above workaround, which causes the problem described earlier.
The proposed patch will ensure we only do the Nest MMU hack when changing _PAGE_RW and not for normal R/C updates.
The workaround tested by the GPFS team consists of adding _PAGE_DIRTY to the mapping created by remap_pfn_range() to avoid the RC update fault completely.
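To make the distinction above concrete, here is a small user space model in C of the decision the patch changes. It is only a sketch under stated assumptions, not the kernel code: the `MODEL_PAGE_*` bit values and all function names are hypothetical, it ignores the conditions under which the Nest MMU workaround is active at all, and it assumes the driver mapping is already writable so that the first store only needs to set C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits for this model only. The real definitions live in
 * the powerpc page-table headers and have different values. */
#define MODEL_PAGE_ACCESSED 0x1u /* R: referenced */
#define MODEL_PAGE_DIRTY    0x2u /* C: changed */
#define MODEL_PAGE_RW       0x4u /* write permission */

/* PTE value after the first store to the page: C gets set. */
static uint32_t pte_after_first_write(uint32_t pte)
{
    assert(pte & MODEL_PAGE_RW); /* model assumption: mapping is writable */
    return pte | MODEL_PAGE_DIRTY;
}

/* Behaviour as the report describes it before the fix: any update that adds
 * a flag goes through the "make the PTE temporarily invalid" path. */
static bool needs_invalidate_before_fix(uint32_t old_pte, uint32_t new_pte)
{
    return (new_pte & ~old_pte) != 0;
}

/* Behaviour the proposed patch aims for: only an update that adds write
 * permission (R -> RW) takes the invalidate path; plain R/C updates do not. */
static bool needs_invalidate_after_fix(uint32_t old_pte, uint32_t new_pte)
{
    return (new_pte & ~old_pte & MODEL_PAGE_RW) != 0;
}
```

In this model, a mapping created with R and RW but without C takes the invalidate path on its first write before the fix and no longer does after it. A mapping created with C already set (the GPFS workaround) produces no flag update on the first write, so neither version touches the PTE.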
This is fixed by these:
https:/
Since DD1 support is still in (ie, 2bf1071a8d50928
tags: | added: architecture-ppc64le bugnameltc-171273 severity-high targetmilestone-inin1804 |
affects: | ubuntu → linux (Ubuntu) |
Changed in ubuntu-power-systems: | |
assignee: | nobody → Canonical Kernel Team (canonical-kernel-team) |
Changed in linux (Ubuntu): | |
assignee: | nobody → Canonical Kernel Team (canonical-kernel-team) |
Changed in ubuntu-power-systems: | |
importance: | Undecided → High |
Changed in linux (Ubuntu): | |
status: | Incomplete → Triaged |
assignee: | Canonical Kernel Team (canonical-kernel-team) → Joseph Salisbury (jsalisbury) |
Changed in ubuntu-power-systems: | |
status: | New → Triaged |
Changed in linux (Ubuntu): | |
status: | Triaged → In Progress |
Changed in linux (Ubuntu Bionic): | |
status: | New → In Progress |
importance: | Undecided → Medium |
assignee: | nobody → Joseph Salisbury (jsalisbury) |
Changed in ubuntu-power-systems: | |
status: | Triaged → Incomplete |
Changed in ubuntu-power-systems: | |
status: | Incomplete → In Progress |
Changed in ubuntu-power-systems: | |
importance: | High → Critical |
Changed in linux (Ubuntu): | |
importance: | Medium → Critical |
Changed in linux (Ubuntu Bionic): | |
importance: | Medium → Critical |
description: | updated |
Changed in linux (Ubuntu Bionic): | |
status: | In Progress → Fix Committed |
Changed in linux (Ubuntu Cosmic): | |
status: | In Progress → Fix Committed |
Changed in ubuntu-power-systems: | |
status: | In Progress → Fix Committed |
Changed in ubuntu-power-systems: | |
status: | Fix Committed → In Progress |
Changed in linux (Ubuntu): | |
status: | In Progress → Fix Released |
Changed in ubuntu-power-systems: | |
status: | In Progress → Fix Released |
tags: | added: cscc |
Thank you for taking the time to report this bug and helping to make Ubuntu better. It seems that your bug report is not filed about a specific source package though, rather it is just filed against Ubuntu in general. It is important that bug reports be filed about source packages so that people interested in the package can find the bugs about it. You can find some hints about determining what package your bug might be about at https://wiki.ubuntu.com/Bugs/FindRightPackage. You might also ask for help in the #ubuntu-bugs IRC channel on Freenode.
To change the source package that this bug is filed about, visit https://bugs.launchpad.net/ubuntu/+bug/1792195/+editstatus and add the package name in the text box next to the word Package.
[This is an automated message. I apologize if it reached you inappropriately; please just reply to this message indicating so.]