fsnotify: Fix fsnotify_mark_connector race
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Linux | Incomplete | Undecided | Unassigned |
linux (Ubuntu) | Fix Released | Undecided | Unassigned |
Xenial | Invalid | Undecided | Unassigned |
Artful | Fix Released | Medium | Unassigned |
Bionic | Fix Released | Medium | Unassigned |
linux-azure (Ubuntu) | Fix Released | Undecided | Kamal Mostafa |
Xenial | Fix Released | Undecided | Unassigned |
Artful | Invalid | Undecided | Unassigned |
Bionic | Fix Released | Undecided | Unassigned |
Bug Description
On Azure we have had sporadic cases of soft lockups in fsnotify that may very well be mitigated by the following fix. The LKML thread is "kernel panics with 4.14.X".
This should be applied to 4.13 and 4.15 versions of the linux-azure kernel, and possibly the 4.15 generic kernel in bionic as well.
-----
fsnotify() acquires a reference to a fsnotify_
the SRCU-protected pointer to_tell-
appears that no precautions are taken in fsnotify_put_mark() to
ensure that fsnotify() drops its reference to this
fsnotify_
field. This can result in fsnotify_put_mark() assigning a value
to a connector's 'destroy_next' field right before fsnotify() tries to
traverse the linked list referenced by the connector's 'list' field.
Since these two fields are members of the same union, this behavior
results in a kernel panic.
This issue is resolved by moving the connector's 'destroy_next' field
into the object pointer union. This should work since the object pointer
access is protected by both a spinlock and the value of the 'flags'
field, and the 'flags' field is cleared while holding the spinlock in
fsnotify_put_mark() before 'destroy_next' is updated. It shouldn't be
possible for another thread to accidentally read from the object pointer
after the 'destroy_next' field is updated.
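The layout change described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `old_connector` and `new_connector` are hypothetical, simplified stand-ins for the kernel's connector structure (the spinlock is omitted and the object pointers are plain `void *`), and the helper functions are invented for illustration. It shows that in the old layout 'destroy_next' shares storage with 'list', while in the new layout it shares storage with the object pointer instead.

```c
#include <stddef.h>

/* Minimal stand-ins for the kernel's hlist types. */
struct hlist_node { struct hlist_node *next; };
struct hlist_head { struct hlist_node *first; };

/* Hypothetical model of the layout before the fix:
 * 'destroy_next' aliases 'list'. */
struct old_connector {
    unsigned int flags;
    union {                 /* object pointer */
        void *inode;
        void *mnt;
    };
    union {
        struct hlist_head list;
        struct old_connector *destroy_next;
    };
};

/* Hypothetical model of the layout after the fix:
 * 'destroy_next' aliases the object pointer, 'list' stands alone. */
struct new_connector {
    unsigned int flags;
    union {                 /* object pointer */
        void *inode;
        void *mnt;
        struct new_connector *destroy_next;
    };
    struct hlist_head list;
};

/* Return 1 if 'destroy_next' and 'list' occupy the same offset. */
int old_fields_overlap(void)
{
    return offsetof(struct old_connector, destroy_next) ==
           offsetof(struct old_connector, list);
}

int new_fields_overlap(void)
{
    return offsetof(struct new_connector, destroy_next) ==
           offsetof(struct new_connector, list);
}

/* Return 1 if the list head still points at the mark after the
 * connector has been queued for destruction. */
int old_list_survives_queueing(void)
{
    struct hlist_node mark = { NULL };
    struct old_connector conn = { 0 }, other = { 0 };

    conn.list.first = &mark;
    conn.destroy_next = &other;     /* overwrites the shared storage */
    return (void *)conn.list.first == (void *)&mark;
}

int new_list_survives_queueing(void)
{
    struct hlist_node mark = { NULL };
    struct new_connector conn = { 0 }, other = { 0 };

    conn.list.first = &mark;
    conn.destroy_next = &other;     /* only clobbers the object pointer */
    return (void *)conn.list.first == (void *)&mark;
}
```

In this model, a reader that still holds the connector and walks 'list' after it has been queued for destruction sees a corrupted list head with the old layout but an intact one with the new layout, which is the property the fix relies on.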
The offending behavior here is extremely unlikely; since
fsnotify_put_mark() removes references to a connector (specifically,
it ensures that the connector is unreachable from the inode it was
formerly attached to) before updating its 'destroy_next' field, a
sizeable chunk of code in fsnotify_put_mark() has to execute in the
short window between when fsnotify() acquires the connector reference
and saves the value of its 'list' field. On the HEAD kernel, I've only
been able to reproduce this by inserting a udelay(1) in fsnotify().
However, I've been able to reproduce this issue without inserting a
udelay(1) anywhere on older unmodified release kernels, so I believe
it's worth fixing at HEAD.
References: https:/
Fixes: 08991e83b728663
CC: <email address hidden>
Signed-off-by: Robert Kolchmeyer <email address hidden>
Signed-off-by: Jan Kara <email address hidden>
tags: added: patch
Changed in linux-azure (Ubuntu):
status: New → Confirmed
Changed in linux-azure (Ubuntu):
assignee: nobody → Kamal Mostafa (kamalmostafa)
status: Confirmed → In Progress
Changed in linux-azure (Ubuntu Xenial):
status: New → In Progress
Changed in linux-azure (Ubuntu Bionic):
status: New → In Progress
Changed in linux (Ubuntu Xenial):
status: New → Invalid
Changed in linux-azure (Ubuntu Artful):
status: New → Invalid
Changed in linux (Ubuntu Bionic):
status: Incomplete → Confirmed
Changed in linux (Ubuntu Artful):
importance: Undecided → Medium
status: Incomplete → Fix Committed
Changed in linux (Ubuntu Bionic):
importance: Undecided → Medium
status: Confirmed → Fix Committed
Changed in linux (Ubuntu):
status: New → Invalid
tags: added: cscc
This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.
If verification is not done within 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.
See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!