fscache: jobs might hang when fscache disk is full
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Invalid | Undecided | Unassigned |
Bionic | Fix Released | Undecided | Unassigned |
Cosmic | Fix Released | Undecided | Unassigned |
Bug Description
[Impact]
* fscache issue where jobs get hung when fscache disk is full.
* trivial upstream fix; already applied in X/D, required in B/C:
commit c5a94f434c82 ("fscache: fix race between enablement and
dropping of object").
[Test Case]
* Test kernel verified / regression-tested by reporter.
* Apparently there's no simple test case,
but these are the conditions to hit the problem:
1) The active dataset size is equal to the cache disk size.
The application reads the data over and over again.
2) Disk is near full (90%+)
3) cachefilesd in userspace is trying to cull the old objects
while new objects are being looked up.
4) New cachefiles are created, and some fail for lack of disk space.
5) A race between the object-dropping state machine and
the deferred-lookup state machine causes the hang.
6) HUNG in fscache_
clear bit FSCACHE_
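Condition 2 above can be checked before attempting a reproduction. The following is a minimal sketch, not part of the original report: the helper names and the 90% default are illustrative, and the cache directory path passed in is an assumption that depends on the local cachefilesd configuration.

```python
import shutil

# Assumption: 90% mirrors the "near full (90%+)" condition in the report.
NEAR_FULL_PERCENT = 90.0


def used_percent(total: int, used: int) -> float:
    """Return used space as a percentage of total (0.0 if total is not positive)."""
    if total <= 0:
        return 0.0
    return 100.0 * used / total


def cache_is_near_full(path: str, threshold: float = NEAR_FULL_PERCENT) -> bool:
    """Report whether the filesystem holding `path` is at or above `threshold` percent used."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return used_percent(usage.total, usage.used) >= threshold
```

For example, `cache_is_near_full("/var/cache/fscache")` would check the filesystem backing that directory, assuming the cache lives there. The figure may differ slightly from what `df` reports, since `df` accounts for reserved blocks.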
[Regression Potential]
* Low; contained in fscache; no further fixes applied upstream.
* This patch is applied in a stable tree (linux-4.4.y).
[Original Description]
A user reported an fscache issue where jobs hang when the fscache disk is full.
Investigation showed that the issue had already been reported and fixed upstream
by commit c5a94f434c82 ("fscache: fix race between enablement and dropping of object").
This patch is required in Bionic and Cosmic; it is already applied in Xenial (via stable) and Disco.
Apparently there's no simple test case, but these are the conditions to hit the problem:
1) The active dataset size is equal to the cache disk size.
The application reads the data over and over again.
2) Disk is near full (90%+)
3) cachefilesd in userspace is trying to cull the old objects
while new objects are being looked up.
4) New cachefiles are created, and some fail for lack of disk space.
5) A race between the object-dropping state machine and
the deferred-lookup state machine causes the hang.
6) HUNG in fscache_
clear bit FSCACHE_
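The read pattern in condition 1 (the application reading the same dataset over and over) could be approximated with a loop like the one below. This is a hedged sketch only: the function name is hypothetical, it assumes the dataset sits on a network mount with fscache enabled, and running it does not by itself guarantee that the race is hit.

```python
import os


def reread_tree(root: str, passes: int = 3, chunk_size: int = 1 << 20) -> int:
    """Read every regular file under `root` `passes` times; return total bytes read."""
    total = 0
    for _ in range(passes):
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                # Read in chunks so large files do not need to fit in memory.
                with open(os.path.join(dirpath, name), "rb") as fh:
                    while True:
                        chunk = fh.read(chunk_size)
                        if not chunk:
                            break
                        total += len(chunk)
    return total
```

Per the conditions listed above, the dataset under `root` would need to be about as large as the cache disk, so that cachefilesd is culling old objects while new ones are being looked up.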
Changed in linux (Ubuntu):
status: Incomplete → Invalid
Changed in linux (Ubuntu Bionic):
status: New → Confirmed
Changed in linux (Ubuntu Cosmic):
status: New → Confirmed
Changed in linux (Ubuntu Bionic):
status: Confirmed → Fix Committed
Changed in linux (Ubuntu Cosmic):
status: Confirmed → Fix Committed
This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:
apport-collect 1821395
and then change the status of the bug to 'Confirmed'.
If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.
This change has been made by an automated script, maintained by the Ubuntu Kernel Team.