soft lockup from bcache leading to high load and lockup on trusty
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
linux (Ubuntu) | In Progress | High | Unassigned | |
Trusty | In Progress | High | Unassigned | |
Bug Description
I have an environment with Dell R630 servers with RAID controllers, each with two virtual disks and 22 passthru devices. 2 SAS SSDs and 20 HDDs are set up in 2 bcache cachesets, resulting in 20 mounted xfs filesystems on bcache that back an 11-node swift cluster (one zone has 1 fewer node). Two of the zones have nodes as described above, and they appear to be exhibiting soft lockups in the bcache thread of the kernel. This causes other kernel threads to go into an I/O blocking state and keeps processes on any bcache device from completing successfully. Disk access to the virtual disks mounted without bcache is still possible when this lockup occurs.
https:/
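To confirm which tasks are stuck in the I/O blocking state described above, one option is to look at the state field of /proc/<pid>/stat, where 'D' means uninterruptible sleep. The following is a minimal sketch, not something taken from this report: the function names task_state_from_stat and is_io_blocked are hypothetical, and the caller is assumed to have already read the stat file into a string.

```c
#include <assert.h>
#include <string.h>

/*
 * Return the one-letter task state from the text of /proc/<pid>/stat,
 * or '\0' if the text is malformed.
 * The layout is "pid (comm) state ...". The comm field may itself
 * contain spaces or ')', so the search runs from the right.
 * (Hypothetical helper, for illustration only.)
 */
char task_state_from_stat(const char *stat_text)
{
    const char *close;

    assert(stat_text != NULL);
    close = strrchr(stat_text, ')');
    if (close == NULL || close[1] != ' ' || close[2] == '\0')
        return '\0';
    return close[2];
}

/* Return 1 if the task is in uninterruptible sleep ('D'), else 0. */
int is_io_blocked(const char *stat_text)
{
    return task_state_from_stat(stat_text) == 'D';
}
```

Running this over every entry in /proc during a lockup would list the blocked tasks; comparing that list with the devices each task is using could show whether only bcache-backed I/O is affected.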
Several soft lockup messages appear in dmesg, and many of
the stack dumps are stuck inside the bch_writeback_
static int bch_writeback_
{
[...]
while (!kthread_
down_write(
[...]
}
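To tally which tasks the soft lockup messages name, the dmesg lines can be parsed mechanically. The sketch below assumes the watchdog line has the form "BUG: soft lockup - CPU#N stuck for Ns! [comm:pid]", possibly preceded by a timestamp and a prefix; the struct and function names are hypothetical, and the sample line in the usage note is made up for illustration, not copied from this bug's logs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Fields of one soft lockup report line (hypothetical layout). */
struct soft_lockup {
    int cpu;              /* CPU number that was stuck */
    unsigned int seconds; /* reported stuck duration */
    char comm[32];        /* task name as printed by the kernel */
    int pid;              /* task pid */
};

/*
 * Parse one log line. Returns 1 and fills *out if the line contains a
 * soft lockup report in the assumed format, otherwise returns 0.
 */
int parse_soft_lockup(const char *line, struct soft_lockup *out)
{
    const char *p, *open, *close, *colon, *q;
    size_t len;

    assert(line != NULL && out != NULL);
    p = strstr(line, "BUG: soft lockup - CPU#");
    if (p == NULL)
        return 0;
    if (sscanf(p, "BUG: soft lockup - CPU#%d stuck for %us!",
               &out->cpu, &out->seconds) != 2)
        return 0;

    /* The task is printed as [comm:pid] after the message text. */
    open = strchr(p, '[');
    close = open ? strrchr(open, ']') : NULL;
    if (close == NULL)
        return 0;

    /* comm may contain ':' (e.g. kworker/0:1), so use the last colon. */
    colon = NULL;
    for (q = open + 1; q < close; q++)
        if (*q == ':')
            colon = q;
    if (colon == NULL)
        return 0;

    len = (size_t)(colon - (open + 1));
    if (len >= sizeof(out->comm))
        len = sizeof(out->comm) - 1;
    memcpy(out->comm, open + 1, len);
    out->comm[len] = '\0';

    return sscanf(colon + 1, "%d", &out->pid) == 1;
}
```

For an illustrative line such as "NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [bcache_writebac:4321]", the parser fills in CPU 3, 22 seconds, the task name, and the pid. Feeding it every line of a saved dmesg and counting by task name would show how many of the reports involve the bcache writeback thread.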
One core dump was captured while kswapd was reclaiming the
xfs inode cache.
__xfs_iflock(
struct xfs_inode *ip)
{
do {
prepare_
if (xfs_isiflocked
io_schedule();
} while (!xfs_iflock_
- Possible fix commits:
1). 9baf30972b55 bcache: fix for gc and write-back race
https:/
- Related discussions:
1). Re: [PATCH] md/bcache: Fix a deadlock while calculating writeback rate
https:/
2). Re: hang during suspend to RAM when bcache cache device is attached
https:/
We are running trusty/mitaka swift storage on these nodes with 4.4.0-111 kernel (linux-
tags: | added: kernel-da-key |
Changed in linux (Ubuntu): | |
status: | Incomplete → Triaged |
importance: | Undecided → High |
Changed in linux (Ubuntu Trusty): | |
status: | New → Triaged |
importance: | Undecided → High |
Changed in linux (Ubuntu): | |
assignee: | nobody → Joseph Salisbury (jsalisbury) |
Changed in linux (Ubuntu Trusty): | |
assignee: | nobody → Joseph Salisbury (jsalisbury) |
Changed in linux (Ubuntu): | |
status: | Triaged → In Progress |
Changed in linux (Ubuntu Trusty): | |
status: | Triaged → In Progress |
Changed in linux (Ubuntu Trusty): | |
assignee: | Joseph Salisbury (jsalisbury) → nobody |
Changed in linux (Ubuntu): | |
assignee: | Joseph Salisbury (jsalisbury) → nobody |
tags: | added: cscc |
This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:
apport-collect 1757277
and then change the status of the bug to 'Confirmed'.
If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.
This change has been made by an automated script, maintained by the Ubuntu Kernel Team.