possible deadlock while using the cgroup freezer on a container with NFS-based workload
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | In Progress | High | Seth Forshee |
Bug Description
Hi guys,
For background: I'm running a container with an NFS filesystem bind mounted into it. The workload I'm running is iozone, a filesystem benchmarking tool. While running this workload, I attempt to freeze the container, which gets stuck in the FREEZING state. After a while, I get:
Jul 1 01:45:14 juju-19f8e3-15 kernel: [206520.104156] INFO: task iozone:20035 blocked for more than 120 seconds.
Jul 1 01:45:14 juju-19f8e3-15 kernel: [206520.111056] Tainted: P O 4.4.0-24-generic #43-Ubuntu
Jul 1 01:45:14 juju-19f8e3-15 kernel: [206520.118053] "echo 0 > /proc/sys/
Jul 1 01:45:14 juju-19f8e3-15 kernel: [206520.126110] iozone D ffff880015673e18 0 20035 20005 0x00000104
Jul 1 01:45:14 juju-19f8e3-15 kernel: [206520.126116] ffff880015673e18 ffff880000000010 ffff880045a21b80 ffff880037776e00
Jul 1 01:45:14 juju-19f8e3-15 kernel: [206520.126118] ffff880015674000 ffff8800179d6e54 ffff880037776e00 00000000ffffffff
Jul 1 01:45:14 juju-19f8e3-15 kernel: [206520.126120] ffff8800179d6e58 ffff880015673e30 ffffffff81821b15 ffff8800179d6e50
Jul 1 01:45:14 juju-19f8e3-15 kernel: [206520.126121] Call Trace:
Jul 1 01:45:14 juju-19f8e3-15 kernel: [206520.126129] [<ffffffff81821
Jul 1 01:45:14 juju-19f8e3-15 kernel: [206520.126131] [<ffffffff81821
Jul 1 01:45:14 juju-19f8e3-15 kernel: [206520.126134] [<ffffffff81823
Jul 1 01:45:14 juju-19f8e3-15 kernel: [206520.126136] [<ffffffff81823
Jul 1 01:45:14 juju-19f8e3-15 kernel: [206520.126139] [<ffffffff8121d
Jul 1 01:45:14 juju-19f8e3-15 kernel: [206520.126142] [<ffffffff8121d
Jul 1 01:45:14 juju-19f8e3-15 kernel: [206520.126146] [<ffffffff81825
It looks like the task is actually stuck in generic fs code, not anything NFS specific, but perhaps that's a relevant detail. Anyway:
ubuntu@
[<ffffffff8121d
[<ffffffff8121d
[<ffffffff81825
[<fffffffffffff
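For anyone collecting similar data, here is a minimal sketch of how one might list the tasks in a freezer cgroup that are sitting in uninterruptible sleep (state D), which is the state the hung-task warning above reports for iozone. It assumes a cgroup v1 hierarchy, where the cgroup's `tasks` file lists thread IDs, and it reads the `State:` line from `/proc/<tid>/status`. The function names and the `proc_root` parameter are illustrative, not part of any existing tool.

```python
import os


def task_state(tid, proc_root="/proc"):
    """Return the one-letter scheduler state of a task, or None if it has exited."""
    try:
        with open(os.path.join(proc_root, str(tid), "status")) as f:
            for line in f:
                if line.startswith("State:"):
                    # Line looks like "State:\tD (disk sleep)"
                    return line.split()[1]
    except (FileNotFoundError, ProcessLookupError):
        return None
    return None


def blocked_tasks(tasks_file, proc_root="/proc"):
    """List IDs from a cgroup v1 `tasks` file whose state is D."""
    with open(tasks_file) as f:
        tids = [int(tok) for tok in f.read().split()]
    return [tid for tid in tids if task_state(tid, proc_root) == "D"]
```

Each ID returned is a candidate for a closer look at its kernel stack. The location of the `tasks` file depends on how the container manager lays out its cgroups, so it is left as a parameter here.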
The container and host are both xenial:
ubuntu@
Linux juju-19f8e3-15 4.4.0-24-generic #43-Ubuntu SMP Wed Jun 8 19:27:37 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
Finally, I don't have a good reproducer for this. It's pretty rare: I'm running this benchmark in a loop, and over thousands of runs I've seen this exactly once.
I'll leave these hosts up for a bit if there are any other interesting bits of info to collect.
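Since the failure shows up only rarely, a loop that freezes and thaws with a timeout can at least flag the run where the cgroup never leaves FREEZING. The following is a rough sketch, assuming the cgroup v1 freezer interface, where writing `FROZEN` or `THAWED` to `freezer.state` requests a transition and reading it back returns `THAWED`, `FREEZING`, or `FROZEN`. The function names, timeout values, and the path to the state file are assumptions for illustration; this is not the tooling used in the report above.

```python
import time


def freeze_with_timeout(state_file, timeout=30.0, interval=0.5):
    """Request FROZEN, then poll until FROZEN or the timeout expires.

    Returns the last state read, so a caller can tell a completed freeze
    ("FROZEN") from one that is stuck ("FREEZING").
    """
    with open(state_file, "w") as f:
        f.write("FROZEN")
    deadline = time.monotonic() + timeout
    while True:
        with open(state_file) as f:
            state = f.read().strip()
        if state == "FROZEN" or time.monotonic() >= deadline:
            return state
        time.sleep(interval)


def thaw(state_file):
    """Request that the cgroup be thawed."""
    with open(state_file, "w") as f:
        f.write("THAWED")
```

A caller would pass the container's `freezer.state` path (its location depends on the container manager's cgroup layout) and treat any return value other than `FROZEN` as the moment to stop the loop and capture task stacks before thawing.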
Changed in linux (Ubuntu):
importance: Undecided → High
This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:
apport-collect 1598285
and then change the status of the bug to 'Confirmed'.
If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.
This change has been made by an automated script, maintained by the Ubuntu Kernel Team.