xfs slab objects (memory) leak when xfs shutdown is called
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Fix Released | Low | Unassigned |
Xenial | Fix Released | Low | Unassigned |
Bug Description
[Impact]
* xfs kernel memory leak in case of xfs shutdown due to i/o errors
* if xfs is on iscsi, an iscsi disconnection followed by a module unload will cause the memory leak
[Test Case]
* configure tgtd with 1 lun and make it available through tcp/ip
* configure open-iscsi to map this lun
* make sure node.session.
* mount an xfs volume using the lun from the tgtd host, run bonnie -d /xfsdir
* on the tgtd server, drop iscsi packets and watch the client for i/o errors
* after some time (depending on the timeout) xfs will call shutdown
* make sure the i/o errors led to xfs shutdown (comment #3)
* after the shutdown, try to remove the xfs module; it will leak
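The "make sure the i/o errors led to xfs shutdown" step can be checked from the kernel log. The following is a minimal sketch, assuming the shutdown is reported with the "Shutting down filesystem" message quoted later in this report; the function name `xfs_shutdown_devices` is hypothetical and the message wording may differ between kernel versions.

```python
import re

# Pattern modelled on the message quoted in this report:
#   "XFS (sda1): Log I/O Error Detected. Shutting down filesystem"
# Assumption: other kernels may word the shutdown message differently.
SHUTDOWN_RE = re.compile(r"XFS \((?P<dev>[^)]+)\): .*Shutting down filesystem")


def xfs_shutdown_devices(dmesg_text):
    """Return a sorted list of devices for which XFS logged a shutdown."""
    devices = set()
    for line in dmesg_text.splitlines():
        match = SHUTDOWN_RE.search(line)
        if match:
            devices.add(match.group("dev"))
    return sorted(devices)


# Two lines copied from the log in this report.
SAMPLE = (
    "[ 551.850568] XFS (sda1): Log I/O Error Detected. Shutting down filesystem\n"
    "[ 551.850618] XFS (sda1): xfs_log_force: error -5 returned.\n"
)

print(xfs_shutdown_devices(SAMPLE))
```

To use it on a live system, pass the saved output of `dmesg` as `dmesg_text`; an empty list means no shutdown message was found, so the module unload test would not be meaningful yet.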
[Regression Potential]
* based on upstream fix
* tested in the same environment
* potential damage to xfs
[Other Info]
Original Description:
#### This scenario is testing [iscsi <-> scsi <-> disk <-> xfs]
[ 551.125604] sd 2:0:0:1: rejecting I/O to offline device
[ 551.125615] sd 2:0:0:1: rejecting I/O to offline device
[ 551.125627] sd 2:0:0:1: rejecting I/O to offline device
[ 551.125639] sd 2:0:0:1: rejecting I/O to offline device
[ 551.135216] XFS (sda1): metadata I/O error: block 0xeffe01 ("xfs_trans_
[ 551.135274] XFS (sda1): page discard on page ffffea0002a89cc0, inode 0x83, offset 6442385408.
# when XFS shuts down because of an error (or offline disk, example):
[ 551.850498] XFS (sda1): xfs_do_
[ 551.850568] XFS (sda1): Log I/O Error Detected. Shutting down filesystem
[ 551.850618] XFS (sda1): xfs_log_force: error -5 returned.
[ 551.850630] XFS (sda1): Failing async write on buffer block 0x77ff08. Retrying async write.
[ 551.850634] XFS (sda1): Failing async write on buffer block 0x77ff10. Retrying async write.
[ 551.850638] XFS (sda1): Failing async write on buffer block 0x77ff01. Retrying async write.
[ 551.853171] XFS (sda1): Please umount the filesystem and rectify the problem(s)
[ 551.874131] XFS (sda1): metadata I/O error: block 0x1dffc49 ("xlog_iodone") error 5 numblks 64
[ 551.877993] XFS (sda1): xfs_do_
[ 551.899036] XFS (sda1): xfs_log_force: error -5 returned.
[ 569.323074] XFS (sda1): xfs_log_force: error -5 returned.
[ 599.403085] XFS (sda1): xfs_log_force: error -5 returned.
[ 629.483111] XFS (sda1): xfs_log_force: error -5 returned.
[ 659.563115] XFS (sda1): xfs_log_force: error -5 returned.
[ 689.643014] XFS (sda1): xfs_log_force: error -5 returned.
# when I execute:
# sudo umount /dev/sda1:
[81634.923043] XFS (sda1): xfs_log_force: error -5 returned.
[81640.739097] XFS (sda1): xfs_log_force: error -5 returned.
[81640.739137] XFS (sda1): Unmounting Filesystem
[81640.739463] XFS (sda1): xfs_log_force: error -5 returned.
[81640.739508] XFS (sda1): xfs_log_force: error -5 returned.
[81640.742741] sd 2:0:0:1: rejecting I/O to offline device
[81640.745576] blk_update_request: 25 callbacks suppressed
[81640.745601] blk_update_request: I/O error, dev sda, sector 0
# i was able to umount and then to remove iscsi disk.
# but if i try to unload the xfs module:
inaddy@
[82211.059301] =======
[82211.063764] BUG xfs_log_ticket (Tainted: G OE ): Objects remaining in xfs_log_ticket on kmem_cache_close()
[82211.067450] -------
[82211.067450]
[82211.070580] INFO: Slab 0xffffea0002eb7640 objects=22 used=1 fp=0xffff8800ba
[82211.074430] INFO: Object 0xffff8800badd9228 @offset=552
[82211.076133] kmem_cache_destroy xfs_log_ticket: Slab cache still has objects
AND
[82211.059301] =======
[82211.063764] BUG xfs_log_ticket (Tainted: G OE ): Objects remaining in xfs_log_ticket on kmem_cache_close()
[82211.067450] -------
[82211.067450]
[82211.070570] Disabling lock debugging due to kernel taint
[82211.070580] INFO: Slab 0xffffea0002eb7640 objects=22 used=1 fp=0xffff8800ba
[82211.073964] CPU: 3 PID: 32230 Comm: rmmod Tainted: G B OE 4.4.0-74-generic #95~14.04.1-Ubuntu
[82211.073970] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-
[82211.073975] 0000000000000000 ffff8800baf53d28 ffffffff813dce3c ffffea0002eb7640
[82211.073984] ffff880036614300 ffff8800baf53e00 ffffffff811dc4b4 ffff880000000020
[82211.073991] ffff8800baf53e10 ffff8800baf53dc0 656a624f02e294c0 616d657220737463
[82211.074013] Call Trace:
[82211.074039] [<ffffffff813dc
[82211.074066] [<ffffffff811dc
[82211.074081] [<ffffffff8118c
[82211.074089] [<ffffffff811dc
[82211.074097] [<ffffffff811df
[82211.074106] [<ffffffff811e1
[82211.074127] [<ffffffff811e1
[82211.074157] [<ffffffff811a8
[82211.074289] [<ffffffffc031a
[82211.074382] [<ffffffffc031a
[82211.074395] [<ffffffff81101
[82211.074404] [<ffffffff81079
[82211.074416] [<ffffffff81806
[82211.074430] INFO: Object 0xffff8800badd9228 @offset=552
[82211.076133] kmem_cache_destroy xfs_log_ticket: Slab cache still has objects
[82211.078221] CPU: 3 PID: 32230 Comm: rmmod Tainted: G B OE 4.4.0-74-generic #95~14.04.1-Ubuntu
[82211.078226] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-
[82211.078230] 0000000000000000 ffff8800baf53e60 ffffffff813dce3c ffff880036614300
[82211.078238] ffff880036614300 ffff8800baf53ec8 ffffffff811a8648 0000000000000000
[82211.078245] 00ff8800baf53eb8 ffff8800baf53e80 ffff8800baf53e80 ffff8800baf53e90
[82211.078253] Call Trace:
[82211.078262] [<ffffffff813dc
[82211.078271] [<ffffffff811a8
[82211.078334] [<ffffffffc031a
[82211.078382] [<ffffffffc031a
[82211.078389] [<ffffffff81101
[82211.078395] [<ffffffff81079
[82211.078401] [<ffffffff81806
[82269.173849] SGI XFS with ACLs, security attributes, realtime, no debug enabled
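To spot the leak in a saved kernel log without reading the whole trace, the messages above can be summarised per cache. This is only a sketch: the function name `leaked_slab_objects` is hypothetical, and it assumes that each "INFO: Slab" line belongs to the most recent "Objects remaining in ..." line, which matches the excerpt above but is not guaranteed for every kernel.

```python
import re

# Patterns modelled on the messages quoted in this report.
REMAINING_RE = re.compile(
    r"Objects remaining in (?P<cache>\S+) on kmem_cache_close\(\)"
)
SLAB_RE = re.compile(r"INFO: Slab \S+ objects=(?P<objects>\d+) used=(?P<used>\d+)")


def leaked_slab_objects(dmesg_text):
    """Map each leaking cache name to the sum of 'used' counts of its slabs."""
    leaks = {}
    current = None  # cache named by the most recent "Objects remaining" line
    for line in dmesg_text.splitlines():
        match = REMAINING_RE.search(line)
        if match:
            current = match.group("cache")
            leaks.setdefault(current, 0)
            continue
        match = SLAB_RE.search(line)
        if match and current is not None:
            leaks[current] += int(match.group("used"))
    return leaks


# Lines copied from the first excerpt in this report (truncation kept as-is).
SAMPLE = (
    "[82211.063764] BUG xfs_log_ticket (Tainted: G OE ): Objects remaining in "
    "xfs_log_ticket on kmem_cache_close()\n"
    "[82211.070580] INFO: Slab 0xffffea0002eb7640 objects=22 used=1 fp=0xffff8800ba\n"
    "[82211.076133] kmem_cache_destroy xfs_log_ticket: Slab cache still has objects\n"
)

print(leaked_slab_objects(SAMPLE))
```

For the excerpt above this reports one object still in use in the xfs_log_ticket cache, which is the ticket leaked at shutdown.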
description: | updated |
Changed in linux (Ubuntu): | |
importance: | Medium → Low |
Changed in linux (Ubuntu Xenial): | |
status: | New → In Progress |
Changed in linux (Ubuntu): | |
status: | Confirmed → In Progress |
Changed in linux (Ubuntu Xenial): | |
importance: | Undecided → Low |
assignee: | nobody → Rafael David Tinoco (inaddy) |
milestone: | none → ubuntu-16.04.3 |
Changed in linux (Ubuntu Xenial): | |
status: | In Progress → Fix Committed |
tags: | added: sts |
Changed in linux (Ubuntu): | |
status: | In Progress → Fix Released |
assignee: | Rafael David Tinoco (inaddy) → nobody |
Changed in linux (Ubuntu Xenial): | |
assignee: | Rafael David Tinoco (inaddy) → nobody |
# the following commit solves the issue:
commit af055e37a91d215d7174d0b84c86795ca81086a7
Author: Brian Foster <email address hidden>
Date: Mon Feb 8 15:00:02 2016 +1100
xfs: fix xfs_log_ticket leak in xfs_end_io() after fs shutdown
If the filesystem has shut down, xfs_end_io() currently sets an
error on the ioend and proceeds to ioend destruction. The ioend
might contain a truncate transaction if the I/O extended the size of
the file. This transaction is only cleaned up in
xfs_setfilesize_ioend(), however, which is skipped in this case.
This results in an xfs_log_ticket leak message when the associate
cache slab is destroyed (e.g., on rmmod).
This was originally reproduced by xfs/141 on a distro kernel. The
problem is reproducible on an upstream kernel, but not easily
detected in current upstream if the xfs_log_ticket cache happens to
be merged with another cache. This can be reproduced more
deterministically with the 'slab_nomerge' kernel boot option.
Update xfs_end_io() to proceed with normal end I/O processing after
an error is set on an ioend due to fs shutdown. The I/O type-based
processing is already designed to handle an I/O error and ensure
that the ioend is cleaned up correctly.
Signed-off-by: Brian Foster <email address hidden>
Reviewed-by: Dave Chinner <email address hidden>
Signed-off-by: Dave Chinner <email address hidden>
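The control-flow change described in the commit message can be illustrated with a toy model. This is not the kernel code: the classes and functions below are hypothetical Python stand-ins whose names only echo xfs_end_io() and xfs_setfilesize_ioend(), and the "cache" is just a counter of live tickets. It assumes, as the commit message states, that the normal end I/O path already cleans up the transaction when an error is set.

```python
EIO = 5  # the log above shows "error -5"


class TicketCache:
    """Toy stand-in for the xfs_log_ticket slab cache: counts live objects."""

    def __init__(self):
        self.live = 0


class Ioend:
    """Toy I/O completion that may carry a file-size update transaction."""

    def __init__(self, cache):
        self.error = 0
        self.cache = cache
        self.has_trans = True
        cache.live += 1  # the transaction holds one log ticket


def setfilesize_ioend(ioend):
    # Stand-in for xfs_setfilesize_ioend(): commits or, on error, cancels
    # the transaction. Either way the ticket is released.
    if ioend.has_trans:
        ioend.has_trans = False
        ioend.cache.live -= 1


def end_io_before_fix(ioend, fs_shutdown):
    if fs_shutdown:
        ioend.error = -EIO
        return  # jumps straight to ioend destruction: ticket never released
    setfilesize_ioend(ioend)


def end_io_after_fix(ioend, fs_shutdown):
    if fs_shutdown:
        ioend.error = -EIO
    # Normal end I/O processing continues and handles the error case.
    setfilesize_ioend(ioend)


def live_tickets_after(end_io, fs_shutdown):
    """Run one I/O completion and return how many tickets remain allocated."""
    cache = TicketCache()
    end_io(Ioend(cache), fs_shutdown)
    return cache.live


print(live_tickets_after(end_io_before_fix, True))
print(live_tickets_after(end_io_after_fix, True))
```

In this model the pre-fix path leaves one ticket allocated after a shutdown, which corresponds to the single in-use object reported on rmmod, while the post-fix path releases it.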