[Impact]
If you attempt to balance a btrfs filesystem that is nearly full, and that filesystem has had a lot of small, medium and large files created and deleted such that the b-tree needs to be rotated, then when the balance fails due to not having enough free space, the kernel oopses and the btrfs filesystem hangs.
It doesn't appear to cause any filesystem corruption, and is reproducible every time on affected filesystems.
The following oops is generated:
general protection fault: 0000 [#1] SMP PTI
CPU: 0 PID: 18440 Comm: btrfs Not tainted 4.15.0-136-generic #140-Ubuntu
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
RIP: 0010:btrfs_set_root_node+0x5/0x60 [btrfs]
RSP: 0018:ffffb3db890a79e0 EFLAGS: 00010282
RAX: ffff8d7f73861ad0 RBX: ffff8d7f78455708 RCX: ffff8d7f6d9a5390
RDX: ffff8d7f73861ad0 RSI: a023775cfc0348a3 RDI: ffff8d7f6d9a5028
RBP: ffffb3db890a7a78 R08: 0000000000000044 R09: 0000000000000228
R10: ffff8d7f6d9a5000 R11: 0000000000000010 R12: ffffb3db890a7a08
R13: ffff8d7f6d9a5000 R14: ffff8d7f6d9a5028 R15: ffff8d7f74560000
FS:  00007f48d84498c0(0000) GS:ffff8d7f7fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fe4fbc1f000 CR3: 00000001799fc001 CR4: 0000000000160ef0
Call Trace:
 ? commit_fs_roots+0x130/0x1b0 [btrfs]
 ? btrfs_run_delayed_refs.part.70+0x80/0x190 [btrfs]
 btrfs_commit_transaction+0x42c/0x910 [btrfs]
 ? start_transaction+0x191/0x430 [btrfs]
 relocate_block_group+0x1e7/0x640 [btrfs]
 btrfs_relocate_block_group+0x18f/0x280 [btrfs]
 btrfs_relocate_chunk+0x38/0xd0 [btrfs]
 __btrfs_balance+0x972/0xcd0 [btrfs]
 ? insert_balance_item.isra.35+0x391/0x3c0 [btrfs]
 btrfs_balance+0x32c/0x5a0 [btrfs]
 btrfs_ioctl_balance+0x320/0x390 [btrfs]
 btrfs_ioctl+0x5a6/0x2490 [btrfs]
 ? lru_cache_add_active_or_unevictable+0x36/0xb0
 ? __handle_mm_fault+0x9fd/0x1290
 do_vfs_ioctl+0xa8/0x630
 ? btrfs_ioctl_get_supported_features+0x30/0x30 [btrfs]
 ? do_vfs_ioctl+0xa8/0x630
 ? __do_page_fault+0x2a1/0x4b0
 SyS_ioctl+0x79/0x90
 do_syscall_64+0x73/0x130
 entry_SYSCALL_64_after_hwframe+0x41/0xa6
RIP: 0033:0x7f48d7228317
RSP: 002b:00007ffd76d03e38 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f48d7228317
RDX: 00007ffd76d03ec8 RSI: 00000000c4009420 RDI: 0000000000000003
RBP: 00007ffd76d03ec8 R08: 0000000000000078 R09: 0000000000000000
R10: 0000562086e7f010 R11: 0000000000000246 R12: 0000000000000003
R13: 00007ffd76d057cb R14: 0000000000000002 R15: 0000000000000000
Code: 4d 85 e4 0f 84 56 fe ff ff 4d 89 04 24 41 c6 44 24 08 84 4d 89 4c 24 09 e9 42 fe ff ff 0f 0b e8 02 24 5e e0 66 90 0f 1f 44 00 00 <48> 8b 06 48 8b 0d c9 d4 99 e1 48 8b 15 d2 d4 99 e1 55 48 89 87
RIP: btrfs_set_root_node+0x5/0x60 [btrfs] RSP: ffffb3db890a79e0
I don't see this behaviour on any upstream kernel, and the first kernel to show this behaviour is 4.15.0-109-generic. The current 4.15.0-145-generic is still affected.
I believe that this is a regression introduced in the fixing of CVE-2019-19036.
[Testcase]
I haven't reliably been able to create a script which places a btrfs filesystem into the state necessary to reproduce this issue, so I have just provided my qcow2 image with my btrfs filesystem which reproduces the issue 100% of the time.
Download the image from here (warning size is 8.0gb):
https://people.canonical.com/~mruffell/sf311164/ubuntu18.04-server-2.qcow2
Make an Ubuntu 18.04 VM. Attach the ubuntu18.04-server-2.qcow2 image to a new virtio disk. Note that ubuntu18.04-server-2.qcow2 does not contain an operating system; it is a data-only volume.
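For example, attaching the volume to an existing libvirt VM could look roughly like this (the domain name and image path are placeholders, not values taken from this bug):

# attach the data-only qcow2 image as a virtio disk (target vdb) on the test VM
$ virsh attach-disk ubuntu1804-vm /var/lib/libvirt/images/ubuntu18.04-server-2.qcow2 vdb --subdriver qcow2 --persistent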
Mount the volume:
$ sudo mount /dev/vdb /mnt
Attempt to balance:
$ sudo btrfs filesystem balance start --full-balance /mnt
Segmentation fault (core dumped)
Check dmesg for the kernel oops: https://paste.ubuntu.com/p/wjJNqKBCfh/
If you install the test kernel from the following ppa:
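(The PPA link itself is not reproduced above; installing a test kernel from a PPA generally follows this pattern, with a placeholder PPA name and kernel version that would need to be replaced with the ones for this bug.)

# placeholder PPA name and version; substitute the actual test kernel PPA
$ sudo add-apt-repository ppa:PLACEHOLDER/test-kernel
$ sudo apt update
$ sudo apt install linux-image-unsigned-4.15.0-XXX-generic linux-modules-4.15.0-XXX-generic linux-modules-extra-4.15.0-XXX-generic
$ sudo reboot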
You should see this instead:
$ sudo btrfs filesystem balance start --full-balance /mnt
ERROR: error during balancing '/mnt': No space left on device
There may be more info in syslog - try dmesg | tail
Checking dmesg shows no kernel oops, and just info about the volume being too full to balance:
https://paste.ubuntu.com/p/4J8Gq2dtz4/
[Fix]
I found the problem to be introduced in 4.15.0-109-generic; 4.15.0-108-generic and earlier work fine, which means a regression was introduced between those two releases.
I bisected the problem down to the following commit:
ubuntu-bionic commit 6f536ce7a978531d38a21d092394616cefb54436
Author: Qu Wenruo <email address hidden>
Date: Tue May 19 10:13:20 2020 +0800
Subject: btrfs: reloc: fix reloc root leak and NULL pointer dereference
Link: https://paste.ubuntu.com/p/4qfWCM8ykh/
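The bisection itself was roughly of the following shape (the Ubuntu-4.15.0-* tag names are assumptions based on the usual Ubuntu kernel tagging scheme, not a transcript of my session):

# bisect between the last good and first bad Ubuntu kernel tags;
# build, boot and run the balance testcase at each step, then mark the result
$ git bisect start
$ git bisect bad Ubuntu-4.15.0-109.110
$ git bisect good Ubuntu-4.15.0-108.109
$ git bisect good   # or "git bisect bad", depending on the testcase result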
Unfortunately, I believe this is a bad backport. If you examine the original upstream commit:
commit 51415b6c1b117e223bc083e30af675cb5c5498f3
Author: Qu Wenruo <email address hidden>
Date: Tue May 19 10:13:20 2020 +0800
Subject: btrfs: reloc: fix reloc root leak and NULL pointer dereference
Link: https://github.com/torvalds/linux/commit/51415b6c1b117e223bc083e30af675cb5c5498f3
You will see that the 4.15 backport has calls to free_extent_buffer() and btrfs_put_fs_root(). Now, btrfs_put_fs_root() was renamed to btrfs_put_root() in the newer patches, and it contains logic to free relocated roots, so I think we might not need the calls to free_extent_buffer() to free the extents first, since that might be handled later.
The core issue is that we hit a general protection fault when attempting to access a root node, which means we have freed a root node we shouldn't have.
If we look at the backport in 5.4.y, aka, the one in Focal:
ubuntu-focal commit ecaee3a76ea998bc2fe20f056eb27f9bc837d116
Author: Qu Wenruo <email address hidden>
Date: Tue May 19 10:13:20 2020 +0800
Subject: btrfs: reloc: fix reloc root leak and NULL pointer dereference
Link: https://paste.ubuntu.com/p/PZrMqVt8Yk/
It seems upstream -stable omitted the calls to btrfs_put_root() entirely, and we don't need the calls to free_extent_buffer() because of it.
If I revert 6f536ce7a978531d38a21d092394616cefb54436 from ubuntu-bionic, cherry-pick ecaee3a76ea998bc2fe20f056eb27f9bc837d116 from ubuntu-focal, and build, the problem no longer reproduces.
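For reference, the verification build was along these lines (a sketch; the repository URLs and build commands are assumptions about the usual Ubuntu kernel workflow, not a transcript of my session):

# work in a checkout of the Bionic kernel source (repo URLs are assumptions)
$ git clone git://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/bionic ubuntu-bionic
$ cd ubuntu-bionic
# drop the problematic 4.15 backport
$ git revert 6f536ce7a978531d38a21d092394616cefb54436
# take the 5.4.y-style backport carried in Focal
$ git fetch git://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/focal master
$ git cherry-pick ecaee3a76ea998bc2fe20f056eb27f9bc837d116
# rebuild the kernel packages and install them in the test VM
$ fakeroot debian/rules clean
$ fakeroot debian/rules binary-headers binary-generic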
[Where problems could occur]
If a regression were to occur, it would affect users of btrfs filesystems, and would likely show during a routine balance operation. Since the issue is triggered during the cancellation of a balance operation, problems might occur for users with nearly full filesystems or filesystems that have existing corruption.
We are replacing a patch that was backported during the fixing of CVE-2019-19036, and replacing it with a backport provided by upstream developers, which cherry picks from 5.4.y to Bionic. The patch in 5.4.y is well tested by the community and is currently in the Focal kernel.
With all modifications to btrfs, there is a risk of data corruption and filesystem corruption for all btrfs users, since balances happen automatically and on a regular basis. If a regression does happen, users should remount their filesystems with the "skip_balance" mount option, back up their data, and attempt a repair if necessary.
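For example, stopping a paused or auto-resumed balance so data can be backed up could look roughly like this (the device and mount point are placeholders):

# prevent the interrupted balance from resuming on the next mount
$ sudo umount /mnt
$ sudo mount -o skip_balance /dev/vdb /mnt
# cancel the paused balance before backing up or repairing
$ sudo btrfs balance cancel /mnt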
[Other info]
A community member has hit this issue before I did, and has reported it upstream to linux-btrfs here, although no one knew what was happening:
https://www.spinics.net/lists/linux-btrfs/msg103367.html