large_dir in ext4 broken
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Triaged | High | Unassigned |
Bionic | Fix Released | High | Colin Ian King |
Focal | Fix Released | High | Unassigned |
Groovy | Won't Fix | High | Unassigned |
Hirsute | Fix Released | High | Unassigned |
Impish | Won't Fix | High | Unassigned |
Bug Description
== SRU, Bionic, Focal, Groovy, Hirsute, Impish ==
[Impact]
Creating millions of files on an ext4 partition with large_dir support by touching them will eventually trip an ext4 leaf node issue in the index hash. This occurs more frequently with smaller block sizes and ends with either an EEXIST or an EUCLEAN failure.
The failure occurs on the restart condition when performing do_split().
[ Fix ]
The fix protects do_split() from the restart condition, making it safe from both current and future ordering of goto statements in earlier sections of the code.
The fix is from a patch sent upstream and cc'd to Ted Tso, but it didn't appear on the ext4 mailing list, presumably because it was marked as spam.
[ Test Case ]
Without the fix, touching tens of thousands of empty files will trip the issue. It seems to occur more frequently under memory pressure and with smaller block sizes, e.g.:
sudo mkdir -p /mnt/tmpfs /mnt/storage
sudo mount -t tmpfs -o size=9000M tmpfs /mnt/tmpfs
sudo dd if=/dev/urandom of=/mnt/
sudo mkfs.ext4 -O large_dir -N 21000000 -O dir_index /mnt/tmpfs/ext4.img -b 1024 -F
sudo mount /mnt/tmpfs/ext4.img /mnt/storage
and compile and run the attached C program (see https:/
[Where problems could occur]
This changes the behaviour of directory index hashing, so there is a regression potential: the patch could introduce further index hashing issues when deciding whether or not a split is needed. The patch seems to cover all the necessary cases, so I believe this risk is relatively low. I have also tested this on all the kernel series in the SRU with 21,000,000 files, so I am confident we have enough test coverage to show the fix is OK.
-------
I believe I found a bug in ext4 in recent kernel versions.
I stumbled across this while I was trying to restore a backup to a new VM.
How to reproduce this bug:
1. Use a virtual/physical machine with "Ubuntu 18.04.5 LTS" and kernel version 4.15.0-144-generic.
2. Add a secondary disk to hold the test files.
3. Prepare and mount the filesystem with the 'large_dir' flag enabled:
mkfs.ext4 -m0 /dev/sdb1;
tune2fs -O large_dir /dev/sdb1;
mkdir /mnt/storage;
mount /dev/sdb1 /mnt/storage;
4. Change to the directory and create approx. 16 million files:
cd /mnt/storage;
i=0;
while (( $i < 20000000 )); do
i=$(( $i + 1 ));
(( $i % 1000 == 0 )) && echo $i;
touch file_$i.dat || break;
done
Expected behaviour:
- 20 million files should be created without error
What happened instead:
- The loop aborts with an error message:
# 16263100
# touch: cannot touch 'file_16263173.
- dmesg gives a little more detail:
# [Mon Jun 21 03:15:18 2021] EXT4-fs error (device sdb): dx_probe:855: inode #2: block 146221: comm touch: directory leaf block found instead of index block
Additional notes:
- This occurs on kernel version 4.15.0-144-generic
- Not sure, but I believe one test was run on 4.15.0-143-generic and failed too.
- Did not check against 4.15.0-142-generic
- On 4.15.0-141-generic, the problem does not exist. Behaviour is as expected.
affects: | linux-signed (Ubuntu) → linux (Ubuntu) |
Changed in linux (Ubuntu Bionic): | |
assignee: | nobody → Stefan Bader (smb) |
status: | New → Triaged |
Changed in linux (Ubuntu Bionic): | |
assignee: | Stefan Bader (smb) → Colin Ian King (colin-king) |
Changed in linux (Ubuntu Bionic): | |
importance: | Undecided → High |
Changed in linux (Ubuntu Focal): | |
importance: | Undecided → High |
status: | New → Triaged |
Changed in linux (Ubuntu Groovy): | |
importance: | Undecided → High |
status: | New → Triaged |
Changed in linux (Ubuntu Hirsute): | |
importance: | Undecided → High |
status: | New → Triaged |
Changed in linux (Ubuntu Impish): | |
importance: | Undecided → High |
status: | New → Triaged |
Changed in linux (Ubuntu Bionic): | |
status: | Triaged → Fix Committed |
Changed in linux (Ubuntu Focal): | |
status: | Triaged → Fix Committed |
Changed in linux (Ubuntu Groovy): | |
status: | Triaged → Fix Committed |
Changed in linux (Ubuntu Hirsute): | |
status: | Triaged → Fix Committed |
C source to reproduce the problem as fast as possible.
The -r option will remove the files.
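The attached program is not reproduced here. As a rough stand-in, a minimal sketch of a fast file-creation loop in C might look like the following. It avoids forking `touch` once per file, which is what makes the shell loop slow. The function names `create_files` and `remove_files` are hypothetical, the `file_<n>.dat` naming is borrowed from the shell loop in the original report, and `remove_files` only mirrors what the `-r` option is described as doing; none of this is taken from the actual attachment.

```c
/* Hypothetical sketch, NOT the reproducer attached to this bug. */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Build "<dir>/file_<i>.dat", the naming used by the shell loop in the report. */
static int make_path(char *buf, size_t len, const char *dir, unsigned long i)
{
    int n = snprintf(buf, len, "%s/file_%lu.dat", dir, i);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Create empty files 1..count in dir. Stops at the first failure and
 * returns the number of files handled successfully (count on success). */
unsigned long create_files(const char *dir, unsigned long count)
{
    char path[4096];
    unsigned long i;

    for (i = 1; i <= count; i++) {
        int fd;

        if (make_path(path, sizeof(path), dir, i) < 0)
            break;
        fd = open(path, O_CREAT | O_WRONLY, 0644);
        if (fd < 0) {
            /* On an affected kernel this is where the failure would surface. */
            fprintf(stderr, "cannot create '%s': %s (errno %d)\n",
                    path, strerror(errno), errno);
            break;
        }
        close(fd);
        if (i % 1000 == 0)
            printf("%lu\n", i);   /* progress, like the shell loop */
    }
    return i - 1;
}

/* Remove files 1..count from dir; returns how many were actually unlinked. */
unsigned long remove_files(const char *dir, unsigned long count)
{
    char path[4096];
    unsigned long i, removed = 0;

    for (i = 1; i <= count; i++) {
        if (make_path(path, sizeof(path), dir, i) < 0)
            break;
        if (unlink(path) == 0)
            removed++;
    }
    return removed;
}
```

Calling `create_files("/mnt/storage", 20000000)` from a small `main()` would correspond to step 4 of the reproduction steps; a return value below the requested count, together with the errno printed on stderr, indicates where the loop aborted.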