3) xfstests (aka fstests)
Compared the 'consistently fail set' and the 'likelihood to fail' of flaky tests (see below).
No regressions.
Details:
---
Source:
- git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git
- commit 31f6949f ("ext4: verify unwritten extent conversion in buff-io")
The test set is the 'full' suite (default), not just the 'quick' group/suite.
The fail set is flaky (some tests fail/pass inconsistently), so the 'consistently fail set' (i.e., tests that fail in every run) across 10-20 runs was compared between the original and patched kernels.
Also looked for increased likelihood to fail (i.e., flaky tests that fail more often with the patches).
More Details:
---
The xfstests output/log includes the fail set in the 'Failures:' line, and the kernel version in the 'PLATFORM' line.
For example:
$ grep -e ^Fail -e ^PLATFORM xfstests.log.2020-10-27-13-26-44
PLATFORM -- Linux/s390x mfo-s390x-focal 5.4.0-52-generic #57-Ubuntu SMP Thu Oct 15 10:52:40 UTC 2020
Failures: btrfs/153 btrfs/200 btrfs/204 btrfs/205 btrfs/213 btrfs/220 btrfs/221 btrfs/222 btrfs/226 generic/166 generic/175 generic/260 generic/286 generic/301 generic/465 generic/484 generic/610
Those two lines from each log/run are used to identify, per kernel version (original/patched):
- how many runs have been done;
- which tests fail consistently (i.e., in all runs);
- which tests fail inconsistently (i.e., flaky) and how many times (to compare likelihood to fail).
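
The per-kernel bookkeeping described above could be sketched roughly as follows. This is an illustration only, not the script actually used for these results: the function name `summarize` is hypothetical, and it assumes the kernel release is the third field after `--` on the 'PLATFORM' line (as in the example log) and that the original and patched kernels report different release strings.

```python
import re
from collections import Counter, defaultdict


def summarize(log_texts):
    """Count runs and per-test failures for each kernel release.

    log_texts: iterable of strings, each the full text of one xfstests log.
    Returns {kernel: {"runs": int, "consistent": [tests], "flaky": {test: count}}}.
    """
    runs = Counter()              # kernel release -> number of runs seen
    fails = defaultdict(Counter)  # kernel release -> test name -> failure count

    for text in log_texts:
        # Assumption: 'PLATFORM -- <os/arch> <hostname> <kernel release> ...'
        platform = re.search(r"^PLATFORM\s+--\s+\S+\s+\S+\s+(\S+)", text, re.M)
        if platform is None:
            continue  # no PLATFORM line, so this is not counted as a run
        kernel = platform.group(1)
        runs[kernel] += 1

        # A run without a 'Failures:' line contributes no failures.
        failures = re.search(r"^Failures:(.*)$", text, re.M)
        for test in (failures.group(1).split() if failures else []):
            fails[kernel][test] += 1

    summary = {}
    for kernel, n in runs.items():
        summary[kernel] = {
            "runs": n,
            # Failed in every run of this kernel.
            "consistent": sorted(t for t, c in fails[kernel].items() if c == n),
            # Failed in some runs only, with the number of failing runs.
            "flaky": {t: c for t, c in fails[kernel].items() if c < n},
        }
    return summary
```

Feeding it the contents of every `xfstests.log.*` file gives one entry per kernel release. Comparing the "consistent" lists of the two kernels shows new consistent failures, and comparing the "flaky" counts (relative to "runs") shows whether a flaky test fails more often with the patches.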