Hm, I'm not sure I can give a thorough description since I don't understand enough about the exact workload myself. It is a fairly arbitrary workload generated by our users.
In the end, it boils down to creating, reading, and writing many (~20,000) sqlite files ranging from 16 KB to 12 GB across many folders, and doing random read/write IO to them. Each of the ~20,000 files lives in its own subdirectory under a single root directory, like so:
/path/00001/file.sqlite
/path/00002/file.sqlite
etc.
The 500GB volume holds approximately 275GB of such files. When copying /path/00001/file.sqlite to /path/00001/file.sqlite.new (and so on) with `cat`, two copies in parallel (via `xargs -P2` as in #11), the volume eventually (after multiple hours) hangs. If the copying is resumed from the last file successfully copied before the hang, the hang recurs almost immediately.
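For reference, the copy step looks roughly like the following sketch; the exact invocation used in #11 may differ, and the `find` pattern here just assumes the two-level layout shown above:

```sh
# Copy each file.sqlite to file.sqlite.new, two copies running in parallel (-P2).
find /path -mindepth 2 -maxdepth 2 -name 'file.sqlite' -print0 \
  | xargs -0 -P2 -I{} sh -c 'cat "{}" > "{}.new"'
```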