Update: apparently disabling CONFIG_KFENCE only reduces the probability of triggering the bug, so it's not a reliable fix. In all the traces that we got, the soft lockup always happens in a SLUB allocation function (__kmalloc, kmem_cache_alloc, and similar), with the instruction pointer on a static branch call. Static branches in SLUB are used only by KFENCE, hence the test of disabling KFENCE.
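For anyone who wants to repeat the test, here is a minimal sketch for checking whether a given kernel actually has KFENCE compiled in. The helper name kfence_built_in is hypothetical, and the /boot/config-$(uname -r) path is an assumption that holds on Debian/Ubuntu-style installs but may differ elsewhere. The sysfs parameter is normally readable only by root.

```shell
# Hypothetical helper: succeed if the given kernel config file enables KFENCE.
# $1 is the path to a plain-text kernel config.
kfence_built_in() {
    grep -q '^CONFIG_KFENCE=y$' "$1"
}

# Assumption: the distro installs the running kernel's config under /boot.
cfg="/boot/config-$(uname -r)"
if [ -r "$cfg" ] && kfence_built_in "$cfg"; then
    echo "KFENCE is compiled in"
    # Runtime sample interval in milliseconds; 0 means KFENCE is disabled.
    cat /sys/module/kfence/parameters/sample_interval 2>/dev/null || true
fi
```

If KFENCE is compiled in, booting with kfence.sample_interval=0 should disable it without a rebuild, which may be a quicker way to run the same experiment.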