It seems the crash happens in the dummy handler for the interrupt that is used to unlock a paravirtualized spinlock. The handler is registered when a vCPU is initialized, and the interrupt is then disabled right away. So I assume an interrupt arrives between request_irq and disable_irq.
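To make the suspected window concrete, here is a minimal userspace sketch in C. It is a toy model, not kernel code: every `toy_*` name is hypothetical, and it assumes that requesting the line enables it immediately and that the handler runs as soon as an event arrives while the line is enabled. The real dummy handler crashes the kernel when it runs; the model only counts how often it would have run.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a single interrupt line (hypothetical, not a kernel API). */
struct toy_irq {
    bool enabled;      /* events are delivered only while this is set */
    bool pending;      /* an event arrived but was not delivered yet */
    int  handler_hits; /* how often the placeholder handler ran */
};

/* Stand-in for the dummy handler, which is never supposed to run. */
static void toy_dummy_handler(struct toy_irq *irq)
{
    irq->handler_hits++;
}

/* Deliver a pending event, but only if the line is currently enabled. */
static void toy_deliver(struct toy_irq *irq)
{
    if (irq->enabled && irq->pending) {
        irq->pending = false;
        toy_dummy_handler(irq);
    }
}

/* An event arrives on the line. */
static void toy_raise(struct toy_irq *irq)
{
    irq->pending = true;
    toy_deliver(irq);
}

/* Assumption: requesting the line also enables it. */
static void toy_request(struct toy_irq *irq)
{
    irq->enabled = true;
    toy_deliver(irq);
}

static void toy_disable(struct toy_irq *irq)
{
    irq->enabled = false;
}

/*
 * Model the bring-up sequence: request, then disable.
 * If event_in_window is true, the event arrives between the two calls;
 * otherwise it arrives after the disable.
 * Returns how many times the dummy handler ran.
 */
int toy_bringup(bool event_in_window)
{
    struct toy_irq irq = { false, false, 0 };

    toy_request(&irq);
    if (event_in_window)
        toy_raise(&irq);
    toy_disable(&irq);
    if (!event_in_window)
        toy_raise(&irq);

    return irq.handler_hits;
}

/* Self-check of the two orderings. */
void toy_selfcheck(void)
{
    assert(toy_bringup(true) == 1);  /* event inside the window: handler runs */
    assert(toy_bringup(false) == 0); /* event after disable: stays pending */
}
```

Calling `toy_selfcheck()` from a `main` function exercises both orderings. In this model the handler runs only when the event lands between the request and the disable, which is the ordering assumed above.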
I tried two things on jammy (5.15.0 kernel, on top of the latest in proposed):
1. Disabled the paravirtualized spinlock (booted with xen_nopvspin as a boot parameter). As expected, the crash did not occur.
2. Tried the IRQ_NOAUTOEN irq flag, which keeps the irq from being enabled automatically when it is requested.
Unfortunately, the crash happened again:
[ 434.016304] smpboot: Booting Node 0 Processor 2 APIC 0x1
[ 434.020165] ------------[ cut here ]------------
[ 434.020166] kernel BUG at arch/x86/
[ 434.021628] invalid opcode: 0000 [#1] SMP PTI
[ 434.022901] CPU: 2 PID: 25 Comm: cpuhp/2 Kdump: loaded Not tainted 5.15.0-88-generic #98-Ubuntu
[ 434.025279] Hardware name: Xen HVM domU, BIOS 4.11.amazon 08/24/2006
[ 434.027056] RIP: 0010:dummy_
[ 434.028343] Code: e4 e8 54 e1 7a 00 8b 75 e4 84 c0 74 d2 44 89 ef e8 35 a5 7a 00 eb d2 44 89 ef e8 cb e1 7a 00 eb c8 66 0f 1f 84 00 00 00 00 00 <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 55 48 89 e5 41 57
[ 434.033297] RSP: 0000:ffffa06a40
[ 434.034830] RAX: ffffffffa5637ef0 RBX: ffff8e41c88b9b80 RCX: 0000000000000000
[ 434.036839] RDX: 0000000000000018 RSI: 0000000000000000 RDI: 0000000000000041
[ 434.038839] RBP: ffffa06a40180e68 R08: 0000000000000000 R09: ffffffffa757bf68
[ 434.040833] R10: 0000000000000000 R11: ffffa06a40180ff8 R12: 0000000000000000
[ 434.042818] R13: ffffa06a40180e7c R14: 0000000000000041 R15: ffffa06a40180f98
[ 434.044855] FS: 000000000000000
[ 434.047083] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 434.048762] CR2: 0000000000000000 CR3: 0000000149610001 CR4: 00000000001706e0
[ 434.050778] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 434.052767] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 434.054757] Call Trace:
[ 434.056028] <IRQ>
[ 434.057196] ? show_trace_
[ 434.058984] ? show_trace_
[ 434.060765] ? handle_
[ 434.062656] ? show_regs.
[ 434.064357] ? __die_body.
[ 434.065977] ? __die+0x2b/0x37
[ 434.067413] ? die+0x30/0x60
[ 434.068818] ? do_trap+0xbe/0x100
[ 434.070345] ? do_error_
[ 434.071947] ? xen_qlock_
[ 434.073581] ? exc_invalid_
[ 434.075205] ? xen_qlock_
[ 434.076832] ? asm_exc_
[ 434.078546] ? xen_qlock_
[ 434.080164] ? xen_qlock_
[ 434.081788] ? xen_qlock_
[ 434.083411] ? __handle_
[ 434.085302] handle_
[ 434.087097] handle_
[ 434.088754] handle_
[ 434.090383] generic_
[ 434.092056] handle_
[ 434.093769] ? irq_get_
[ 434.095422] evtchn_
[ 434.097253] __xen_evtchn_
[ 434.099016] xen_hvm_
[ 434.100797] __sysvec_
[ 434.102625] sysvec_
[ 434.104408] </IRQ>
[ 434.105591] <TASK>
[ 434.106783] asm_sysvec_
[ 434.108651] RIP: 0010:_raw_
[ 434.110734] Code: eb 8d cc cc cc 0f 1f 44 00 00 55 48 89 e5 e8 4a 08 37 ff 66 90 f7 c6 00 02 00 00 75 06 5d c3 cc cc cc cc fb 66 0f 1f 44 00 00 <5d> c3 cc cc cc cc 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 8b 07
[ 434.117127] RSP: 0000:ffffa06a40
[ 434.119113] RAX: 0000000000000001 RBX: ffff8e41ea1bb560 RCX: 000000000002dc00
[ 434.121575] RDX: 0000000000000001 RSI: 0000000000000246 RDI: ffff8e41ea1bb4a4
[ 434.124053] RBP: ffffa06a40153cc8 R08: 0000000000000000 R09: ffffffffa757bf68
[ 434.126521] R10: 0000000000000246 R11: ffff8e41ea1bb4a4 R12: ffff8e41ea1bb400
[ 434.128994] R13: ffff8e41c88b9b80 R14: 0000000000000041 R15: 0000000000000000
[ 434.131462] ? _raw_spin_
[ 434.133356] __setup_
[ 434.134923] ? kmem_cache_
[ 434.136773] request_
[ 434.138534] bind_ipi_
[ 434.140335] ? xen_qlock_
[ 434.141965] xen_init_
[ 434.143641] ? zhaoxin_
[ 434.145512] xen_cpu_
[ 434.147149] cpuhp_invoke_
[ 434.148932] ? blk_mq_
[ 434.150700] cpuhp_thread_
[ 434.152354] ? _raw_spin_
[ 434.154258] smpboot_
[ 434.155933] ? smpboot_
[ 434.157961] kthread+0x12a/0x150
[ 434.159440] ? set_kthread_
[ 434.161154] ret_from_
[ 434.162727] </TASK>
[ 434.163938] Modules linked in: cpuid tls binfmt_misc nls_iso8859_1 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd psmouse ixgbevf input_leds serio_raw dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua sch_fq_codel drm efi_pstore ip_tables x_tables autofs4