Issue found on node "onibi" with J-intel-iotg 5.15.0-1026.31 this cycle.
The veth.sh test in the net category hangs and times out, leaving the test report incomplete.
Running the test manually produces a kernel oops trace in dmesg.
ubuntu@onibi:~/autotest/client/tmp/ubuntu_kernel_selftests/src/linux/tools/testing/selftests/net$ sudo ./veth.sh
default - gro flag ok
- peer gro flag ok
- tso flag ok
- peer tso flag ok
- aggregation ok
- aggregation with TSO off ok
with gro on - gro flag ok
- peer gro flag ok
- tso flag ok
- peer tso flag ok
- aggregation with TSO off ok
default channels ok
with gro enabled on link down - gro flag ok
- peer gro flag ok
- tso flag ok
- peer tso flag ok
- aggregation with TSO off ok
setting tx channels ok
setting both rx and tx channels ok
bad setting: combined channels ok
setting invalid channels nr fail rx:3:3 tx:3:5 combined:n/a:n/a
bad setting: XDP with RX nr less than TX ok
(hangs here)
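To keep a hang like this from stalling a manual run, the test can be wrapped in a time limit. The following is a minimal sketch, not part of the selftests: `run_with_timeout` is a hypothetical helper name, the 300-second limit is an arbitrary assumption, and it relies on GNU coreutils `timeout`, which exits with status 124 when the limit is reached. A child stuck in uninterruptible sleep inside the kernel may still survive the signal.

```shell
#!/bin/sh
# Hypothetical helper: run a command under a time limit and report
# whether it finished or had to be stopped.
# Usage: run_with_timeout SECONDS COMMAND [ARGS...]
run_with_timeout() {
    limit=$1
    shift
    timeout "$limit" "$@"
    rc=$?
    # GNU timeout returns 124 when the command ran past the limit.
    if [ "$rc" -eq 124 ]; then
        echo "HUNG: '$*' exceeded ${limit}s"
    else
        echo "DONE: '$*' exited with status $rc"
    fi
    return "$rc"
}

# Example (assumed limit of 300 seconds), followed by a look at the kernel log:
#   run_with_timeout 300 sudo ./veth.sh
#   sudo dmesg | tail -n 100
```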
dmesg output:
[ 547.520923] BUG: unable to handle page fault for address: ffffb73800000001
[ 547.520999] #PF: supervisor write access in kernel mode
[ 547.521045] #PF: error_code(0x0002) - not-present page
[ 547.521089] PGD 100000067 P4D 100000067 PUD 0
[ 547.521133] Oops: 0002 [#1] SMP PTI
[ 547.521168] CPU: 1 PID: 1559 Comm: ip Not tainted 5.15.0-
[ 547.521233] Hardware name: Dell Inc. PowerEdge R310/05XKKK, BIOS 1.8.2 08/17/2011
[ 547.521293] RIP: 0010:veth_
[ 547.521342] Code: ff 41 89 9d 1c 01 00 00 49 21 85 e8 00 00 00 e9 74 ff ff ff 48 c7 c7 80 e3 b0 c0 e8 2b 3b 06 c1 b8 e4 ff ff ff 4d 85 ff 74 85 <49> c7 07 80 e3 b0 c0 e9 79 ff ff ff 48 c7 c7 20 e4 b0 c0 e8 09 3b
[ 547.521488] RSP: 0018:ffffb738c2
[ 547.521535] RAX: 00000000ffffffe4 RBX: 0000000000000db2 RCX: ffffb738c254fb20
[ 547.521594] RDX: ffffffffc0b0bf90 RSI: ffffb738c254f468 RDI: ffffffffc0b0e380
[ 547.521653] RBP: ffffb738c254f450 R08: 0000000000000001 R09: ffffb738c0081000
[ 547.521711] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8c65ced90000
[ 547.521769] R13: ffff8c65c12f6000 R14: 0000000000000000 R15: ffffb73800000001
[ 547.521828] FS: 00007faa028b3b8
[ 547.521895] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 547.521943] CR2: ffffb73800000001 CR3: 000000010f068000 CR4: 00000000000006e0
[ 547.522004] Call Trace:
[ 547.522029] <TASK>
[ 547.522052] ? veth_open+0x90/0x90 [veth]
[ 547.522094] dev_xdp_
[ 547.522135] dev_xdp_
[ 547.522171] ? __bpf_prog_
[ 547.522212] dev_change_
[ 547.522252] do_setlink+
[ 547.522288] ? dev_get_
[ 547.522326] __rtnl_
[ 547.522363] ? security_
[ 547.522406] ? skb_queue_
[ 547.522444] ? sock_def_
[ 547.522485] ? __netlink_
[ 547.522528] ? netlink_
[ 547.522566] ? rtnl_getlink+
[ 547.522611] ? kmem_cache_
[ 547.522657] rtnl_newlink+
[ 547.522692] rtnetlink_
[ 547.522731] ? rtnl_calcit.
[ 547.524524] netlink_
[ 547.526314] rtnetlink_
[ 547.528102] netlink_
[ 547.529837] netlink_
[ 547.531505] sock_sendmsg+
[ 547.533114] ____sys_
[ 547.534667] ? import_
[ 547.536164] ? sendmsg_
[ 547.537609] ___sys_
[ 547.539024] ? rseq_ip_
[ 547.540420] ? __rseq_
[ 547.541824] ? exit_to_
[ 547.543227] ? exit_to_
[ 547.544623] ? syscall_
[ 547.545993] ? __x64_sys_
[ 547.547334] __sys_sendmsg+
[ 547.548650] __x64_sys_
[ 547.549918] do_syscall_
[ 547.551134] ? exc_page_
[ 547.552322] entry_SYSCALL_
[ 547.553511] RIP: 0033:0x7faa02a07b17
[ 547.554680] Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
[ 547.557206] RSP: 002b:00007ffdbb
[ 547.558517] RAX: ffffffffffffffda RBX: 0000000063f5ffbc RCX: 00007faa02a07b17
[ 547.559818] RDX: 0000000000000000 RSI: 00007ffdbbca36e0 RDI: 0000000000000003
[ 547.561110] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000555ea5465830
[ 547.562386] R10: 00007faa02afa340 R11: 0000000000000246 R12: 0000000000000001
[ 547.563647] R13: 00007ffdbbca3790 R14: 0000000000000000 R15: 0000555ea4edb040
[ 547.564924] </TASK>
[ 547.566184] Modules linked in: algif_hash af_alg veth intel_powerclamp ipmi_ssif coretemp joydev input_leds binfmt_misc kvm_intel ipmi_si kvm dcdbas ipmi_devintf ipmi_msghandler intel_cstate mac_hid acpi_power_meter i7core_edac sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ramoops pstore_blk reed_solomon pstore_zone efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear mgag200 i2c_algo_bit hid_generic gpio_ich drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops mpt3sas cec rc_core usbhid raid_class drm pata_acpi hid lpc_ich bnx2 scsi_transport_sas
[ 547.575407] CR2: ffffb73800000001
[ 547.577070] ---[ end trace 3ebb9a2cada35096 ]---
[ 547.586349] RIP: 0010:veth_
[ 547.588039] Code: ff 41 89 9d 1c 01 00 00 49 21 85 e8 00 00 00 e9 74 ff ff ff 48 c7 c7 80 e3 b0 c0 e8 2b 3b 06 c1 b8 e4 ff ff ff 4d 85 ff 74 85 <49> c7 07 80 e3 b0 c0 e9 79 ff ff ff 48 c7 c7 20 e4 b0 c0 e8 09 3b
[ 547.591612] RSP: 0018:ffffb738c2
[ 547.593432] RAX: 00000000ffffffe4 RBX: 0000000000000db2 RCX: ffffb738c254fb20
[ 547.595282] RDX: ffffffffc0b0bf90 RSI: ffffb738c254f468 RDI: ffffffffc0b0e380
[ 547.597143] RBP: ffffb738c254f450 R08: 0000000000000001 R09: ffffb738c0081000
[ 547.599012] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8c65ced90000
[ 547.600888] R13: ffff8c65c12f6000 R14: 0000000000000000 R15: ffffb73800000001
[ 547.602764] FS: 00007faa028b3b8
[ 547.604675] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 547.606593] CR2: ffffb73800000001 CR3: 000000010f068000 CR4: 00000000000006e0
As this node was not tested with this test in the previous cycle, it has yet to be determined whether this is a regression.
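For reading the `#PF: error_code(0x0002)` line in the oops above, the sketch below decodes the three low bits of an x86 page-fault error code (bit 0: present/protection, bit 1: write, bit 2: user mode). `decode_pf_error` is a hypothetical helper written for illustration; it ignores the higher bits, such as reserved-bit and instruction-fetch faults.

```shell
#!/bin/sh
# Hypothetical helper: decode the low three bits of an x86 page-fault error code.
# Usage: decode_pf_error CODE   (decimal or 0x-prefixed hex)
decode_pf_error() {
    code=$(( $1 ))
    # Bit 0: 0 = page not present, 1 = protection violation
    if [ $(( code & 1 )) -ne 0 ]; then cause="protection violation"; else cause="not-present page"; fi
    # Bit 1: 0 = read, 1 = write
    if [ $(( code & 2 )) -ne 0 ]; then access="write"; else access="read"; fi
    # Bit 2: 0 = supervisor (kernel) mode, 1 = user mode
    if [ $(( code & 4 )) -ne 0 ]; then mode="user"; else mode="supervisor"; fi
    echo "$mode $access access, $cause"
}

decode_pf_error 0x0002   # prints "supervisor write access, not-present page"
```

This matches the two `#PF:` lines in the trace: a kernel-mode write to an unmapped address, which is the value reported in CR2.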