[Possible Regression] powerpc/eeh:eeh-basic.sh failed on P9 with 5.4.0-156-generic
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
ubuntu-kernel-tests |
New
|
Undecided
|
Unassigned |
Bug Description
Issue found on P9 node baltar and P8 node dryden with 5.4.0-156-generic.
This issue does not exist with 5.4.0-155-generic
Manually tested on P9 node baltar, with:
* 5.4.0-155-generic with 5.4.0-155-generic source - PASSED (2 out of 2 attempts)
* 5.4.0-155-generic with 5.4.0-156-generic source - PASSED (1 out of 1 attempt)
* 5.4.0-156-generic with 5.4.0-155-generic source - FAILED (2 out of 2 attempts)
* 5.4.0-156-generic with 5.4.0-156-generic source - FAILED (2 out of 2 attempts)
Test output:
0002:01:00.0, Failed to recover!
Breaking
....
0003:01:00.0, Failed to recover!
Breaking
0004:02:00.0, Recovered after 0 seconds
....
0005:01:00.0, Failed to recover!
3 devices failed to recover (4 tested)
dmesg output (trimmed to fit launchpad restriction, please find attachment for the original output):
[ 239.575350] i40e 0002:01:00.0: Going to break: [mem 0x6008008000000
[ 239.587625] i40e 0002:01:00.0: eeh_dev_
[ 239.587648] EEH: Recovering PHB#2-PE#0
[ 239.587668] EEH: PE location: Onboard LAN, PHB location: N/A
[ 239.587682] EEH: Frozen PHB#2-PE#0 detected
[ 239.587692] EEH: Call Trace:
[ 239.587710] EEH: [c00000000004cb6c] __eeh_send_
[ 239.587726] EEH: [c000000000045824] eeh_dev_
[ 239.587745] EEH: [c000000000045d30] eeh_dev_
[ 239.587761] EEH: [c00000000048cc68] __vfs_write+
[ 239.587775] EEH: [c00000000048e00c] vfs_write+
[ 239.587788] EEH: [c000000000490c94] ksys_write+
[ 239.587802] EEH: [c00000000000b378] system_
[ 239.587816] EEH: This PCI device has failed 1 times in the last hour and will be permanently disabled after 5000 failures.
[ 239.587817] EEH: Notify device drivers to shutdown
[ 239.587820] EEH: Beginning: 'error_detected(IO frozen)'
[ 239.587824] PCI 0002:01:00.0#0000: EEH: Invoking i40e->error_
[ 239.587827] i40e 0002:01:00.0: i40e_pci_
[ 239.588948] i40e 0002:01:00.0: VSI seid 396 Rx ring 0 dissable timeout
[ 239.589205] i40e 0002:01:00.0: VSI seid 396 Rx ring 1 dissable timeout
...
[ 239.618060] i40e 0002:01:00.0: VSI seid 396 Rx ring 117 dissable timeout
[ 239.618305] i40e 0002:01:00.0: VSI seid 396 Rx ring 118 dissable timeout
[ 239.678139] i40e 0002:01:00.0: VSI seid 396 Tx ring 0 disable timeout
[ 239.688707] i40e 0002:01:00.0: VSI seid 400 Rx ring 383 dissable timeout
[ 239.746150] i40e 0002:01:00.0: VSI seid 400 Tx ring 383 disable timeout
[ 239.746322] PCI 0002:01:00.0#0000: EEH: i40e driver reports: 'need reset'
[ 239.746325] PCI 0002:01:00.1#0000: EEH: Invoking i40e->error_
[ 239.746326] i40e 0002:01:00.1: i40e_pci_
[ 239.752006] i40e 0002:01:00.1: VSI seid 397 Rx ring 0 dissable timeout
[ 239.752243] i40e 0002:01:00.1: VSI seid 397 Rx ring 1 dissable timeout
[ 239.752479] i40e 0002:01:00.1: VSI seid 397 Rx ring 2 dissable timeout
...
[ 239.779652] i40e 0002:01:00.1: VSI seid 397 Rx ring 116 dissable timeout
[ 239.779889] i40e 0002:01:00.1: VSI seid 397 Rx ring 117 dissable timeout
[ 239.780126] i40e 0002:01:00.1: VSI seid 397 Rx ring 118 dissable timeout
[ 239.838150] i40e 0002:01:00.1: VSI seid 397 Tx ring 0 disable timeout
[ 239.848439] i40e 0002:01:00.1: VSI seid 401 Rx ring 383 dissable timeout
[ 239.850134] i40e 0002:01:00.3: ARQ: Unknown event 0x0000 ignored
[ 239.850139] i40e 0002:01:00.3: ARQ: Unknown event 0x0000 ignored
[ 239.850144] i40e 0002:01:00.3: ARQ: Unknown event 0x0000 ignored
...
[ 239.850762] i40e 0002:01:00.2: ARQ: Unknown event 0x0000 ignored
[ 239.850766] i40e 0002:01:00.2: ARQ: Unknown event 0x0000 ignored
[ 239.850770] i40e 0002:01:00.2: ARQ: Unknown event 0x0000 ignored
[ 239.906076] i40e 0002:01:00.1: VSI seid 401 Tx ring 383 disable timeout
[ 239.906192] PCI 0002:01:00.1#0000: EEH: i40e driver reports: 'need reset'
[ 239.906194] PCI 0002:01:00.2#0000: EEH: Invoking i40e->error_
[ 239.906196] i40e 0002:01:00.2: i40e_pci_
[ 239.907037] i40e 0002:01:00.2: VSI seid 398 Rx ring 0 dissable timeout
[ 239.907271] i40e 0002:01:00.2: VSI seid 398 Rx ring 1 dissable timeout
[ 239.907506] i40e 0002:01:00.2: VSI seid 398 Rx ring 2 dissable timeout
...
[ 239.933716] i40e 0002:01:00.2: VSI seid 398 Rx ring 114 dissable timeout
[ 239.933950] i40e 0002:01:00.2: VSI seid 398 Rx ring 115 dissable timeout
[ 239.934185] i40e 0002:01:00.2: VSI seid 398 Rx ring 116 dissable timeout
[ 239.934419] i40e 0002:01:00.2: VSI seid 398 Rx ring 117 dissable timeout
[ 239.934653] i40e 0002:01:00.2: VSI seid 398 Rx ring 118 dissable timeout
[ 239.994162] i40e 0002:01:00.2: VSI seid 398 Tx ring 0 disable timeout
[ 240.004456] i40e 0002:01:00.2: VSI seid 402 Rx ring 383 dissable timeout
[ 240.062532] i40e 0002:01:00.2: VSI seid 402 Tx ring 383 disable timeout
[ 240.062644] PCI 0002:01:00.2#0000: EEH: i40e driver reports: 'need reset'
[ 240.062647] PCI 0002:01:00.3#0000: EEH: Invoking i40e->error_
[ 240.062649] i40e 0002:01:00.3: i40e_pci_
[ 240.063943] i40e 0002:01:00.3: VSI seid 399 Rx ring 0 dissable timeout
[ 240.064178] i40e 0002:01:00.3: VSI seid 399 Rx ring 1 dissable timeout
[ 240.064412] i40e 0002:01:00.3: VSI seid 399 Rx ring 2 dissable timeout
...
[ 240.091741] i40e 0002:01:00.3: VSI seid 399 Rx ring 117 dissable timeout
[ 240.091979] i40e 0002:01:00.3: VSI seid 399 Rx ring 118 dissable timeout
[ 240.150216] i40e 0002:01:00.3: VSI seid 399 Tx ring 0 disable timeout
[ 240.160564] i40e 0002:01:00.3: VSI seid 403 Rx ring 383 dissable timeout
[ 240.218163] i40e 0002:01:00.3: VSI seid 403 Tx ring 383 disable timeout
[ 240.218272] PCI 0002:01:00.3#0000: EEH: i40e driver reports: 'need reset'
[ 240.218274] EEH: Finished:
[ 240.218277] EEH: Collect temporary log
[ 240.218342] EEH: of node=0002:01:00.0
[ 240.218344] EEH: PCI device/vendor: 15898086
[ 240.218346] EEH: PCI cmd/status register: 00100546
[ 240.218346] EEH: PCI-E capabilities and status follow:
[ 240.218355] EEH: PCI-E 00: 0002e010 10008ce4 00002150 00425083
[ 240.218361] EEH: PCI-E 10: 10830000 00000000 00000000 00000000
[ 240.218361] EEH: PCI-E 20: 00000000
[ 240.218362] EEH: PCI-E AER capability register set follows:
[ 240.218369] EEH: PCI-E AER 00: 14020001 00000000 00000000 00062030
[ 240.218375] EEH: PCI-E AER 10: 00000000 00002000 000001e0 00000000
[ 240.218380] EEH: PCI-E AER 20: 00000000 00000000 00000000 00000000
[ 240.218382] EEH: PCI-E AER 30: 00000000 00000000
[ 240.218384] EEH: of node=0002:01:00.1
[ 240.218385] EEH: PCI device/vendor: 15898086
[ 240.218387] EEH: PCI cmd/status register: 00100546
[ 240.218387] EEH: PCI-E capabilities and status follow:
[ 240.218395] EEH: PCI-E 00: 0002e010 10008ce4 00002150 00425083
[ 240.218401] EEH: PCI-E 10: 10830000 00000000 00000000 00000000
[ 240.218401] EEH: PCI-E 20: 00000000
[ 240.218402] EEH: PCI-E AER capability register set follows:
[ 240.218409] EEH: PCI-E AER 00: 14020001 00000000 00000000 00062030
[ 240.218415] EEH: PCI-E AER 10: 00000000 00002000 000001e0 00000000
[ 240.218420] EEH: PCI-E AER 20: 00000000 00000000 00000000 00000000
[ 240.218422] EEH: PCI-E AER 30: 00000000 00000000
[ 240.218424] EEH: of node=0002:01:00.2
[ 240.218425] EEH: PCI device/vendor: 15898086
[ 240.218427] EEH: PCI cmd/status register: 00100546
[ 240.218427] EEH: PCI-E capabilities and status follow:
[ 240.218435] EEH: PCI-E 00: 0002e010 10008ce4 00002150 00425083
[ 240.218441] EEH: PCI-E 10: 10830000 00000000 00000000 00000000
[ 240.218441] EEH: PCI-E 20: 00000000
[ 240.218442] EEH: PCI-E AER capability register set follows:
[ 240.218449] EEH: PCI-E AER 00: 14020001 00000000 00000000 00062030
[ 240.218454] EEH: PCI-E AER 10: 00000000 00002000 000001e0 00000000
[ 240.218460] EEH: PCI-E AER 20: 00000000 00000000 00000000 00000000
[ 240.218462] EEH: PCI-E AER 30: 00000000 00000000
[ 240.218463] EEH: of node=0002:01:00.3
[ 240.218465] EEH: PCI device/vendor: 15898086
[ 240.218467] EEH: PCI cmd/status register: 00100546
[ 240.218467] EEH: PCI-E capabilities and status follow:
[ 240.218475] EEH: PCI-E 00: 0002e010 10008ce4 00002150 00425083
[ 240.218480] EEH: PCI-E 10: 10830000 00000000 00000000 00000000
[ 240.218481] EEH: PCI-E 20: 00000000
[ 240.218482] EEH: PCI-E AER capability register set follows:
[ 240.218489] EEH: PCI-E AER 00: 14020001 00000000 00000000 00062030
[ 240.218495] EEH: PCI-E AER 10: 00000000 00002000 000001e0 00000000
[ 240.218500] EEH: PCI-E AER 20: 00000000 00000000 00000000 00000000
[ 240.218502] EEH: PCI-E AER 30: 00000000 00000000
[ 240.218503] PHB4 PHB#2 Diag-data (Version: 1)
[ 240.218504] brdgCtl: 00000002
[ 240.218505] RootSts: 00000040 00402000 a0830008 00100107 00002000
[ 240.218506] PhbSts: 0000001c00000000 0000001c00000000
[ 240.218507] Lem: 0000000100280000 0000000000000000 0000000100000000
[ 240.218508] PhbErr: 0000088000000000 0000008000000000 2148000098000240 a008400000000000
[ 240.218509] RxeArbErr: 4000200000000000 0000200000000000 00009fde30000000 0000000000000000
[ 240.218510] PblErr: 0000000001000000 0000000001000000 0000000000000000 0000000000000000
[ 240.218510] RegbErr: 0000004000000000 0000004000000000 8800003c00000000 0000000007011000
[ 240.218511] PE[000] A/B: a440002a01000000 8000000000000000
[ 240.218514] EEH: Reset without hotplug activity
[ 246.410542] i40e 0002:01:00.0: not ready 1023ms after bus reset; waiting
[ 247.466583] i40e 0002:01:00.0: not ready 2047ms after bus reset; waiting
[ 249.578691] i40e 0002:01:00.0: not ready 4095ms after bus reset; waiting
[ 253.930949] i40e 0002:01:00.0: not ready 8191ms after bus reset; waiting
[ 262.379518] i40e 0002:01:00.0: not ready 16383ms after bus reset; waiting
[ 279.532384] i40e 0002:01:00.0: not ready 32767ms after bus reset; waiting
[ 301.231575] aacraid 0003:01:00.0: Going to break: [mem 0x600c180000000
[ 301.245067] aacraid 0003:01:00.0: eeh_dev_
[ 314.349746] i40e 0002:01:00.0: not ready 65535ms after bus reset; giving up
[ 317.454022] i40e 0002:01:00.0: enabling device (0140 -> 0142)
[ 317.454029] i40e 0002:01:00.1: enabling device (0140 -> 0142)
[ 317.454035] i40e 0002:01:00.2: enabling device (0140 -> 0142)
[ 317.454040] i40e 0002:01:00.3: enabling device (0140 -> 0142)
[ 317.466091] EEH: Beginning: 'slot_reset'
[ 317.466100] PCI 0002:01:00.0#0000: EEH: Invoking i40e->slot_reset()
[ 317.466456] PCI 0002:01:00.0#0000: EEH: i40e driver reports: 'recovered'
[ 317.466459] PCI 0002:01:00.1#0000: EEH: Invoking i40e->slot_reset()
[ 317.466795] PCI 0002:01:00.1#0000: EEH: i40e driver reports: 'recovered'
[ 317.466797] PCI 0002:01:00.2#0000: EEH: Invoking i40e->slot_reset()
[ 317.467143] PCI 0002:01:00.2#0000: EEH: i40e driver reports: 'recovered'
[ 317.467145] PCI 0002:01:00.3#0000: EEH: Invoking i40e->slot_reset()
[ 317.467492] PCI 0002:01:00.3#0000: EEH: i40e driver reports: 'recovered'
[ 317.467494] EEH: Finished:
[ 317.467495] EEH: Notify device driver to resume
[ 317.467496] EEH: Beginning: 'resume'
[ 317.467498] PCI 0002:01:00.0#0000: EEH: Invoking i40e->resume()
[ 317.483355] i40e 0002:01:00.0 enP2p1s0f0: NIC Link is Down
[ 317.727000] PCI 0002:01:00.0#0000: EEH: i40e driver reports: 'none'
[ 317.727005] PCI 0002:01:00.1#0000: EEH: Invoking i40e->resume()
[ 317.986624] PCI 0002:01:00.1#0000: EEH: i40e driver reports: 'none'
[ 317.986629] PCI 0002:01:00.2#0000: EEH: Invoking i40e->resume()
[ 318.245813] PCI 0002:01:00.2#0000: EEH: i40e driver reports: 'none'
[ 318.245817] PCI 0002:01:00.3#0000: EEH: Invoking i40e->resume()
[ 318.505912] PCI 0002:01:00.3#0000: EEH: i40e driver reports: 'none'
[ 318.505915] EEH: Finished:'resume'
[ 318.505917] EEH: Recovery successful.
[ 318.505921] EEH: Recovering PHB#3-PE#1fd
[ 318.505940] EEH: PE location: Onboard SAS, PHB location: N/A
[ 318.505958] EEH: Frozen PHB#3-PE#1fd detected
[ 318.505965] EEH: Call Trace:
[ 318.505977] EEH: [c00000000004cb6c] __eeh_send_
[ 318.505996] EEH: [c000000000045824] eeh_dev_
[ 318.506016] EEH: [c000000000045d30] eeh_dev_
[ 318.506044] EEH: [c00000000048cc68] __vfs_write+
[ 318.506061] EEH: [c00000000048e00c] vfs_write+
[ 318.506079] EEH: [c000000000490c94] ksys_write+
[ 318.506097] EEH: [c00000000000b378] system_
[ 318.506114] EEH: This PCI device has failed 1 times in the last hour and will be permanently disabled after 5000 failures.
[ 318.506114] EEH: Notify device drivers to shutdown
[ 318.506116] EEH: Beginning: 'error_detected(IO frozen)'
[ 318.506118] PCI 0003:01:00.0#01fd: EEH: Invoking aacraid-
[ 318.506121] aacraid 0003:01:00.0: aacraid: PCI error detected 2
[ 318.506173] sd 0:2:0:0: [sda] tag#728 FAILED Result: hostbyte=
[ 318.506175] sd 0:2:0:0: [sda] tag#731 FAILED Result: hostbyte=
[ 318.506180] sd 0:2:0:0: [sda] tag#912 FAILED Result: hostbyte=
[ 318.506182] sd 0:2:0:0: [sda] tag#731 CDB: Write(10) 2a 00 00 01 bd 70 00 01 00 00
[ 318.506185] blk_update_request: I/O error, dev sda, sector 114032 op 0x1:(WRITE) flags 0x800 phys_seg 2 prio class 0
[ 318.506188] sd 0:2:0:0: [sda] tag#728 CDB: Synchronize Cache(10) 35 00 00 00 00 00 00 00 00 00
[ 318.506190] sd 0:2:0:0: [sda] tag#912 CDB: Write(10) 2a 00 69 80 99 00 00 02 00 00
[ 318.506195] blk_update_request: I/O error, dev sda, sector 1770035456 op 0x1:(WRITE) flags 0x4800 phys_seg 4 prio class 0
[ 318.506211] sd 0:2:0:0: [sda] tag#913 FAILED Result: hostbyte=
[ 318.506215] sd 0:2:0:0: [sda] tag#913 CDB: Write(10) 2a 00 69 80 9b 00 00 02 00 00
[ 318.506219] blk_update_request: I/O error, dev sda, sector 1770035968 op 0x1:(WRITE) flags 0x4800 phys_seg 4 prio class 0
[ 318.506226] sd 0:2:0:0: [sda] tag#914 FAILED Result: hostbyte=
[ 318.506232] EXT4-fs warning (device sda2): ext4_end_bio:311: I/O error 10 writing to inode 68944133 (offset 0 size 0 starting block 14286)
[ 318.506233] sd 0:2:0:0: [sda] tag#914 CDB: Write(10) 2a 00 69 80 9d 00 00 02 00 00
[ 318.506237] blk_update_request: I/O error, dev sda, sector 1770036480 op 0x1:(WRITE) flags 0x4800 phys_seg 4 prio class 0
[ 318.506244] sd 0:2:0:0: [sda] tag#915 FAILED Result: hostbyte=
[ 318.506247] sd 0:2:0:0: [sda] tag#915 CDB: Write(10) 2a 00 69 80 9f 00 00 02 00 00
[ 318.506251] blk_update_request: I/O error, dev sda, sector 1770036992 op 0x1:(WRITE) flags 0x4800 phys_seg 4 prio class 0
[ 318.506257] sd 0:2:0:0: [sda] tag#916 FAILED Result: hostbyte=
[ 318.506261] sd 0:2:0:0: [sda] tag#916 CDB: Write(10) 2a 00 69 80 a1 00 00 02 00 00
[ 318.506265] blk_update_request: I/O error, dev sda, sector 1770037504 op 0x1:(WRITE) flags 0x4800 phys_seg 4 prio class 0
[ 318.506278] blk_update_request: I/O error, dev sda, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
[ 318.506280] sd 0:2:0:0: [sda] tag#917 FAILED Result: hostbyte=
[ 318.506284] sd 0:2:0:0: [sda] tag#917 CDB: Write(10) 2a 00 69 80 a3 00 00 02 00 00
[ 318.506287] blk_update_request: I/O error, dev sda, sector 1770038016 op 0x1:(WRITE) flags 0x4800 phys_seg 4 prio class 0
[ 318.506294] sd 0:2:0:0: [sda] tag#918 FAILED Result: hostbyte=
[ 318.506298] sd 0:2:0:0: [sda] tag#918 CDB: Write(10) 2a 00 69 80 a5 00 00 02 00 00
[ 318.506301] blk_update_request: I/O error, dev sda, sector 1770038528 op 0x1:(WRITE) flags 0x4800 phys_seg 4 prio class 0
[ 318.506307] sd 0:2:0:0: [sda] tag#919 FAILED Result: hostbyte=
[ 318.506311] sd 0:2:0:0: [sda] tag#919 CDB: Write(10) 2a 00 69 80 a7 00 00 02 00 00
[ 318.506314] blk_update_request: I/O error, dev sda, sector 1770039040 op 0x1:(WRITE) flags 0x800 phys_seg 4 prio class 0
[ 318.506320] Buffer I/O error on device sda2, logical block 11950
[ 318.506639] Buffer I/O error on device sda2, logical block 11951
[ 318.506664] Buffer I/O error on device sda2, logical block 11952
[ 318.506701] Buffer I/O error on device sda2, logical block 11953
[ 318.506738] Buffer I/O error on device sda2, logical block 11954
[ 318.506773] Buffer I/O error on device sda2, logical block 11955
[ 318.506818] Buffer I/O error on device sda2, logical block 11956
[ 318.506854] Buffer I/O error on device sda2, logical block 11957
[ 318.506882] PCI 0003:01:00.0#01fd: EEH: aacraid driver reports: 'need reset'
[ 318.506885] EEH: Finished:
[ 318.506890] Buffer I/O error on device sda2, logical block 11958
[ 318.506896] Buffer I/O error on device sda2, logical block 11959
[ 318.506928] EEH: Collect temporary log
[ 318.506966] EEH: of node=0003:01:00.0
[ 318.506968] EXT4-fs warning (device sda2): ext4_end_bio:311: I/O error 10 writing to inode 68944133 (offset 0 size 0 starting block 14334)
[ 318.506971] EEH: PCI device/vendor: 028d9005
[ 318.506974] EEH: PCI cmd/status register: 00100146
[ 318.506975] EEH: PCI-E capabilities and status follow:
[ 318.506980] EXT4-fs warning (device sda2): ext4_end_bio:311: I/O error 10 writing to inode 68944133 (offset 0 size 0 starting block 14558)
[ 318.506984] EEH: PCI-E 00: 00020010 000081a2 00002950 00437083
[ 318.506989] EXT4-fs warning (device sda2): ext4_end_bio:311: I/O error 10 writing to inode 68944133 (offset 0 size 0 starting block 14766)
[ 318.506993] EEH: PCI-E 10: 10830000 00000000 00000000 00000000
[ 318.506994] EEH: PCI-E 20: 00000000
[ 318.506995] EEH: PCI-E AER capability register set follows:
[ 318.507000] EXT4-fs warning (device sda2): ext4_end_bio:311: I/O error 10 writing to inode 68944133 (offset 0 size 0 starting block 275839745)
[ 318.507004] EEH: PCI-E AER 00: 30020001 00000000 00400000 00462030
[ 318.507009] EXT4-fs warning (device sda2): ext4_end_bio:311: I/O error 10 writing to inode 68944133 (offset 0 size 0 starting block 13854)
[ 318.507012] EEH: PCI-E AER 10: 00000000 0000e000 000001e0 00000000
[ 318.507018] EXT4-fs warning (device sda2): ext4_end_bio:311: I/O error 10 writing to inode 68944133 (offset 0 size 0 starting block 14862)
[ 318.507019] EEH: PCI-E AER 20: 00000000 00000000 00000000 00000000
[ 318.507022] EEH: PCI-E AER 30: 00000000 00000000
[ 318.507025] PHB4 PHB#3 Diag-data (Version: 1)
[ 318.507026] brdgCtl: 00000002
[ 318.507027] RootSts: 00000040 00402000 a0830008 00100107 00002000
[ 318.507028] PhbSts: 0000001c00000000 0000001c00000000
[ 318.507029] Lem: 0000000100280000 0000000000000000 0000000100000000
[ 318.507031] PhbErr: 0000088000000000 0000008000000000 2148000098000240 a008400000000000
[ 318.507032] RxeArbErr: 4000200000000000 0000200000000000 00009fde30000000 0000000000000000
[ 318.507033] PblErr: 0000000001000000 0000000001000000 0000000000000000 0000000000000000
[ 318.507035] RegbErr: 0000004000000000 0000004000000000 8800003c00000000 0000000007011000
[ 318.507038] EXT4-fs warning (device sda2): ext4_end_bio:311: I/O error 10 writing to inode 68944133 (offset 0 size 0 starting block 14894)
[ 318.507045] PE[1fd] A/B: a440002a01000000 8000000000000000
[ 318.507047] EXT4-fs warning (device sda2): ext4_end_bio:311: I/O error 10 writing to inode 68944133 (offset 0 size 0 starting block 14942)
[ 318.507052] EEH: Reset without hotplug activity
[ 318.507056] EXT4-fs warning (device sda2): ext4_end_bio:311: I/O error 10 writing to inode 68944133 (offset 0 size 0 starting block 14398)
[ 321.290107] i40e 0002:01:00.0 enP2p1s0f0: NIC Link is Up, 1000 Mbps Full Duplex, Flow Control: None
[ 324.717574] aacraid 0003:01:00.0: not ready 1023ms after bus reset; waiting
[ 325.773607] aacraid 0003:01:00.0: not ready 2047ms after bus reset; waiting
[ 327.917677] aacraid 0003:01:00.0: not ready 4095ms after bus reset; waiting
[ 332.269835] aacraid 0003:01:00.0: not ready 8191ms after bus reset; waiting
[ 340.718133] aacraid 0003:01:00.0: not ready 16383ms after bus reset; waiting
[ 357.358654] aacraid 0003:01:00.0: not ready 32767ms after bus reset; waiting
[ 362.872870] ast 0004:02:00.0: Going to break: [mem 0x600c200000000
[ 362.872939] ast 0004:02:00.0: eeh_dev_
[ 362.886464] xhci_hcd 0005:01:00.0: Going to break: [mem 0x600c280000000
[ 362.899022] xhci_hcd 0005:01:00.0: eeh_dev_
[ 392.175709] aacraid 0003:01:00.0: not ready 65535ms after bus reset; giving up
[ 395.279850] aacraid 0003:01:00.0: enabling device (0140 -> 0142)
[ 395.296599] EEH: Beginning: 'slot_reset'
[ 395.296604] PCI 0003:01:00.0#01fd: EEH: Invoking aacraid-
[ 395.296606] aacraid 0003:01:00.0: aacraid: PCI error - slot reset
[ 395.296669] PCI 0003:01:00.0#01fd: EEH: aacraid driver reports: 'recovered'
[ 395.296670] EEH: Finished:
[ 395.296671] EEH: Notify device driver to resume
[ 395.296672] EEH: Beginning: 'resume'
[ 395.296673] PCI 0003:01:00.0#01fd: EEH: Invoking aacraid->resume()
[ 427.457355] aacraid 0003:01:00.0: aacraid: PCI error - resume
[ 427.457403] PCI 0003:01:00.0#01fd: EEH: aacraid driver reports: 'none'
[ 427.457404] EEH: Finished:'resume'
[ 427.457405] EEH: Recovery successful.
[ 427.457408] EEH: Recovering PHB#5-PE#fd
[ 427.457427] EEH: PE location: Onboard USB, PHB location: N/A
[ 427.457444] EEH: Frozen PHB#5-PE#fd detected
[ 427.457451] EEH: Call Trace:
[ 427.457461] EEH: [c00000000004cb6c] __eeh_send_
[ 427.457479] EEH: [c000000000045824] eeh_dev_
[ 427.457499] EEH: [c000000000045d30] eeh_dev_
[ 427.457517] EEH: [c00000000048cc68] __vfs_write+
[ 427.457534] EEH: [c00000000048e00c] vfs_write+
[ 427.457551] EEH: [c000000000490c94] ksys_write+
[ 427.457568] EEH: [c00000000000b378] system_
[ 427.457584] EEH: This PCI device has failed 1 times in the last hour and will be permanently disabled after 5000 failures.
[ 427.457584] EEH: Notify device drivers to shutdown
[ 427.457585] EEH: Beginning: 'error_detected(IO frozen)'
[ 427.457586] PCI 0005:01:00.0#00fd: EEH: driver not EEH aware
[ 427.457587] EEH: Finished:
[ 427.457589] EEH: Collect temporary log
[ 427.457609] EEH: of node=0005:01:00.0
[ 427.457611] EEH: PCI device/vendor: 8241104c
[ 427.457613] EEH: PCI cmd/status register: 00100546
[ 427.457613] EEH: PCI-E capabilities and status follow:
[ 427.457622] EEH: PCI-E 00: 0002c010 00008fc3 00002050 00075c12
[ 427.457628] EEH: PCI-E 10: 10120000 00000000 00000000 00000000
[ 427.457629] EEH: PCI-E 20: 00000000
[ 427.457630] EEH: PCI-E AER capability register set follows:
[ 427.457637] EEH: PCI-E AER 00: 15020001 00000000 00000000 00462030
[ 427.457644] EEH: PCI-E AER 10: 00000000 00002000 000001e0 00000000
[ 427.457650] EEH: PCI-E AER 20: 00000000 00000000 00000000 00000000
[ 427.457652] EEH: PCI-E AER 30: 00000000 00000000
[ 427.457653] PHB4 PHB#5 Diag-data (Version: 1)
[ 427.457654] brdgCtl: 00000002
[ 427.457655] RootSts: 00000040 00402000 a0120008 00100107 00002000
[ 427.457656] PhbSts: 0000001c00000000 0000001c00000000
[ 427.457657] Lem: 0000000100280000 0000000000000000 0000000100000000
[ 427.457657] PhbErr: 0000088000000000 0000008000000000 2148000098000240 a008400000000000
[ 427.457658] RxeArbErr: 4000200000000000 0000200000000000 00009fde30000000 0000000000000000
[ 427.457659] PblErr: 0000000001000000 0000000001000000 0000000000000000 0000000000000000
[ 427.457660] RegbErr: 0040004000000000 0000004000000000 8800003c00000000 0000000007011000
[ 427.457662] PE[0fd] A/B: a440002a01000000 8000000000000000
[ 427.457663] EEH: Reset with hotplug activity
[ 427.457674] xhci_hcd 0005:01:00.0: remove, state 4
[ 427.457678] usb usb2: USB disconnect, device number 1
[ 427.458554] xhci_hcd 0005:01:00.0: USB bus 2 deregistered
[ 427.458563] xhci_hcd 0005:01:00.0: xHCI host controller not responding, assume dead
[ 427.458588] xhci_hcd 0005:01:00.0: remove, state 1
[ 427.458591] usb usb1: USB disconnect, device number 1
[ 427.458592] usb 1-3: USB disconnect, device number 2
[ 427.458594] usb 1-3.1: USB disconnect, device number 4
[ 427.586371] usb 1-4: USB disconnect, device number 3
[ 427.587217] xhci_hcd 0005:01:00.0: Host halt failed, -19
[ 427.587220] xhci_hcd 0005:01:00.0: Host not accessible, reset failed.
[ 427.587619] xhci_hcd 0005:01:00.0: USB bus 1 deregistered
[ 427.587663] pci 0005:01:00.0: Removing from iommu group 3
[ 432.509515] EEH: Sleep 5s ahead of complete hotplug
[ 437.745472] pci 0005:01:00.0: [104c:8241] type 00 class 0x0c0330
[ 437.745505] pci 0005:01:00.0: reg 0x10: [mem 0x600c280000000
[ 437.745519] pci 0005:01:00.0: reg 0x18: [mem 0x600c280010000
[ 437.745558] pci 0005:01:00.0: BAR2 [mem size 0x00002000 64bit]: requesting alignment to 0x10000
[ 437.745626] pci 0005:01:00.0: supports D1 D2
[ 437.745628] pci 0005:01:00.0: PME# supported from D0 D1 D2 D3hot
[ 437.745774] pci 0005:01:00.0: can't claim BAR 2 [mem size 0x00002000 64bit]: no address assigned
[ 437.745782] pci 0005:01:00.0: BAR 0: assigned [mem 0x600c280000000
[ 437.745794] pci 0005:01:00.0: BAR 2: assigned [mem 0x600c280010000
[ 437.745809] pci 0005:01 : [PE# fd] Secondary bus 0x0000000000000001 associated with PE#fd
[ 437.745891] pci 0005:01 : [PE# fd] Setting up 32-bit TCE table at 0..80000000
[ 437.749313] pci 0005:01 : [PE# fd] Setting up window#0 0..7ffffffffff pg=10000
[ 437.749320] pci 0005:01 : [PE# fd] Enabling 64-bit DMA bypass
[ 437.749325] pci 0005:00:00.0: PCI bridge to [bus 01]
[ 437.749331] pci 0005:00:00.0: bridge window [mem 0x600c280000000
[ 437.749423] pci 0005:01:00.0: enabling device (0140 -> 0142)
[ 437.750025] xhci_hcd 0005:01:00.0: xHCI Host Controller
[ 437.750036] xhci_hcd 0005:01:00.0: new USB bus registered, assigned bus number 1
[ 437.750185] xhci_hcd 0005:01:00.0: hcc params 0x0270f06d hci version 0x96 quirks 0x0000000004000000
[ 437.750705] xhci_hcd 0005:01:00.0: xHCI Host Controller
[ 437.750710] xhci_hcd 0005:01:00.0: new USB bus registered, assigned bus number 2
[ 437.750714] xhci_hcd 0005:01:00.0: Host supports USB 3.0 SuperSpeed
[ 437.750768] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 5.04
[ 437.750771] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 437.750773] usb usb1: Product: xHCI Host Controller
[ 437.750775] usb usb1: Manufacturer: Linux 5.4.0-156-generic xhci-hcd
[ 437.750776] usb usb1: SerialNumber: 0005:01:00.0
[ 437.751507] hub 1-0:1.0: USB hub found
[ 437.751522] hub 1-0:1.0: 4 ports detected
[ 437.751770] usb usb2: We don't know the algorithms for LPM for this host, disabling LPM.
[ 437.751805] usb usb2: New USB device found, idVendor=1d6b, idProduct=0003, bcdDevice= 5.04
[ 437.751807] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 437.751809] usb usb2: Product: xHCI Host Controller
[ 437.751811] usb usb2: Manufacturer: Linux 5.4.0-156-generic xhci-hcd
[ 437.751812] usb usb2: SerialNumber: 0005:01:00.0
[ 437.752312] hub 2-0:1.0: USB hub found
[ 437.752324] hub 2-0:1.0: 4 ports detected
[ 437.752586] EEH: Notify device driver to resume
[ 437.752588] EEH: Beginning: 'resume'
[ 437.752590] PCI 0005:01:00.0#00fd: EEH: driver not EEH aware
[ 437.752592] EEH: Finished:'resume'
[ 437.752594] EEH: Recovery successful.
[ 438.004898] usb 1-3: new high-speed USB device number 2 using xhci_hcd
[ 438.155653] usb 1-3: New USB device found, idVendor=0557, idProduct=7000, bcdDevice= 0.00
[ 438.155656] usb 1-3: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[ 438.168473] hub 1-3:1.0: USB hub found
[ 438.169159] hub 1-3:1.0: 4 ports detected
[ 438.300909] usb 1-4: new high-speed USB device number 3 using xhci_hcd
[ 438.453070] usb 1-4: New USB device found, idVendor=0451, idProduct=80ff, bcdDevice= 1.01
[ 438.453073] usb 1-4: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[ 438.533380] usb 1-3.1: new low-speed USB device number 4 using xhci_hcd
[ 438.651162] usb 1-3.1: New USB device found, idVendor=0557, idProduct=2419, bcdDevice= 1.00
[ 438.651165] usb 1-3.1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[ 438.679538] input: HID 0557:2419 as /devices/
[ 438.738845] hid-generic 0003:0557:
[ 438.744869] input: HID 0557:2419 as /devices/
[ 438.746338] hid-generic 0003:0557:
When this happens, the system will be read-only and thus cause test interruption.
Here is the dmesg output with -155.
It seems that those EEH related failures can be found with -155 as well. But I am not sure why the test didn't fail.
I noticed that on the second attempt of 5.4.0-155-generic with 5.4.0-155-generic source, test has passed but the system still dropped into a read-only state.