2017-06-07 14:50:00 |
bugproxy |
bug |
|
|
added bug |
2017-06-07 14:50:03 |
bugproxy |
tags |
|
architecture-ppc64le bugnameltc-155269 severity-high targetmilestone-inin16043 |
|
2017-06-07 14:50:04 |
bugproxy |
attachment added |
|
Attach multipath log file that has several failed paths with no redundancy. https://bugs.launchpad.net/bugs/1696445/+attachment/4891256/+files/multipath.log.05312017 |
|
2017-06-07 14:50:52 |
bugproxy |
attachment added |
|
attach kern.log file https://bugs.launchpad.net/bugs/1696445/+attachment/4891257/+files/kern.log |
|
2017-06-07 14:51:24 |
bugproxy |
attachment added |
|
attach syslog file https://bugs.launchpad.net/bugs/1696445/+attachment/4891258/+files/syslog |
|
2017-06-07 14:51:27 |
bugproxy |
attachment added |
|
Attach dmesg log https://bugs.launchpad.net/bugs/1696445/+attachment/4891259/+files/dmesg |
|
2017-06-07 14:51:33 |
bugproxy |
attachment added |
|
Attach debug log file https://bugs.launchpad.net/bugs/1696445/+attachment/4891260/+files/multipath.v3.out |
|
2017-06-07 14:51:36 |
bugproxy |
attachment added |
|
Attach debug log file https://bugs.launchpad.net/bugs/1696445/+attachment/4891261/+files/multipath.v3.ll.out |
|
2017-06-07 14:51:41 |
bugproxy |
attachment added |
|
Attach debug log file https://bugs.launchpad.net/bugs/1696445/+attachment/4891262/+files/journalctl.multipathd.out |
|
2017-06-07 14:51:44 |
bugproxy |
attachment added |
|
attach dmesg.out https://bugs.launchpad.net/bugs/1696445/+attachment/4891263/+files/dmesg.out |
|
2017-06-07 14:51:48 |
bugproxy |
ubuntu: assignee |
|
Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) |
|
2017-06-07 14:51:51 |
bugproxy |
affects |
ubuntu |
linux (Ubuntu) |
|
2017-06-07 16:40:05 |
Mauricio Faria de Oliveira |
description |
==== State: Open by: nguyenp on 31 May 2017 15:46:14 ====
Product Name : OpenPOWER Firmware
Product Version : open-power-SMC-P8DTU-V2.00.GA2-20170126-prod
Product Extra : op-build-3782262
Product Extra : hostboot-7fdfb37
Product Extra : occ-e6e194f
Product Extra : skiboot-5.4.2
Product Extra : linux-4.4.24-openpower1-9641b3a
Product Extra : petitboot-v1.4.0-2f8598b
Product Extra : p8dtu-xml-9a8fee2
Cable configuration:
====================
On this P8-Briggs system, I have 2 Seagate storage enclosures running at maximum configuration. Each enclosure holds 84 HDDs, for a total of 168 HDDs across both enclosures.
I connected 2 LSI 9300-8e SAS adapters to the 2 Seagate enclosures with alternate cabling for redundancy. See the cable connection diagram below.
Note: Each Seagate enclosure has 2 I/O modules in the back.
Both I/O modules of each enclosure see the same set of HDDs.
Cable connection:
SAS adapter #1: port1 -----> Seagate #1-A I/O module
                port0 -----> Seagate #2-B I/O module
SAS adapter #2: port1 -----> Seagate #2-A I/O module
                port0 -----> Seagate #1-B I/O module
Ubuntu 16.04.2:
===============
- Running with new kernel Ubuntu 4.8.0-520-generic #550~16.04.1+bz154734 from Mauricio Faria De Oliveira.
Problem Description:
====================
On this Briggs system, I'm running the new Ubuntu kernel 4.8.0-520-generic #550~16.04.1+bz154734, which has the fix for the multipath problem. Mauricio helped patch the system with this kernel last week to fix the multipath io_setup failure in LTCBug154734.
This week, I scaled my test configuration up to the maximum configuration, 2x5U84_Enclosures,_MaxCfg_168HDDs. This time, it hit a different issue: some multipath devices have only a single path and no redundancy, while others have multiple paths and redundancy.
== Comment: #13 - Paul Nguyen - 2017-06-01 15:19:58 ==
- I agree with Mauricio that this is a timing problem.
- I re-ran the test and noticed that it took more than 50 minutes after system reboot to discover all disks and build the multipath devices correctly.
- A discovery time this long is going to be a problem.
- I have gathered all logs and am attaching them to the bug for Mauricio to review and confirm.
- If there is a workaround or a fix for faster probe time, I will try it out.
- Below is more information I captured:
Checkpoint #1:
==============
- system reboot around 2pm (14:00)
Checkpoint #2:
==============
- It took several minutes for the first disk to be detected.
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | head
[Thu Jun 1 14:06:48 2017] sd 17:0:1:0: [sdb] Attached SCSI disk
[Thu Jun 1 14:06:51 2017] sd 17:0:2:0: [sdc] Attached SCSI disk
[Thu Jun 1 14:06:53 2017] sd 17:0:3:0: [sdd] Attached SCSI disk
[Thu Jun 1 14:06:57 2017] sd 17:0:4:0: [sde] Attached SCSI disk
[Thu Jun 1 14:07:00 2017] sd 17:0:5:0: [sdf] Attached SCSI disk
[Thu Jun 1 14:07:03 2017] sd 17:0:6:0: [sdg] Attached SCSI disk
[Thu Jun 1 14:07:05 2017] sd 17:0:7:0: [sdh] Attached SCSI disk
[Thu Jun 1 14:07:08 2017] sd 17:0:8:0: [sdi] Attached SCSI disk
[Thu Jun 1 14:07:11 2017] sd 17:0:9:0: [sdj] Attached SCSI disk
[Thu Jun 1 14:07:14 2017] sd 17:0:10:0: [sdk] Attached SCSI disk
root@smb1p1:~#
...
root@smb1p1:~# multipath -ll|grep dm |wc -l
103
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:18:30 2017] sd 17:0:100:0: [sdcr] Attached SCSI disk
[Thu Jun 1 14:18:35 2017] sd 17:0:101:0: [sdcs] Attached SCSI disk
[Thu Jun 1 14:18:40 2017] sd 17:0:102:0: [sdct] Attached SCSI disk
[Thu Jun 1 14:18:44 2017] sd 17:0:103:0: [sdcu] Attached SCSI disk
[Thu Jun 1 14:18:54 2017] sd 17:0:105:0: [sdcv] Attached SCSI disk
[Thu Jun 1 14:18:59 2017] sd 17:0:106:0: [sdcw] Attached SCSI disk
[Thu Jun 1 14:19:04 2017] sd 17:0:107:0: [sdcx] Attached SCSI disk
[Thu Jun 1 14:19:09 2017] sd 17:0:108:0: [sdcy] Attached SCSI disk
[Thu Jun 1 14:19:14 2017] sd 17:0:109:0: [sdcz] Attached SCSI disk
[Thu Jun 1 14:19:19 2017] sd 17:0:110:0: [sdda] Attached SCSI disk
root@smb1p1:~#
...
root@smb1p1:~# multipath -ll|grep dm |wc -l
126
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:20:23 2017] sd 17:0:123:0: [sddn] Attached SCSI disk
[Thu Jun 1 14:20:28 2017] sd 17:0:124:0: [sddo] Attached SCSI disk
[Thu Jun 1 14:20:33 2017] sd 17:0:125:0: [sddp] Attached SCSI disk
[Thu Jun 1 14:20:38 2017] sd 17:0:126:0: [sddq] Attached SCSI disk
[Thu Jun 1 14:20:44 2017] sd 17:0:127:0: [sddr] Attached SCSI disk
[Thu Jun 1 14:20:48 2017] sd 17:0:128:0: [sdds] Attached SCSI disk
[Thu Jun 1 14:20:54 2017] sd 17:0:129:0: [sddt] Attached SCSI disk
[Thu Jun 1 14:20:59 2017] sd 17:0:130:0: [sddu] Attached SCSI disk
[Thu Jun 1 14:21:04 2017] sd 17:0:131:0: [sddv] Attached SCSI disk
[Thu Jun 1 14:21:09 2017] sd 17:0:132:0: [sddw] Attached SCSI disk
root@smb1p1:~#
...
root@smb1p1:~# multipath -ll|grep dm |wc -l
142
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:21:54 2017] sd 17:0:141:0: [sdee] Attached SCSI disk
[Thu Jun 1 14:21:58 2017] sd 17:0:142:0: [sdef] Attached SCSI disk
[Thu Jun 1 14:22:04 2017] sd 17:0:143:0: [sdeg] Attached SCSI disk
[Thu Jun 1 14:22:08 2017] sd 17:0:144:0: [sdeh] Attached SCSI disk
[Thu Jun 1 14:22:14 2017] sd 17:0:145:0: [sdei] Attached SCSI disk
[Thu Jun 1 14:22:18 2017] sd 17:0:146:0: [sdej] Attached SCSI disk
[Thu Jun 1 14:22:24 2017] sd 17:0:147:0: [sdek] Attached SCSI disk
[Thu Jun 1 14:22:29 2017] sd 17:0:148:0: [sdel] Attached SCSI disk
[Thu Jun 1 14:22:34 2017] sd 17:0:149:0: [sdem] Attached SCSI disk
[Thu Jun 1 14:22:39 2017] sd 17:0:150:0: [sden] Attached SCSI disk
root@smb1p1:~#
...
root@smb1p1:~# multipath -ll|grep dm |wc -l
163
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:23:48 2017] sd 17:0:164:0: [sdfa] Attached SCSI disk
[Thu Jun 1 14:23:53 2017] sd 17:0:165:0: [sdfb] Attached SCSI disk
[Thu Jun 1 14:23:58 2017] sd 17:0:166:0: [sdfc] Attached SCSI disk
[Thu Jun 1 14:24:03 2017] sd 17:0:167:0: [sdfd] Attached SCSI disk
[Thu Jun 1 14:24:08 2017] sd 17:0:168:0: [sdfe] Attached SCSI disk
[Thu Jun 1 14:24:13 2017] sd 17:0:169:0: [sdff] Attached SCSI disk
[Thu Jun 1 14:24:19 2017] sd 17:0:170:0: [sdfg] Attached SCSI disk
[Thu Jun 1 14:24:23 2017] sd 17:0:171:0: [sdfh] Attached SCSI disk
[Thu Jun 1 14:24:28 2017] sd 17:0:172:0: [sdfi] Attached SCSI disk
[Thu Jun 1 14:24:33 2017] sd 17:0:173:0: [sdfj] Attached SCSI disk
...
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:24:03 2017] sd 17:0:167:0: [sdfd] Attached SCSI disk
[Thu Jun 1 14:24:08 2017] sd 17:0:168:0: [sdfe] Attached SCSI disk
[Thu Jun 1 14:24:13 2017] sd 17:0:169:0: [sdff] Attached SCSI disk
[Thu Jun 1 14:24:19 2017] sd 17:0:170:0: [sdfg] Attached SCSI disk
[Thu Jun 1 14:24:23 2017] sd 17:0:171:0: [sdfh] Attached SCSI disk
[Thu Jun 1 14:24:28 2017] sd 17:0:172:0: [sdfi] Attached SCSI disk
[Thu Jun 1 14:24:33 2017] sd 17:0:173:0: [sdfj] Attached SCSI disk
[Thu Jun 1 14:24:38 2017] sd 17:0:174:0: [sdfk] Attached SCSI disk
[Thu Jun 1 14:24:43 2017] sd 17:0:175:0: [sdfl] Attached SCSI disk
[Thu Jun 1 14:24:48 2017] sd 17:0:176:0: [sdfm] Attached SCSI disk
root@smb1p1:~#
root@smb1p1:~# date
Thu Jun 1 14:27:03 CDT 2017
root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
168
root@smb1p1:~#
Checkpoint #3:
==============
- After 34 minutes, the multipath -ll command shows maps with only a single path and no redundancy.
root@smb1p1:~# multipath -ll > multipath.log.06012017.afterReboot
root@smb1p1:~# cat multipath.log.06012017.afterReboot |more
35000c50086a3ca97 dm-161 IBM-ESXS,ST10000NM0226 E
size=9.0T features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
`- 17:0:170:0 sdfg 130:32 active ready running
35000c50086bae8bf dm-144 IBM-ESXS,ST10000NM0226 E
size=9.0T features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
`- 17:0:152:0 sdep 129:16 active ready running
35000c50086baa42f dm-143 IBM-ESXS,ST10000NM0226 E
size=9.0T features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
`- 17:0:151:0 sdeo 129:0 active ready running
...
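The single-path maps listed above can also be picked out of a saved `multipath -ll` capture automatically. The following is a minimal Python sketch, not part of the original report: the function name `single_path_maps` and the regular expressions are assumptions based only on the output format shown in this bug, and the last map in the sample (dm-999) is hypothetical, added purely to show a map with two paths being excluded.

```python
import re

# Map header, e.g. "35000c50086a3ca97 dm-161 IBM-ESXS,ST10000NM0226 E"
# (an optional "(wwid)" token is tolerated for alias-style headers).
MAP_RE = re.compile(r"^(\S+)\s+(?:\(\S+\)\s+)?(dm-\d+)\s")
# Path line, e.g. "`- 17:0:170:0 sdfg 130:32 active ready running"
PATH_RE = re.compile(r"(\d+:\d+:\d+:\d+)\s+(sd[a-z]+)\s")


def single_path_maps(text):
    """Return the sorted dm names of maps that have exactly one path."""
    paths = {}
    current = None
    for line in text.splitlines():
        header = MAP_RE.match(line)
        if header:
            current = header.group(2)
            paths[current] = []
            continue
        path = PATH_RE.search(line)
        if path and current is not None:
            paths[current].append(path.group(2))
    return sorted(dm for dm, devs in paths.items() if len(devs) == 1)


SAMPLE = """\
35000c50086a3ca97 dm-161 IBM-ESXS,ST10000NM0226 E
size=9.0T features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
`- 17:0:170:0 sdfg 130:32 active ready running
35000c50086bae8bf dm-144 IBM-ESXS,ST10000NM0226 E
size=9.0T features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
`- 17:0:152:0 sdep 129:16 active ready running
35000c50086baa42f dm-143 IBM-ESXS,ST10000NM0226 E
size=9.0T features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
`- 17:0:151:0 sdeo 129:0 active ready running
hypothetical-wwid dm-999 IBM-ESXS,ST10000NM0226 E
size=9.0T features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
|- 17:0:1:0 sdb 8:16 active ready running
`- 18:0:1:0 sdzz 8:32 active ready running
"""

print(single_path_maps(SAMPLE))  # ['dm-143', 'dm-144', 'dm-161']
```

On a live system the input would be the contents of a file such as multipath.log.06012017.afterReboot; an empty result would suggest every map has redundancy, under the assumption that the output format matches the one shown here.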
Checkpoint #4:
==============
- After 43 minutes, the multipath -ll command shows some maps with only a single path and no redundancy, and some maps with multiple paths and redundancy.
root@smb1p1:~# date
Thu Jun 1 14:43:00 CDT 2017
root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
252
root@smb1p1:~#
Checkpoint #5:
==============
- After 47 minutes, the multipath -ll command still shows some maps with only a single path and no redundancy.
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | head
[Thu Jun 1 14:06:48 2017] sd 17:0:1:0: [sdb] Attached SCSI disk
[Thu Jun 1 14:06:51 2017] sd 17:0:2:0: [sdc] Attached SCSI disk
[Thu Jun 1 14:06:53 2017] sd 17:0:3:0: [sdd] Attached SCSI disk
[Thu Jun 1 14:06:57 2017] sd 17:0:4:0: [sde] Attached SCSI disk
[Thu Jun 1 14:07:00 2017] sd 17:0:5:0: [sdf] Attached SCSI disk
[Thu Jun 1 14:07:03 2017] sd 17:0:6:0: [sdg] Attached SCSI disk
[Thu Jun 1 14:07:05 2017] sd 17:0:7:0: [sdh] Attached SCSI disk
[Thu Jun 1 14:07:08 2017] sd 17:0:8:0: [sdi] Attached SCSI disk
[Thu Jun 1 14:07:11 2017] sd 17:0:9:0: [sdj] Attached SCSI disk
[Thu Jun 1 14:07:14 2017] sd 17:0:10:0: [sdk] Attached SCSI disk
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:46:15 2017] sd 18:0:112:0: [sdjo] Attached SCSI disk
[Thu Jun 1 14:46:20 2017] sd 18:0:113:0: [sdjp] Attached SCSI disk
[Thu Jun 1 14:46:25 2017] sd 18:0:114:0: [sdjq] Attached SCSI disk
[Thu Jun 1 14:46:31 2017] sd 18:0:115:0: [sdjr] Attached SCSI disk
[Thu Jun 1 14:46:36 2017] sd 18:0:116:0: [sdjs] Attached SCSI disk
[Thu Jun 1 14:46:41 2017] sd 18:0:117:0: [sdjt] Attached SCSI disk
[Thu Jun 1 14:46:46 2017] sd 18:0:118:0: [sdju] Attached SCSI disk
[Thu Jun 1 14:46:51 2017] sd 18:0:119:0: [sdjv] Attached SCSI disk
[Thu Jun 1 14:46:56 2017] sd 18:0:120:0: [sdjw] Attached SCSI disk
[Thu Jun 1 14:47:01 2017] sd 18:0:121:0: [sdjx] Attached SCSI disk
root@smb1p1:~#
root@smb1p1:~#
root@smb1p1:~# date
Thu Jun 1 14:47:20 CDT 2017
root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
288
root@smb1p1:~#
Checkpoint #6:
==============
- 51 minutes after system reboot, it looks like all disks are discovered and the multipath devices are correctly built.
root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
336
root@smb1p1:~# date
Thu Jun 1 14:52:05 CDT 2017
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:50:47 2017] sd 18:0:167:0: [sdlp] Attached SCSI disk
[Thu Jun 1 14:50:52 2017] sd 18:0:168:0: [sdlq] Attached SCSI disk
[Thu Jun 1 14:50:57 2017] sd 18:0:169:0: [sdlr] Attached SCSI disk
[Thu Jun 1 14:51:02 2017] sd 18:0:170:0: [sdls] Attached SCSI disk
[Thu Jun 1 14:51:07 2017] sd 18:0:171:0: [sdlt] Attached SCSI disk
[Thu Jun 1 14:51:13 2017] sd 18:0:172:0: [sdlu] Attached SCSI disk
[Thu Jun 1 14:51:17 2017] sd 18:0:173:0: [sdlv] Attached SCSI disk
[Thu Jun 1 14:51:22 2017] sd 18:0:174:0: [sdlw] Attached SCSI disk
[Thu Jun 1 14:51:27 2017] sd 18:0:175:0: [sdlx] Attached SCSI disk
[Thu Jun 1 14:51:33 2017] sd 18:0:176:0: [sdly] Attached SCSI disk
root@smb1p1:~#
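The path counts reported at the checkpoints above (168, 252, 288, 336) can be turned into a rough count of redundant maps. This is a small Python sketch under two stated assumptions that the report implies but does not prove: each of the 168 disks already has at least one path from 14:27 onward, and no disk ever has more than 2 paths. The helper name `redundant_maps` is illustrative only.

```python
DISKS = 168          # 2 enclosures x 84 HDDs, per the cable configuration
PATHS_PER_DISK = 2   # one path through each SAS adapter

# (wall-clock time, output of: multipath -ll | grep -c 'sd[a-z]\+')
CHECKPOINTS = [
    ("14:27:03", 168),
    ("14:43:00", 252),
    ("14:47:20", 288),
    ("14:52:05", 336),
]


def redundant_maps(path_count, disks=DISKS):
    """Estimate how many maps have two paths.

    Assumes every disk has at least one path and at most two, so each
    path beyond the first `disks` paths completes one redundant map.
    """
    return max(0, min(disks, path_count - disks))


for when, count in CHECKPOINTS:
    done = redundant_maps(count)
    print(f"{when}: {count}/{DISKS * PATHS_PER_DISK} paths, "
          f"{done}/{DISKS} maps redundant")
```

Under those assumptions, half of the maps (84 of 168) were still without redundancy at 14:43, about 43 minutes after the reboot.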
== Comment: #24 - Mauricio Faria De Oliveira - 2017-06-06 11:42:59 ==
Hi Paul,
Per your logs, yes, it's the slowness with the SES driver.
I'll ask Canonical to pick it up for 16.10 and 17.04 so it makes it into 16.04.2 and 16.04.3.
Thanks,
Mauricio
== Comment: #26 - Mauricio Faria De Oliveira <mauricfo@br.ibm.com> - 2017-06-06 12:06:32 ==
The patch applies cleanly in the master-next branch of ubuntu-zesty.git and ubuntu-yakkety.git.
Mirroring to Canonical to get a LP bug number, required in the submission process.
== Comment: #27 - Mauricio Faria De Oliveira <mauricfo@br.ibm.com> - 2017-06-06 12:07:58 ==
The commit is [1].
commit 75106523f39751390b5789b36ee1d213b3af1945
Author: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Date: Wed Apr 5 12:18:19 2017 -0300
scsi: ses: don't get power status of SES device slot on probe
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=75106523f39751390b5789b36ee1d213b3af1945 |
[Impact]
* The SES driver causes a long delay in disk discovery when
a large number of disks is present in the disk enclosure,
which increases with the number of disks attached.
* This delays the addition and visibility of the disk devices
to userspace, which among other things causes multipath not
to have multiple paths, actually, until the disk discovery
eventually/finally finishes.
* The fix significantly shortens the time taken by the SES
driver to handle disk discovery, causing no extra delays,
by removing a superfluous SCSI command sent to enclosure.
[Test Case]
* Load the module to access the enclosure and its disks; e.g.,
$ sudo modprobe mpt3sas
* Notice the interval between the discovery of each disk; e.g., dmesg
$ dmesg -T | grep 'Attached SCSI disk' | tail -n2
[Thu Jun 1 14:18:30 2017] sd 17:0:100:0: [sdcr] Attached SCSI disk
[Thu Jun 1 14:18:35 2017] sd 17:0:101:0: [sdcs] Attached SCSI disk
* With the fix, consecutive disks should be discovered within about the same second of each other.
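The interval check in the test case can be scripted. Below is a minimal Python sketch, not part of the original test case: it parses lines in the `dmesg -T` format shown above and reports the seconds elapsed between consecutive "Attached SCSI disk" messages. The function name `attach_intervals` and the regular expression are assumptions based on the log lines in this bug, and the timestamp parsing assumes an English (C) locale.

```python
import re
from datetime import datetime

# Matches lines such as:
# [Thu Jun 1 14:18:30 2017] sd 17:0:100:0: [sdcr] Attached SCSI disk
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] sd (?P<hctl>[\d:]+): "
    r"\[(?P<dev>sd[a-z]+)\] Attached SCSI disk"
)


def attach_intervals(lines):
    """Return (device, seconds since the previous attach) pairs."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        # dmesg -T may pad single-digit days, so collapse whitespace first.
        stamp = " ".join(match.group("ts").split())
        when = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
        events.append((match.group("dev"), when))
    return [
        (dev, (when - prev_when).total_seconds())
        for (_, prev_when), (dev, when) in zip(events, events[1:])
    ]


SAMPLE = [
    "[Thu Jun 1 14:18:30 2017] sd 17:0:100:0: [sdcr] Attached SCSI disk",
    "[Thu Jun 1 14:18:35 2017] sd 17:0:101:0: [sdcs] Attached SCSI disk",
]

print(attach_intervals(SAMPLE))  # [('sdcs', 5.0)]
```

The sample lines are the two from the test case, taken on the kernel without the fix, and they are 5 seconds apart. With the fix, the expectation stated above is that these intervals drop to roughly zero or one second.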
[Regression Potential]
* The power status of the disks in the enclosure is no longer
checked during probe time. However, the patch demonstrates that
the initial value was never used in any way, so there is little
regression potential.
* Nonetheless, users of SES enclosures which verify the power status
of disks in the enclosure might _theoretically_ see a problem, iff
the fix has a problem (which has not been found yet).
[Other Info]
* None at this time.
==== State: Open by: nguyenp on 31 May 2017 15:46:14 ====
Product Name : OpenPOWER Firmware
Product Version : open-power-SMC-P8DTU-V2.00.GA2-20170126-prod
Product Extra : op-build-3782262
Product Extra : hostboot-7fdfb37
Product Extra : occ-e6e194f
Product Extra : skiboot-5.4.2
Product Extra : linux-4.4.24-openpower1-9641b3a
Product Extra : petitboot-v1.4.0-2f8598b
Product Extra : p8dtu-xml-9a8fee2
Cable configuration:
====================
On this P8-Briggs system, I have 2 Seagate storage enclosures running at maximum configuration. Each enclosure holds 84 HDDs, for a total of 168 HDDs across both enclosures.
I connected 2 LSI 9300-8e SAS adapters to the 2 Seagate enclosures with alternate cabling for redundancy. See the cable connection diagram below.
Note: Each Seagate enclosure has 2 I/O modules in the back.
Both I/O modules of each enclosure see the same set of HDDs.
Cable connection:
SAS adapter #1: port1 -----> Seagate #1-A I/O module
                port0 -----> Seagate #2-B I/O module
SAS adapter #2: port1 -----> Seagate #2-A I/O module
                port0 -----> Seagate #1-B I/O module
Ubuntu 16.04.2:
===============
- Running with new kernel Ubuntu 4.8.0-520-generic #550~16.04.1+bz154734 from Mauricio Faria De Oliveira.
Problem Description:
====================
On this Briggs system, I'm running the new Ubuntu kernel 4.8.0-520-generic #550~16.04.1+bz154734, which has the fix for the multipath problem. Mauricio helped patch the system with this kernel last week to fix the multipath io_setup failure in LTCBug154734.
This week, I scaled my test configuration up to the maximum configuration, 2x5U84_Enclosures,_MaxCfg_168HDDs. This time, it hit a different issue: some multipath devices have only a single path and no redundancy, while others have multiple paths and redundancy.
== Comment: #13 - Paul Nguyen - 2017-06-01 15:19:58 ==
- I agree with Mauricio that this is a timing problem.
- I re-ran the test and noticed that it took more than 50 minutes after system reboot to discover all disks and build the multipath devices correctly.
- A discovery time this long is going to be a problem.
- I have gathered all logs and am attaching them to the bug for Mauricio to review and confirm.
- If there is a workaround or a fix for faster probe time, I will try it out.
- Below is more information I captured:
Checkpoint #1:
==============
- system reboot around 2pm (14:00)
Checkpoint #2:
==============
- It took several minutes for the first disk to be detected.
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | head
[Thu Jun 1 14:06:48 2017] sd 17:0:1:0: [sdb] Attached SCSI disk
[Thu Jun 1 14:06:51 2017] sd 17:0:2:0: [sdc] Attached SCSI disk
[Thu Jun 1 14:06:53 2017] sd 17:0:3:0: [sdd] Attached SCSI disk
[Thu Jun 1 14:06:57 2017] sd 17:0:4:0: [sde] Attached SCSI disk
[Thu Jun 1 14:07:00 2017] sd 17:0:5:0: [sdf] Attached SCSI disk
[Thu Jun 1 14:07:03 2017] sd 17:0:6:0: [sdg] Attached SCSI disk
[Thu Jun 1 14:07:05 2017] sd 17:0:7:0: [sdh] Attached SCSI disk
[Thu Jun 1 14:07:08 2017] sd 17:0:8:0: [sdi] Attached SCSI disk
[Thu Jun 1 14:07:11 2017] sd 17:0:9:0: [sdj] Attached SCSI disk
[Thu Jun 1 14:07:14 2017] sd 17:0:10:0: [sdk] Attached SCSI disk
root@smb1p1:~#
...
root@smb1p1:~# multipath -ll|grep dm |wc -l
103
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:18:30 2017] sd 17:0:100:0: [sdcr] Attached SCSI disk
[Thu Jun 1 14:18:35 2017] sd 17:0:101:0: [sdcs] Attached SCSI disk
[Thu Jun 1 14:18:40 2017] sd 17:0:102:0: [sdct] Attached SCSI disk
[Thu Jun 1 14:18:44 2017] sd 17:0:103:0: [sdcu] Attached SCSI disk
[Thu Jun 1 14:18:54 2017] sd 17:0:105:0: [sdcv] Attached SCSI disk
[Thu Jun 1 14:18:59 2017] sd 17:0:106:0: [sdcw] Attached SCSI disk
[Thu Jun 1 14:19:04 2017] sd 17:0:107:0: [sdcx] Attached SCSI disk
[Thu Jun 1 14:19:09 2017] sd 17:0:108:0: [sdcy] Attached SCSI disk
[Thu Jun 1 14:19:14 2017] sd 17:0:109:0: [sdcz] Attached SCSI disk
[Thu Jun 1 14:19:19 2017] sd 17:0:110:0: [sdda] Attached SCSI disk
root@smb1p1:~#
...
root@smb1p1:~# multipath -ll|grep dm |wc -l
126
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:20:23 2017] sd 17:0:123:0: [sddn] Attached SCSI disk
[Thu Jun 1 14:20:28 2017] sd 17:0:124:0: [sddo] Attached SCSI disk
[Thu Jun 1 14:20:33 2017] sd 17:0:125:0: [sddp] Attached SCSI disk
[Thu Jun 1 14:20:38 2017] sd 17:0:126:0: [sddq] Attached SCSI disk
[Thu Jun 1 14:20:44 2017] sd 17:0:127:0: [sddr] Attached SCSI disk
[Thu Jun 1 14:20:48 2017] sd 17:0:128:0: [sdds] Attached SCSI disk
[Thu Jun 1 14:20:54 2017] sd 17:0:129:0: [sddt] Attached SCSI disk
[Thu Jun 1 14:20:59 2017] sd 17:0:130:0: [sddu] Attached SCSI disk
[Thu Jun 1 14:21:04 2017] sd 17:0:131:0: [sddv] Attached SCSI disk
[Thu Jun 1 14:21:09 2017] sd 17:0:132:0: [sddw] Attached SCSI disk
root@smb1p1:~#
...
root@smb1p1:~# multipath -ll|grep dm |wc -l
142
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:21:54 2017] sd 17:0:141:0: [sdee] Attached SCSI disk
[Thu Jun 1 14:21:58 2017] sd 17:0:142:0: [sdef] Attached SCSI disk
[Thu Jun 1 14:22:04 2017] sd 17:0:143:0: [sdeg] Attached SCSI disk
[Thu Jun 1 14:22:08 2017] sd 17:0:144:0: [sdeh] Attached SCSI disk
[Thu Jun 1 14:22:14 2017] sd 17:0:145:0: [sdei] Attached SCSI disk
[Thu Jun 1 14:22:18 2017] sd 17:0:146:0: [sdej] Attached SCSI disk
[Thu Jun 1 14:22:24 2017] sd 17:0:147:0: [sdek] Attached SCSI disk
[Thu Jun 1 14:22:29 2017] sd 17:0:148:0: [sdel] Attached SCSI disk
[Thu Jun 1 14:22:34 2017] sd 17:0:149:0: [sdem] Attached SCSI disk
[Thu Jun 1 14:22:39 2017] sd 17:0:150:0: [sden] Attached SCSI disk
root@smb1p1:~#
...
root@smb1p1:~# multipath -ll|grep dm |wc -l
163
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:23:48 2017] sd 17:0:164:0: [sdfa] Attached SCSI disk
[Thu Jun 1 14:23:53 2017] sd 17:0:165:0: [sdfb] Attached SCSI disk
[Thu Jun 1 14:23:58 2017] sd 17:0:166:0: [sdfc] Attached SCSI disk
[Thu Jun 1 14:24:03 2017] sd 17:0:167:0: [sdfd] Attached SCSI disk
[Thu Jun 1 14:24:08 2017] sd 17:0:168:0: [sdfe] Attached SCSI disk
[Thu Jun 1 14:24:13 2017] sd 17:0:169:0: [sdff] Attached SCSI disk
[Thu Jun 1 14:24:19 2017] sd 17:0:170:0: [sdfg] Attached SCSI disk
[Thu Jun 1 14:24:23 2017] sd 17:0:171:0: [sdfh] Attached SCSI disk
[Thu Jun 1 14:24:28 2017] sd 17:0:172:0: [sdfi] Attached SCSI disk
[Thu Jun 1 14:24:33 2017] sd 17:0:173:0: [sdfj] Attached SCSI disk
...
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:24:03 2017] sd 17:0:167:0: [sdfd] Attached SCSI disk
[Thu Jun 1 14:24:08 2017] sd 17:0:168:0: [sdfe] Attached SCSI disk
[Thu Jun 1 14:24:13 2017] sd 17:0:169:0: [sdff] Attached SCSI disk
[Thu Jun 1 14:24:19 2017] sd 17:0:170:0: [sdfg] Attached SCSI disk
[Thu Jun 1 14:24:23 2017] sd 17:0:171:0: [sdfh] Attached SCSI disk
[Thu Jun 1 14:24:28 2017] sd 17:0:172:0: [sdfi] Attached SCSI disk
[Thu Jun 1 14:24:33 2017] sd 17:0:173:0: [sdfj] Attached SCSI disk
[Thu Jun 1 14:24:38 2017] sd 17:0:174:0: [sdfk] Attached SCSI disk
[Thu Jun 1 14:24:43 2017] sd 17:0:175:0: [sdfl] Attached SCSI disk
[Thu Jun 1 14:24:48 2017] sd 17:0:176:0: [sdfm] Attached SCSI disk
root@smb1p1:~#
root@smb1p1:~# date
Thu Jun 1 14:27:03 CDT 2017
root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
168
root@smb1p1:~#
Checkpoint #3:
==============
- After 34 minutes, the multipath -ll command shows maps with only a single path and no redundancy.
root@smb1p1:~# multipath -ll > multipath.log.06012017.afterReboot
root@smb1p1:~# cat multipath.log.06012017.afterReboot |more
35000c50086a3ca97 dm-161 IBM-ESXS,ST10000NM0226 E
size=9.0T features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
`- 17:0:170:0 sdfg 130:32 active ready running
35000c50086bae8bf dm-144 IBM-ESXS,ST10000NM0226 E
size=9.0T features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
`- 17:0:152:0 sdep 129:16 active ready running
35000c50086baa42f dm-143 IBM-ESXS,ST10000NM0226 E
size=9.0T features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
`- 17:0:151:0 sdeo 129:0 active ready running
...
Checkpoint #4:
==============
- After 43 minutes, the multipath -ll command shows some maps with only a single path and no redundancy, and some maps with multiple paths and redundancy.
root@smb1p1:~# date
Thu Jun 1 14:43:00 CDT 2017
root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
252
root@smb1p1:~#
Checkpoint #5:
==============
- After 47 minutes, the multipath -ll command still shows some maps with only a single path and no redundancy.
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | head
[Thu Jun 1 14:06:48 2017] sd 17:0:1:0: [sdb] Attached SCSI disk
[Thu Jun 1 14:06:51 2017] sd 17:0:2:0: [sdc] Attached SCSI disk
[Thu Jun 1 14:06:53 2017] sd 17:0:3:0: [sdd] Attached SCSI disk
[Thu Jun 1 14:06:57 2017] sd 17:0:4:0: [sde] Attached SCSI disk
[Thu Jun 1 14:07:00 2017] sd 17:0:5:0: [sdf] Attached SCSI disk
[Thu Jun 1 14:07:03 2017] sd 17:0:6:0: [sdg] Attached SCSI disk
[Thu Jun 1 14:07:05 2017] sd 17:0:7:0: [sdh] Attached SCSI disk
[Thu Jun 1 14:07:08 2017] sd 17:0:8:0: [sdi] Attached SCSI disk
[Thu Jun 1 14:07:11 2017] sd 17:0:9:0: [sdj] Attached SCSI disk
[Thu Jun 1 14:07:14 2017] sd 17:0:10:0: [sdk] Attached SCSI disk
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:46:15 2017] sd 18:0:112:0: [sdjo] Attached SCSI disk
[Thu Jun 1 14:46:20 2017] sd 18:0:113:0: [sdjp] Attached SCSI disk
[Thu Jun 1 14:46:25 2017] sd 18:0:114:0: [sdjq] Attached SCSI disk
[Thu Jun 1 14:46:31 2017] sd 18:0:115:0: [sdjr] Attached SCSI disk
[Thu Jun 1 14:46:36 2017] sd 18:0:116:0: [sdjs] Attached SCSI disk
[Thu Jun 1 14:46:41 2017] sd 18:0:117:0: [sdjt] Attached SCSI disk
[Thu Jun 1 14:46:46 2017] sd 18:0:118:0: [sdju] Attached SCSI disk
[Thu Jun 1 14:46:51 2017] sd 18:0:119:0: [sdjv] Attached SCSI disk
[Thu Jun 1 14:46:56 2017] sd 18:0:120:0: [sdjw] Attached SCSI disk
[Thu Jun 1 14:47:01 2017] sd 18:0:121:0: [sdjx] Attached SCSI disk
root@smb1p1:~#
root@smb1p1:~#
root@smb1p1:~# date
Thu Jun 1 14:47:20 CDT 2017
root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
288
root@smb1p1:~#
Checkpoint #6:
==============
- 51 minutes after system reboot, it looks like all disks are discovered and the multipath devices are correctly built.
root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
336
root@smb1p1:~# date
Thu Jun 1 14:52:05 CDT 2017
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:50:47 2017] sd 18:0:167:0: [sdlp] Attached SCSI disk
[Thu Jun 1 14:50:52 2017] sd 18:0:168:0: [sdlq] Attached SCSI disk
[Thu Jun 1 14:50:57 2017] sd 18:0:169:0: [sdlr] Attached SCSI disk
[Thu Jun 1 14:51:02 2017] sd 18:0:170:0: [sdls] Attached SCSI disk
[Thu Jun 1 14:51:07 2017] sd 18:0:171:0: [sdlt] Attached SCSI disk
[Thu Jun 1 14:51:13 2017] sd 18:0:172:0: [sdlu] Attached SCSI disk
[Thu Jun 1 14:51:17 2017] sd 18:0:173:0: [sdlv] Attached SCSI disk
[Thu Jun 1 14:51:22 2017] sd 18:0:174:0: [sdlw] Attached SCSI disk
[Thu Jun 1 14:51:27 2017] sd 18:0:175:0: [sdlx] Attached SCSI disk
[Thu Jun 1 14:51:33 2017] sd 18:0:176:0: [sdly] Attached SCSI disk
root@smb1p1:~#
== Comment: #24 - Mauricio Faria De Oliveira - 2017-06-06 11:42:59 ==
Hi Paul,
Per your logs, yes, it's the slowness with the SES driver.
I'll ask Canonical to pick it up for 16.10 and 17.04 so it makes it into 16.04.2 and 16.04.3.
Thanks,
Mauricio
== Comment: #26 - Mauricio Faria De Oliveira <mauricfo@br.ibm.com> - 2017-06-06 12:06:32 ==
The patch applies cleanly in the master-next branch of ubuntu-zesty.git and ubuntu-yakkety.git.
Mirroring to Canonical to get a LP bug number, required in the submission process.
== Comment: #27 - Mauricio Faria De Oliveira <mauricfo@br.ibm.com> - 2017-06-06 12:07:58 ==
The commit is [1].
commit 75106523f39751390b5789b36ee1d213b3af1945
Author: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Date: Wed Apr 5 12:18:19 2017 -0300
scsi: ses: don't get power status of SES device slot on probe
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=75106523f39751390b5789b36ee1d213b3af1945 |
|
2017-06-07 18:15:45 |
Mauricio Faria de Oliveira |
description |
[Impact]
* The SES driver causes a long delay in disk discovery when
a large number of disks is present in the disk enclosure,
which increases with the number of disks attached.
* This delays the addition and visibility of the disk devices
to userspace, which among other things causes multipath not
to have multiple paths, actually, until the disk discovery
eventually/finally finishes.
* The fix significantly shortens the time taken by the SES
driver to handle disk discovery, causing no extra delays,
by removing a superfluous SCSI command sent to enclosure.
[Test Case]
* Load the module to access the enclosure and its disks; e.g.,
$ sudo modprobe mpt3sas
* Notice the interval between the discovery of each disk; e.g., dmesg
$ dmesg -T | grep 'Attached SCSI disk' | tail -n2
[Thu Jun 1 14:18:30 2017] sd 17:0:100:0: [sdcr] Attached SCSI disk
[Thu Jun 1 14:18:35 2017] sd 17:0:101:0: [sdcs] Attached SCSI disk
* With the fix, consecutive disks should be discovered within about the same second of each other.
[Regression Potential]
* The power status of the disks in the enclosure is no longer
checked during probe time. However, the patch demonstrates that
the initial value was never used in any way, so there is little
regression potential.
* Nonetheless, users of SES enclosures which verify the power status
of disks in the enclosure might _theoretically_ see a problem, iff
the fix has a problem (which has not been found yet).
[Other Info]
* None at this time.
==== State: Open by: nguyenp on 31 May 2017 15:46:14 ====
Product Name : OpenPOWER Firmware
Product Version : open-power-SMC-P8DTU-V2.00.GA2-20170126-prod
Product Extra : op-build-3782262
Product Extra : hostboot-7fdfb37
Product Extra : occ-e6e194f
Product Extra : skiboot-5.4.2
Product Extra : linux-4.4.24-openpower1-9641b3a
Product Extra : petitboot-v1.4.0-2f8598b
Product Extra : p8dtu-xml-9a8fee2
Cable configuration:
====================
On this P8-Briggs system, I have 2 Seagate storage enclosures running at maximum configuration. Each enclosure holds 84 HDDs, for a total of 168 HDDs across both enclosures.
I connected 2 LSI 9300-8e SAS adapters to the 2 Seagate enclosures with alternate cabling for redundancy. See the cable connection diagram below.
Note: Each Seagate enclosure has 2 I/O modules in the back.
Both I/O modules of each enclosure see the same set of HDDs.
Cable connection:
SAS adapter #1: port1 -----> Seagate #1-A I/O module
                port0 -----> Seagate #2-B I/O module
SAS adapter #2: port1 -----> Seagate #2-A I/O module
                port0 -----> Seagate #1-B I/O module
Ubuntu 16.04.2:
===============
- Running with new kernel Ubuntu 4.8.0-520-generic #550~16.04.1+bz154734 from Mauricio Faria De Oliveira.
Problem Description:
====================
On this Briggs system, I'm running the new Ubuntu kernel 4.8.0-520-generic #550~16.04.1+bz154734, which has the fix for the multipath problem. Mauricio helped patch the system with this kernel last week to fix the multipath io_setup failure in LTCBug154734.
This week, I scaled my test configuration up to the maximum configuration, 2x5U84_Enclosures,_MaxCfg_168HDDs. This time, it hit a different issue: some multipath devices have only a single path and no redundancy, while others have multiple paths and redundancy.
== Comment: #13 - Paul Nguyen - 2017-06-01 15:19:58 ==
- I agree with Mauricio that this is a timing problem.
- I re-ran the test and noticed that it took more than 50 minutes after system reboot to discover all disks and build the multipath devices correctly.
- A discovery time this long is going to be a problem.
- I have gathered all logs and am attaching them to the bug for Mauricio to review and confirm.
- If there is a workaround or a fix for faster probe time, I will try it out.
- Below is more information I captured:
Checkpoint #1:
==============
- system reboot around 2pm (14:00)
Checkpoint #2:
==============
- It took several minutes for the first disk to be detected.
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | head
[Thu Jun 1 14:06:48 2017] sd 17:0:1:0: [sdb] Attached SCSI disk
[Thu Jun 1 14:06:51 2017] sd 17:0:2:0: [sdc] Attached SCSI disk
[Thu Jun 1 14:06:53 2017] sd 17:0:3:0: [sdd] Attached SCSI disk
[Thu Jun 1 14:06:57 2017] sd 17:0:4:0: [sde] Attached SCSI disk
[Thu Jun 1 14:07:00 2017] sd 17:0:5:0: [sdf] Attached SCSI disk
[Thu Jun 1 14:07:03 2017] sd 17:0:6:0: [sdg] Attached SCSI disk
[Thu Jun 1 14:07:05 2017] sd 17:0:7:0: [sdh] Attached SCSI disk
[Thu Jun 1 14:07:08 2017] sd 17:0:8:0: [sdi] Attached SCSI disk
[Thu Jun 1 14:07:11 2017] sd 17:0:9:0: [sdj] Attached SCSI disk
[Thu Jun 1 14:07:14 2017] sd 17:0:10:0: [sdk] Attached SCSI disk
root@smb1p1:~#
...
root@smb1p1:~# multipath -ll|grep dm |wc -l
103
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:18:30 2017] sd 17:0:100:0: [sdcr] Attached SCSI disk
[Thu Jun 1 14:18:35 2017] sd 17:0:101:0: [sdcs] Attached SCSI disk
[Thu Jun 1 14:18:40 2017] sd 17:0:102:0: [sdct] Attached SCSI disk
[Thu Jun 1 14:18:44 2017] sd 17:0:103:0: [sdcu] Attached SCSI disk
[Thu Jun 1 14:18:54 2017] sd 17:0:105:0: [sdcv] Attached SCSI disk
[Thu Jun 1 14:18:59 2017] sd 17:0:106:0: [sdcw] Attached SCSI disk
[Thu Jun 1 14:19:04 2017] sd 17:0:107:0: [sdcx] Attached SCSI disk
[Thu Jun 1 14:19:09 2017] sd 17:0:108:0: [sdcy] Attached SCSI disk
[Thu Jun 1 14:19:14 2017] sd 17:0:109:0: [sdcz] Attached SCSI disk
[Thu Jun 1 14:19:19 2017] sd 17:0:110:0: [sdda] Attached SCSI disk
root@smb1p1:~#
...
root@smb1p1:~# multipath -ll|grep dm |wc -l
126
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:20:23 2017] sd 17:0:123:0: [sddn] Attached SCSI disk
[Thu Jun 1 14:20:28 2017] sd 17:0:124:0: [sddo] Attached SCSI disk
[Thu Jun 1 14:20:33 2017] sd 17:0:125:0: [sddp] Attached SCSI disk
[Thu Jun 1 14:20:38 2017] sd 17:0:126:0: [sddq] Attached SCSI disk
[Thu Jun 1 14:20:44 2017] sd 17:0:127:0: [sddr] Attached SCSI disk
[Thu Jun 1 14:20:48 2017] sd 17:0:128:0: [sdds] Attached SCSI disk
[Thu Jun 1 14:20:54 2017] sd 17:0:129:0: [sddt] Attached SCSI disk
[Thu Jun 1 14:20:59 2017] sd 17:0:130:0: [sddu] Attached SCSI disk
[Thu Jun 1 14:21:04 2017] sd 17:0:131:0: [sddv] Attached SCSI disk
[Thu Jun 1 14:21:09 2017] sd 17:0:132:0: [sddw] Attached SCSI disk
root@smb1p1:~#
...
root@smb1p1:~# multipath -ll|grep dm |wc -l
142
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:21:54 2017] sd 17:0:141:0: [sdee] Attached SCSI disk
[Thu Jun 1 14:21:58 2017] sd 17:0:142:0: [sdef] Attached SCSI disk
[Thu Jun 1 14:22:04 2017] sd 17:0:143:0: [sdeg] Attached SCSI disk
[Thu Jun 1 14:22:08 2017] sd 17:0:144:0: [sdeh] Attached SCSI disk
[Thu Jun 1 14:22:14 2017] sd 17:0:145:0: [sdei] Attached SCSI disk
[Thu Jun 1 14:22:18 2017] sd 17:0:146:0: [sdej] Attached SCSI disk
[Thu Jun 1 14:22:24 2017] sd 17:0:147:0: [sdek] Attached SCSI disk
[Thu Jun 1 14:22:29 2017] sd 17:0:148:0: [sdel] Attached SCSI disk
[Thu Jun 1 14:22:34 2017] sd 17:0:149:0: [sdem] Attached SCSI disk
[Thu Jun 1 14:22:39 2017] sd 17:0:150:0: [sden] Attached SCSI disk
root@smb1p1:~#
...
root@smb1p1:~# multipath -ll|grep dm |wc -l
163
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:23:48 2017] sd 17:0:164:0: [sdfa] Attached SCSI disk
[Thu Jun 1 14:23:53 2017] sd 17:0:165:0: [sdfb] Attached SCSI disk
[Thu Jun 1 14:23:58 2017] sd 17:0:166:0: [sdfc] Attached SCSI disk
[Thu Jun 1 14:24:03 2017] sd 17:0:167:0: [sdfd] Attached SCSI disk
[Thu Jun 1 14:24:08 2017] sd 17:0:168:0: [sdfe] Attached SCSI disk
[Thu Jun 1 14:24:13 2017] sd 17:0:169:0: [sdff] Attached SCSI disk
[Thu Jun 1 14:24:19 2017] sd 17:0:170:0: [sdfg] Attached SCSI disk
[Thu Jun 1 14:24:23 2017] sd 17:0:171:0: [sdfh] Attached SCSI disk
[Thu Jun 1 14:24:28 2017] sd 17:0:172:0: [sdfi] Attached SCSI disk
[Thu Jun 1 14:24:33 2017] sd 17:0:173:0: [sdfj] Attached SCSI disk
...
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:24:03 2017] sd 17:0:167:0: [sdfd] Attached SCSI disk
[Thu Jun 1 14:24:08 2017] sd 17:0:168:0: [sdfe] Attached SCSI disk
[Thu Jun 1 14:24:13 2017] sd 17:0:169:0: [sdff] Attached SCSI disk
[Thu Jun 1 14:24:19 2017] sd 17:0:170:0: [sdfg] Attached SCSI disk
[Thu Jun 1 14:24:23 2017] sd 17:0:171:0: [sdfh] Attached SCSI disk
[Thu Jun 1 14:24:28 2017] sd 17:0:172:0: [sdfi] Attached SCSI disk
[Thu Jun 1 14:24:33 2017] sd 17:0:173:0: [sdfj] Attached SCSI disk
[Thu Jun 1 14:24:38 2017] sd 17:0:174:0: [sdfk] Attached SCSI disk
[Thu Jun 1 14:24:43 2017] sd 17:0:175:0: [sdfl] Attached SCSI disk
[Thu Jun 1 14:24:48 2017] sd 17:0:176:0: [sdfm] Attached SCSI disk
root@smb1p1:~#
root@smb1p1:~# date
Thu Jun 1 14:27:03 CDT 2017
root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
168
root@smb1p1:~#
Checkpoint #3:
=============
- After 34 minutes, the multipath -ll command shows devices with only a single path and no redundancy.
root@smb1p1:~# multipath -ll > multipath.log.06012017.afterReboot
root@smb1p1:~# cat multipath.log.06012017.afterReboot |more
35000c50086a3ca97 dm-161 IBM-ESXS,ST10000NM0226 E
size=9.0T features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
`- 17:0:170:0 sdfg 130:32 active ready running
35000c50086bae8bf dm-144 IBM-ESXS,ST10000NM0226 E
size=9.0T features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
`- 17:0:152:0 sdep 129:16 active ready running
35000c50086baa42f dm-143 IBM-ESXS,ST10000NM0226 E
size=9.0T features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
`- 17:0:151:0 sdeo 129:0 active ready running
...
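The single-path check above can be automated. The following is a minimal sketch, not part of the original report: it assumes `multipath -ll` output shaped like the excerpt above (a map header line containing `dm-N`, followed by path lines containing an H:C:T:L address and an `sdX` name). The function names `paths_per_map` and `single_path_maps` are hypothetical helpers introduced for illustration.

```python
import re

# Map header, e.g. "35000c50086a3ca97 dm-161 IBM-ESXS,ST10000NM0226 E"
# (an optional "(wwid)" field is tolerated for setups with friendly names).
MAP_RE = re.compile(r"^(?P<name>\S+)(?:\s+\((?P<wwid>\S+)\))?\s+(?P<dm>dm-\d+)\b")
# Path line, e.g. "`- 17:0:170:0 sdfg 130:32 active ready running"
PATH_RE = re.compile(r"\b\d+:\d+:\d+:\d+\s+(?P<dev>sd[a-z]+)\b")


def paths_per_map(lines):
    """Return {map name: [sd devices]} from `multipath -ll` output lines."""
    maps = {}
    current = None
    for line in lines:
        header = MAP_RE.match(line)
        if header:
            current = header.group("name")
            maps[current] = []
            continue
        path = PATH_RE.search(line)
        if path and current is not None:
            maps[current].append(path.group("dev"))
    return maps


def single_path_maps(maps, expected=2):
    """Names of maps with fewer than `expected` paths (2 on this cabling)."""
    return sorted(name for name, devs in maps.items() if len(devs) < expected)


if __name__ == "__main__":
    # One map copied from the log excerpt above.
    sample = [
        "35000c50086a3ca97 dm-161 IBM-ESXS,ST10000NM0226 E",
        "size=9.0T features='0' hwhandler='0' wp=rw",
        "`-+- policy='round-robin 0' prio=1 status=active",
        "`- 17:0:170:0 sdfg 130:32 active ready running",
    ]
    print(single_path_maps(paths_per_map(sample)))
```

On a live system the input would be the lines of a saved log such as multipath.log.06012017.afterReboot; an empty result would suggest every map has at least two paths.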
Checkpoint #4:
===============
- After 43 minutes, the multipath -ll command shows some devices with only a single path and no redundancy, and others with multiple paths and redundancy.
root@smb1p1:~# date
Thu Jun 1 14:43:00 CDT 2017
root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
252
root@smb1p1:~#
Checkpoint #5:
==============
- After 47 minutes, the multipath -ll command still shows some devices with only a single path and no redundancy.
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | head
[Thu Jun 1 14:06:48 2017] sd 17:0:1:0: [sdb] Attached SCSI disk
[Thu Jun 1 14:06:51 2017] sd 17:0:2:0: [sdc] Attached SCSI disk
[Thu Jun 1 14:06:53 2017] sd 17:0:3:0: [sdd] Attached SCSI disk
[Thu Jun 1 14:06:57 2017] sd 17:0:4:0: [sde] Attached SCSI disk
[Thu Jun 1 14:07:00 2017] sd 17:0:5:0: [sdf] Attached SCSI disk
[Thu Jun 1 14:07:03 2017] sd 17:0:6:0: [sdg] Attached SCSI disk
[Thu Jun 1 14:07:05 2017] sd 17:0:7:0: [sdh] Attached SCSI disk
[Thu Jun 1 14:07:08 2017] sd 17:0:8:0: [sdi] Attached SCSI disk
[Thu Jun 1 14:07:11 2017] sd 17:0:9:0: [sdj] Attached SCSI disk
[Thu Jun 1 14:07:14 2017] sd 17:0:10:0: [sdk] Attached SCSI disk
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:46:15 2017] sd 18:0:112:0: [sdjo] Attached SCSI disk
[Thu Jun 1 14:46:20 2017] sd 18:0:113:0: [sdjp] Attached SCSI disk
[Thu Jun 1 14:46:25 2017] sd 18:0:114:0: [sdjq] Attached SCSI disk
[Thu Jun 1 14:46:31 2017] sd 18:0:115:0: [sdjr] Attached SCSI disk
[Thu Jun 1 14:46:36 2017] sd 18:0:116:0: [sdjs] Attached SCSI disk
[Thu Jun 1 14:46:41 2017] sd 18:0:117:0: [sdjt] Attached SCSI disk
[Thu Jun 1 14:46:46 2017] sd 18:0:118:0: [sdju] Attached SCSI disk
[Thu Jun 1 14:46:51 2017] sd 18:0:119:0: [sdjv] Attached SCSI disk
[Thu Jun 1 14:46:56 2017] sd 18:0:120:0: [sdjw] Attached SCSI disk
[Thu Jun 1 14:47:01 2017] sd 18:0:121:0: [sdjx] Attached SCSI disk
root@smb1p1:~#
root@smb1p1:~#
root@smb1p1:~# date
Thu Jun 1 14:47:20 CDT 2017
root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
288
root@smb1p1:~#
Checkpoint #6:
==============
- 51 minutes after system reboot, all disks appear to be discovered and the multipath devices are built correctly.
root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
336
root@smb1p1:~# date
Thu Jun 1 14:52:05 CDT 2017
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:50:47 2017] sd 18:0:167:0: [sdlp] Attached SCSI disk
[Thu Jun 1 14:50:52 2017] sd 18:0:168:0: [sdlq] Attached SCSI disk
[Thu Jun 1 14:50:57 2017] sd 18:0:169:0: [sdlr] Attached SCSI disk
[Thu Jun 1 14:51:02 2017] sd 18:0:170:0: [sdls] Attached SCSI disk
[Thu Jun 1 14:51:07 2017] sd 18:0:171:0: [sdlt] Attached SCSI disk
[Thu Jun 1 14:51:13 2017] sd 18:0:172:0: [sdlu] Attached SCSI disk
[Thu Jun 1 14:51:17 2017] sd 18:0:173:0: [sdlv] Attached SCSI disk
[Thu Jun 1 14:51:22 2017] sd 18:0:174:0: [sdlw] Attached SCSI disk
[Thu Jun 1 14:51:27 2017] sd 18:0:175:0: [sdlx] Attached SCSI disk
[Thu Jun 1 14:51:33 2017] sd 18:0:176:0: [sdly] Attached SCSI disk
root@smb1p1:~#
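As a rough cross-check of the checkpoints above, the average time per discovered path can be estimated from the first and last "Attached SCSI disk" timestamps. This is a sketch under stated assumptions: it takes the 336 paths reported by multipath -ll as the number of attach events between those two timestamps, and `mean_attach_interval` is a hypothetical helper, not something from the report.

```python
from datetime import datetime

# Timestamp layout of `dmesg -T` as shown in the log (English locale assumed).
FMT = "%a %b %d %H:%M:%S %Y"


def mean_attach_interval(first_ts, last_ts, n_events):
    """Average seconds between consecutive attach events.

    n_events is the assumed number of events from first_ts to last_ts
    inclusive, so there are n_events - 1 intervals.
    """
    first = datetime.strptime(first_ts, FMT)
    last = datetime.strptime(last_ts, FMT)
    return (last - first).total_seconds() / (n_events - 1)


if __name__ == "__main__":
    # First and last timestamps taken from the dmesg excerpts above;
    # 336 is the final path count from multipath -ll (assumed event count).
    mean = mean_attach_interval(
        "Thu Jun 1 14:06:48 2017", "Thu Jun 1 14:51:33 2017", 336
    )
    print(f"{mean:.1f} s per path on average")
```

Under these assumptions the average works out to several seconds per path, which is in line with the gaps of about 5 seconds visible between consecutive dmesg lines.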
== Comment: #24 - Mauricio Faria De Oliveira - 2017-06-06 11:42:59 ==
Hi Paul,
Per your logs, yes, it's the slowness with the SES driver.
I'll ask Canonical to pick it up for 16.10 and 17.04 so it makes it into 16.04.2 and 16.04.3.
Thanks,
Mauricio
== Comment: #26 - Mauricio Faria De Oliveira <mauricfo@br.ibm.com> - 2017-06-06 12:06:32 ==
The patch applies cleanly in the master-next branch of ubuntu-zesty.git and ubuntu-yakkety.git.
Mirroring to Canonical to get a LP bug number, required in the submission process.
== Comment: #27 - Mauricio Faria De Oliveira <mauricfo@br.ibm.com> - 2017-06-06 12:07:58 ==
The commit is [1].
commit 75106523f39751390b5789b36ee1d213b3af1945
Author: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Date: Wed Apr 5 12:18:19 2017 -0300
scsi: ses: don't get power status of SES device slot on probe
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=75106523f39751390b5789b36ee1d213b3af1945 |
[Impact]
* The SES driver causes a long delay in disk discovery when
a large number of disks is present in the disk enclosure,
which increases with the number of disks attached.
* This delays the addition and visibility of the disk devices
to userspace, which among other things leaves multipath
devices without redundant paths until disk discovery
finally finishes.
* The fix significantly shortens the time taken by the SES
driver to handle disk discovery, causing no extra delays,
by removing a superfluous SCSI command sent to enclosure.
[Test Case]
* Load the module to access the enclosure and its disks; e.g.,
$ sudo modprobe mpt3sas
* Notice the interval between the discovery of each disk; e.g., dmesg
$ dmesg -T | grep 'Attached SCSI disk' | tail -n2
[Thu Jun 1 14:18:30 2017] sd 17:0:100:0: [sdcr] Attached SCSI disk
[Thu Jun 1 14:18:35 2017] sd 17:0:101:0: [sdcs] Attached SCSI disk
* With the fix, consecutive disks should be discovered within about the same second.
$ dmesg -T | grep 'Attached SCSI disk' | tail -n2
[Wed Jun 7 13:11:59 2017] sd 18:0:176:0: [sdly] Attached SCSI disk
[Wed Jun 7 13:11:59 2017] sd 18:0:175:0: [sdlx] Attached SCSI disk
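The comparison in the test case can also be scripted. The sketch below is an illustration rather than part of the patch or its test procedure: it assumes `dmesg -T` lines in the format shown above with English day and month names, and the helper names `parse_attach_events` and `attach_gaps` are hypothetical.

```python
import re
from datetime import datetime

# Matches lines such as:
#   [Thu Jun 1 14:18:30 2017] sd 17:0:100:0: [sdcr] Attached SCSI disk
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+sd\s+\d+:\d+:\d+:\d+:\s+\[(?P<dev>\w+)\]\s+Attached SCSI disk"
)
FMT = "%a %b %d %H:%M:%S %Y"  # assumes an English locale for %a and %b


def parse_attach_events(lines):
    """Return [(timestamp, device)] for every 'Attached SCSI disk' line."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        # dmesg -T may pad single-digit days with two spaces; normalise first.
        ts = datetime.strptime(" ".join(m.group("ts").split()), FMT)
        events.append((ts, m.group("dev")))
    return events


def attach_gaps(events):
    """Return [(device, seconds since the previous attach)] in time order."""
    ordered = sorted(events)
    return [
        (cur[1], (cur[0] - prev[0]).total_seconds())
        for prev, cur in zip(ordered, ordered[1:])
    ]


if __name__ == "__main__":
    # The two "before the fix" lines from the test case above.
    before = [
        "[Thu Jun 1 14:18:30 2017] sd 17:0:100:0: [sdcr] Attached SCSI disk",
        "[Thu Jun 1 14:18:35 2017] sd 17:0:101:0: [sdcs] Attached SCSI disk",
    ]
    for dev, gap in attach_gaps(parse_attach_events(before)):
        print(dev, gap)
```

Note that `dmesg -T` timestamps have one-second resolution, so a gap of 0.0 only means both disks were attached within the same second.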
[Regression Potential]
* The power status of the disks in the enclosure is no longer
checked during probe time. However, the patch demonstrates that
the initial value was never used in any way, so there is little
regression potential.
* Nonetheless, users of SES enclosures which verify the power status
of disks in the enclosure might _theoretically_ see a problem, iff
the fix has a problem (which has not been found yet).
[Other Info]
* None at this time.
==== State: Open by: nguyenp on 31 May 2017 15:46:14 ====
Product Name : OpenPOWER Firmware
Product Version : open-power-SMC-P8DTU-V2.00.GA2-20170126-prod
Product Extra : op-build-3782262
Product Extra : hostboot-7fdfb37
Product Extra : occ-e6e194f
Product Extra : skiboot-5.4.2
Product Extra : linux-4.4.24-openpower1-9641b3a
Product Extra : petitboot-v1.4.0-2f8598b
Product Extra : p8dtu-xml-9a8fee2
Cable configuration:
====================
On this P8-Briggs system, I have 2 Seagate storage enclosures running with max configuration. There are 84 HDDs in each enclosure, so the total is 168 HDDs across both.
I connected 2 LSI 9300-8e SAS adapters to the 2 Seagate enclosures with alternate cabling for redundancy. See the figure of the connections below:
Note: Each Seagate enclosure has 2 I/O modules connected in the back.
Both I/O modules of each Seagate see the same set of HDDs.
Cable connection:
SAS adapter #1: port1 -----> Seagate #1-A I/O module
port0 --------------------------------------> Seagate #2-B I/O module
SAS adapter #2: port1 ----> Seagate #2-A I/O module
port0 --------------------------------------> Seagate #1-B I/O module
Ubuntu 16.04.2:
===============
- Running with new kernel Ubuntu 4.8.0-520-generic #550~16.04.1+bz154734 from Mauricio Faria De Oliveira.
Problem Description:
====================
On this Briggs system, I'm running the new Ubuntu kernel 4.8.0-520-generic #550~16.04.1+bz154734, which has the fix for the multipath problem. Mauricio helped patch the system with this kernel last week to fix the multipath io_setup failure in LTCBug154734.
This week, I scaled up my test configuration to the max configuration, 2x5U84_Enclosures,_MaxCfg_168HDDs. This time, it hit a different issue: some multipath devices have only a single path and no redundancy, while others have multiple paths and redundancy.
== Comment: #13 - Paul Nguyen - 2017-06-01 15:19:58 ==
- I agree with Mauricio that this is a timing problem.
- I re-ran the test and noticed that it took more than 50 minutes after system reboot to discover all disks and build the multipath devices correctly.
- Taking this long is going to be a problem.
- I have gathered all logs and am attaching them to the bug for Mauricio to review and confirm.
- If there is a workaround or fix for faster probe time, I will try it out.
- Below is more information I captured:
Checkpoint #1:
==============
- system reboot around 2pm (14:00)
Checkpoint #2:
===============
- It took several minutes for the first disk to be detected.
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | head
[Thu Jun 1 14:06:48 2017] sd 17:0:1:0: [sdb] Attached SCSI disk
[Thu Jun 1 14:06:51 2017] sd 17:0:2:0: [sdc] Attached SCSI disk
[Thu Jun 1 14:06:53 2017] sd 17:0:3:0: [sdd] Attached SCSI disk
[Thu Jun 1 14:06:57 2017] sd 17:0:4:0: [sde] Attached SCSI disk
[Thu Jun 1 14:07:00 2017] sd 17:0:5:0: [sdf] Attached SCSI disk
[Thu Jun 1 14:07:03 2017] sd 17:0:6:0: [sdg] Attached SCSI disk
[Thu Jun 1 14:07:05 2017] sd 17:0:7:0: [sdh] Attached SCSI disk
[Thu Jun 1 14:07:08 2017] sd 17:0:8:0: [sdi] Attached SCSI disk
[Thu Jun 1 14:07:11 2017] sd 17:0:9:0: [sdj] Attached SCSI disk
[Thu Jun 1 14:07:14 2017] sd 17:0:10:0: [sdk] Attached SCSI disk
root@smb1p1:~#
...
root@smb1p1:~# multipath -ll|grep dm |wc -l
103
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:18:30 2017] sd 17:0:100:0: [sdcr] Attached SCSI disk
[Thu Jun 1 14:18:35 2017] sd 17:0:101:0: [sdcs] Attached SCSI disk
[Thu Jun 1 14:18:40 2017] sd 17:0:102:0: [sdct] Attached SCSI disk
[Thu Jun 1 14:18:44 2017] sd 17:0:103:0: [sdcu] Attached SCSI disk
[Thu Jun 1 14:18:54 2017] sd 17:0:105:0: [sdcv] Attached SCSI disk
[Thu Jun 1 14:18:59 2017] sd 17:0:106:0: [sdcw] Attached SCSI disk
[Thu Jun 1 14:19:04 2017] sd 17:0:107:0: [sdcx] Attached SCSI disk
[Thu Jun 1 14:19:09 2017] sd 17:0:108:0: [sdcy] Attached SCSI disk
[Thu Jun 1 14:19:14 2017] sd 17:0:109:0: [sdcz] Attached SCSI disk
[Thu Jun 1 14:19:19 2017] sd 17:0:110:0: [sdda] Attached SCSI disk
root@smb1p1:~#
...
root@smb1p1:~# multipath -ll|grep dm |wc -l
126
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:20:23 2017] sd 17:0:123:0: [sddn] Attached SCSI disk
[Thu Jun 1 14:20:28 2017] sd 17:0:124:0: [sddo] Attached SCSI disk
[Thu Jun 1 14:20:33 2017] sd 17:0:125:0: [sddp] Attached SCSI disk
[Thu Jun 1 14:20:38 2017] sd 17:0:126:0: [sddq] Attached SCSI disk
[Thu Jun 1 14:20:44 2017] sd 17:0:127:0: [sddr] Attached SCSI disk
[Thu Jun 1 14:20:48 2017] sd 17:0:128:0: [sdds] Attached SCSI disk
[Thu Jun 1 14:20:54 2017] sd 17:0:129:0: [sddt] Attached SCSI disk
[Thu Jun 1 14:20:59 2017] sd 17:0:130:0: [sddu] Attached SCSI disk
[Thu Jun 1 14:21:04 2017] sd 17:0:131:0: [sddv] Attached SCSI disk
[Thu Jun 1 14:21:09 2017] sd 17:0:132:0: [sddw] Attached SCSI disk
root@smb1p1:~#
...
root@smb1p1:~# multipath -ll|grep dm |wc -l
142
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:21:54 2017] sd 17:0:141:0: [sdee] Attached SCSI disk
[Thu Jun 1 14:21:58 2017] sd 17:0:142:0: [sdef] Attached SCSI disk
[Thu Jun 1 14:22:04 2017] sd 17:0:143:0: [sdeg] Attached SCSI disk
[Thu Jun 1 14:22:08 2017] sd 17:0:144:0: [sdeh] Attached SCSI disk
[Thu Jun 1 14:22:14 2017] sd 17:0:145:0: [sdei] Attached SCSI disk
[Thu Jun 1 14:22:18 2017] sd 17:0:146:0: [sdej] Attached SCSI disk
[Thu Jun 1 14:22:24 2017] sd 17:0:147:0: [sdek] Attached SCSI disk
[Thu Jun 1 14:22:29 2017] sd 17:0:148:0: [sdel] Attached SCSI disk
[Thu Jun 1 14:22:34 2017] sd 17:0:149:0: [sdem] Attached SCSI disk
[Thu Jun 1 14:22:39 2017] sd 17:0:150:0: [sden] Attached SCSI disk
root@smb1p1:~#
...
root@smb1p1:~# multipath -ll|grep dm |wc -l
163
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:23:48 2017] sd 17:0:164:0: [sdfa] Attached SCSI disk
[Thu Jun 1 14:23:53 2017] sd 17:0:165:0: [sdfb] Attached SCSI disk
[Thu Jun 1 14:23:58 2017] sd 17:0:166:0: [sdfc] Attached SCSI disk
[Thu Jun 1 14:24:03 2017] sd 17:0:167:0: [sdfd] Attached SCSI disk
[Thu Jun 1 14:24:08 2017] sd 17:0:168:0: [sdfe] Attached SCSI disk
[Thu Jun 1 14:24:13 2017] sd 17:0:169:0: [sdff] Attached SCSI disk
[Thu Jun 1 14:24:19 2017] sd 17:0:170:0: [sdfg] Attached SCSI disk
[Thu Jun 1 14:24:23 2017] sd 17:0:171:0: [sdfh] Attached SCSI disk
[Thu Jun 1 14:24:28 2017] sd 17:0:172:0: [sdfi] Attached SCSI disk
[Thu Jun 1 14:24:33 2017] sd 17:0:173:0: [sdfj] Attached SCSI disk
...
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:24:03 2017] sd 17:0:167:0: [sdfd] Attached SCSI disk
[Thu Jun 1 14:24:08 2017] sd 17:0:168:0: [sdfe] Attached SCSI disk
[Thu Jun 1 14:24:13 2017] sd 17:0:169:0: [sdff] Attached SCSI disk
[Thu Jun 1 14:24:19 2017] sd 17:0:170:0: [sdfg] Attached SCSI disk
[Thu Jun 1 14:24:23 2017] sd 17:0:171:0: [sdfh] Attached SCSI disk
[Thu Jun 1 14:24:28 2017] sd 17:0:172:0: [sdfi] Attached SCSI disk
[Thu Jun 1 14:24:33 2017] sd 17:0:173:0: [sdfj] Attached SCSI disk
[Thu Jun 1 14:24:38 2017] sd 17:0:174:0: [sdfk] Attached SCSI disk
[Thu Jun 1 14:24:43 2017] sd 17:0:175:0: [sdfl] Attached SCSI disk
[Thu Jun 1 14:24:48 2017] sd 17:0:176:0: [sdfm] Attached SCSI disk
root@smb1p1:~#
root@smb1p1:~# date
Thu Jun 1 14:27:03 CDT 2017
root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
168
root@smb1p1:~#
Checkpoint #3:
=============
- After 34 minutes, the multipath -ll command shows devices with only a single path and no redundancy.
root@smb1p1:~# multipath -ll > multipath.log.06012017.afterReboot
root@smb1p1:~# cat multipath.log.06012017.afterReboot |more
35000c50086a3ca97 dm-161 IBM-ESXS,ST10000NM0226 E
size=9.0T features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
`- 17:0:170:0 sdfg 130:32 active ready running
35000c50086bae8bf dm-144 IBM-ESXS,ST10000NM0226 E
size=9.0T features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
`- 17:0:152:0 sdep 129:16 active ready running
35000c50086baa42f dm-143 IBM-ESXS,ST10000NM0226 E
size=9.0T features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
`- 17:0:151:0 sdeo 129:0 active ready running
...
Checkpoint #4:
===============
- After 43 minutes, the multipath -ll command shows some devices with only a single path and no redundancy, and others with multiple paths and redundancy.
root@smb1p1:~# date
Thu Jun 1 14:43:00 CDT 2017
root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
252
root@smb1p1:~#
Checkpoint #5:
==============
- After 47 minutes, the multipath -ll command still shows some devices with only a single path and no redundancy.
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | head
[Thu Jun 1 14:06:48 2017] sd 17:0:1:0: [sdb] Attached SCSI disk
[Thu Jun 1 14:06:51 2017] sd 17:0:2:0: [sdc] Attached SCSI disk
[Thu Jun 1 14:06:53 2017] sd 17:0:3:0: [sdd] Attached SCSI disk
[Thu Jun 1 14:06:57 2017] sd 17:0:4:0: [sde] Attached SCSI disk
[Thu Jun 1 14:07:00 2017] sd 17:0:5:0: [sdf] Attached SCSI disk
[Thu Jun 1 14:07:03 2017] sd 17:0:6:0: [sdg] Attached SCSI disk
[Thu Jun 1 14:07:05 2017] sd 17:0:7:0: [sdh] Attached SCSI disk
[Thu Jun 1 14:07:08 2017] sd 17:0:8:0: [sdi] Attached SCSI disk
[Thu Jun 1 14:07:11 2017] sd 17:0:9:0: [sdj] Attached SCSI disk
[Thu Jun 1 14:07:14 2017] sd 17:0:10:0: [sdk] Attached SCSI disk
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:46:15 2017] sd 18:0:112:0: [sdjo] Attached SCSI disk
[Thu Jun 1 14:46:20 2017] sd 18:0:113:0: [sdjp] Attached SCSI disk
[Thu Jun 1 14:46:25 2017] sd 18:0:114:0: [sdjq] Attached SCSI disk
[Thu Jun 1 14:46:31 2017] sd 18:0:115:0: [sdjr] Attached SCSI disk
[Thu Jun 1 14:46:36 2017] sd 18:0:116:0: [sdjs] Attached SCSI disk
[Thu Jun 1 14:46:41 2017] sd 18:0:117:0: [sdjt] Attached SCSI disk
[Thu Jun 1 14:46:46 2017] sd 18:0:118:0: [sdju] Attached SCSI disk
[Thu Jun 1 14:46:51 2017] sd 18:0:119:0: [sdjv] Attached SCSI disk
[Thu Jun 1 14:46:56 2017] sd 18:0:120:0: [sdjw] Attached SCSI disk
[Thu Jun 1 14:47:01 2017] sd 18:0:121:0: [sdjx] Attached SCSI disk
root@smb1p1:~#
root@smb1p1:~#
root@smb1p1:~# date
Thu Jun 1 14:47:20 CDT 2017
root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
288
root@smb1p1:~#
Checkpoint #6:
==============
- 51 minutes after system reboot, all disks appear to be discovered and the multipath devices are built correctly.
root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
336
root@smb1p1:~# date
Thu Jun 1 14:52:05 CDT 2017
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:50:47 2017] sd 18:0:167:0: [sdlp] Attached SCSI disk
[Thu Jun 1 14:50:52 2017] sd 18:0:168:0: [sdlq] Attached SCSI disk
[Thu Jun 1 14:50:57 2017] sd 18:0:169:0: [sdlr] Attached SCSI disk
[Thu Jun 1 14:51:02 2017] sd 18:0:170:0: [sdls] Attached SCSI disk
[Thu Jun 1 14:51:07 2017] sd 18:0:171:0: [sdlt] Attached SCSI disk
[Thu Jun 1 14:51:13 2017] sd 18:0:172:0: [sdlu] Attached SCSI disk
[Thu Jun 1 14:51:17 2017] sd 18:0:173:0: [sdlv] Attached SCSI disk
[Thu Jun 1 14:51:22 2017] sd 18:0:174:0: [sdlw] Attached SCSI disk
[Thu Jun 1 14:51:27 2017] sd 18:0:175:0: [sdlx] Attached SCSI disk
[Thu Jun 1 14:51:33 2017] sd 18:0:176:0: [sdly] Attached SCSI disk
root@smb1p1:~#
== Comment: #24 - Mauricio Faria De Oliveira - 2017-06-06 11:42:59 ==
Hi Paul,
Per your logs, yes, it's the slowness with the SES driver.
I'll ask Canonical to pick it up for 16.10 and 17.04 so it makes it into 16.04.2 and 16.04.3.
Thanks,
Mauricio
== Comment: #26 - Mauricio Faria De Oliveira <mauricfo@br.ibm.com> - 2017-06-06 12:06:32 ==
The patch applies cleanly in the master-next branch of ubuntu-zesty.git and ubuntu-yakkety.git.
Mirroring to Canonical to get a LP bug number, required in the submission process.
== Comment: #27 - Mauricio Faria De Oliveira <mauricfo@br.ibm.com> - 2017-06-06 12:07:58 ==
The commit is [1].
commit 75106523f39751390b5789b36ee1d213b3af1945
Author: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Date: Wed Apr 5 12:18:19 2017 -0300
scsi: ses: don't get power status of SES device slot on probe
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=75106523f39751390b5789b36ee1d213b3af1945 |
|
2017-06-07 19:03:13 |
Mauricio Faria de Oliveira |
description |
[Impact]
* The SES driver causes a long delay in disk discovery when
a large number of disks is present in the disk enclosure,
which increases with the number of disks attached.
* This delays the addition and visibility of the disk devices
to userspace, which among other things leaves multipath
devices without redundant paths until disk discovery
finally finishes.
* The fix significantly shortens the time taken by the SES
driver to handle disk discovery, causing no extra delays,
by removing a superfluous SCSI command sent to enclosure.
[Test Case]
* Load the module to access the enclosure and its disks; e.g.,
$ sudo modprobe mpt3sas
* Notice the interval between the discovery of each disk; e.g., dmesg
$ dmesg -T | grep 'Attached SCSI disk' | tail -n2
[Thu Jun 1 14:18:30 2017] sd 17:0:100:0: [sdcr] Attached SCSI disk
[Thu Jun 1 14:18:35 2017] sd 17:0:101:0: [sdcs] Attached SCSI disk
* With the fix, consecutive disks should be discovered within about the same second.
$ dmesg -T | grep 'Attached SCSI disk' | tail -n2
[Wed Jun 7 13:11:59 2017] sd 18:0:176:0: [sdly] Attached SCSI disk
[Wed Jun 7 13:11:59 2017] sd 18:0:175:0: [sdlx] Attached SCSI disk
[Regression Potential]
* The power status of the disks in the enclosure is no longer
checked during probe time. However, the patch demonstrates that
the initial value was never used in any way, so there is little
regression potential.
* Nonetheless, users of SES enclosures which verify the power status
of disks in the enclosure might _theoretically_ see a problem, iff
the fix has a problem (which has not been found yet).
[Other Info]
* None at this time.
==== State: Open by: nguyenp on 31 May 2017 15:46:14 ====
Product Name : OpenPOWER Firmware
Product Version : open-power-SMC-P8DTU-V2.00.GA2-20170126-prod
Product Extra : op-build-3782262
Product Extra : hostboot-7fdfb37
Product Extra : occ-e6e194f
Product Extra : skiboot-5.4.2
Product Extra : linux-4.4.24-openpower1-9641b3a
Product Extra : petitboot-v1.4.0-2f8598b
Product Extra : p8dtu-xml-9a8fee2
Cable configuration:
====================
On this P8-Briggs system, I have 2 Seagate storage enclosures running with max configuration. There are 84 HDDs in each enclosure, so the total is 168 HDDs across both.
I connected 2 LSI 9300-8e SAS adapters to the 2 Seagate enclosures with alternate cabling for redundancy. See the figure of the connections below:
Note: Each Seagate enclosure has 2 I/O modules connected in the back.
Both I/O modules of each Seagate see the same set of HDDs.
Cable connection:
SAS adapter #1: port1 -----> Seagate #1-A I/O module
port0 --------------------------------------> Seagate #2-B I/O module
SAS adapter #2: port1 ----> Seagate #2-A I/O module
port0 --------------------------------------> Seagate #1-B I/O module
Ubuntu 16.04.2:
===============
- Running with new kernel Ubuntu 4.8.0-520-generic #550~16.04.1+bz154734 from Mauricio Faria De Oliveira.
Problem Description:
====================
On this Briggs system, I'm running the new Ubuntu kernel 4.8.0-520-generic #550~16.04.1+bz154734, which has the fix for the multipath problem. Mauricio helped patch the system with this kernel last week to fix the multipath io_setup failure in LTCBug154734.
This week, I scaled up my test configuration to the max configuration, 2x5U84_Enclosures,_MaxCfg_168HDDs. This time, it hit a different issue: some multipath devices have only a single path and no redundancy, while others have multiple paths and redundancy.
== Comment: #13 - Paul Nguyen - 2017-06-01 15:19:58 ==
- I agree with Mauricio that this is a timing problem.
- I re-ran the test and noticed that it took more than 50 minutes after system reboot to discover all disks and build the multipath devices correctly.
- Taking this long is going to be a problem.
- I have gathered all logs and am attaching them to the bug for Mauricio to review and confirm.
- If there is a workaround or fix for faster probe time, I will try it out.
- Below is more information I captured:
Checkpoint #1:
==============
- system reboot around 2pm (14:00)
Checkpoint #2:
===============
- It took several minutes for the first disk to be detected.
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | head
[Thu Jun 1 14:06:48 2017] sd 17:0:1:0: [sdb] Attached SCSI disk
[Thu Jun 1 14:06:51 2017] sd 17:0:2:0: [sdc] Attached SCSI disk
[Thu Jun 1 14:06:53 2017] sd 17:0:3:0: [sdd] Attached SCSI disk
[Thu Jun 1 14:06:57 2017] sd 17:0:4:0: [sde] Attached SCSI disk
[Thu Jun 1 14:07:00 2017] sd 17:0:5:0: [sdf] Attached SCSI disk
[Thu Jun 1 14:07:03 2017] sd 17:0:6:0: [sdg] Attached SCSI disk
[Thu Jun 1 14:07:05 2017] sd 17:0:7:0: [sdh] Attached SCSI disk
[Thu Jun 1 14:07:08 2017] sd 17:0:8:0: [sdi] Attached SCSI disk
[Thu Jun 1 14:07:11 2017] sd 17:0:9:0: [sdj] Attached SCSI disk
[Thu Jun 1 14:07:14 2017] sd 17:0:10:0: [sdk] Attached SCSI disk
root@smb1p1:~#
...
root@smb1p1:~# multipath -ll|grep dm |wc -l
103
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:18:30 2017] sd 17:0:100:0: [sdcr] Attached SCSI disk
[Thu Jun 1 14:18:35 2017] sd 17:0:101:0: [sdcs] Attached SCSI disk
[Thu Jun 1 14:18:40 2017] sd 17:0:102:0: [sdct] Attached SCSI disk
[Thu Jun 1 14:18:44 2017] sd 17:0:103:0: [sdcu] Attached SCSI disk
[Thu Jun 1 14:18:54 2017] sd 17:0:105:0: [sdcv] Attached SCSI disk
[Thu Jun 1 14:18:59 2017] sd 17:0:106:0: [sdcw] Attached SCSI disk
[Thu Jun 1 14:19:04 2017] sd 17:0:107:0: [sdcx] Attached SCSI disk
[Thu Jun 1 14:19:09 2017] sd 17:0:108:0: [sdcy] Attached SCSI disk
[Thu Jun 1 14:19:14 2017] sd 17:0:109:0: [sdcz] Attached SCSI disk
[Thu Jun 1 14:19:19 2017] sd 17:0:110:0: [sdda] Attached SCSI disk
root@smb1p1:~#
...
root@smb1p1:~# multipath -ll|grep dm |wc -l
126
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:20:23 2017] sd 17:0:123:0: [sddn] Attached SCSI disk
[Thu Jun 1 14:20:28 2017] sd 17:0:124:0: [sddo] Attached SCSI disk
[Thu Jun 1 14:20:33 2017] sd 17:0:125:0: [sddp] Attached SCSI disk
[Thu Jun 1 14:20:38 2017] sd 17:0:126:0: [sddq] Attached SCSI disk
[Thu Jun 1 14:20:44 2017] sd 17:0:127:0: [sddr] Attached SCSI disk
[Thu Jun 1 14:20:48 2017] sd 17:0:128:0: [sdds] Attached SCSI disk
[Thu Jun 1 14:20:54 2017] sd 17:0:129:0: [sddt] Attached SCSI disk
[Thu Jun 1 14:20:59 2017] sd 17:0:130:0: [sddu] Attached SCSI disk
[Thu Jun 1 14:21:04 2017] sd 17:0:131:0: [sddv] Attached SCSI disk
[Thu Jun 1 14:21:09 2017] sd 17:0:132:0: [sddw] Attached SCSI disk
root@smb1p1:~#
...
root@smb1p1:~# multipath -ll|grep dm |wc -l
142
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:21:54 2017] sd 17:0:141:0: [sdee] Attached SCSI disk
[Thu Jun 1 14:21:58 2017] sd 17:0:142:0: [sdef] Attached SCSI disk
[Thu Jun 1 14:22:04 2017] sd 17:0:143:0: [sdeg] Attached SCSI disk
[Thu Jun 1 14:22:08 2017] sd 17:0:144:0: [sdeh] Attached SCSI disk
[Thu Jun 1 14:22:14 2017] sd 17:0:145:0: [sdei] Attached SCSI disk
[Thu Jun 1 14:22:18 2017] sd 17:0:146:0: [sdej] Attached SCSI disk
[Thu Jun 1 14:22:24 2017] sd 17:0:147:0: [sdek] Attached SCSI disk
[Thu Jun 1 14:22:29 2017] sd 17:0:148:0: [sdel] Attached SCSI disk
[Thu Jun 1 14:22:34 2017] sd 17:0:149:0: [sdem] Attached SCSI disk
[Thu Jun 1 14:22:39 2017] sd 17:0:150:0: [sden] Attached SCSI disk
root@smb1p1:~#
...
root@smb1p1:~# multipath -ll|grep dm |wc -l
163
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:23:48 2017] sd 17:0:164:0: [sdfa] Attached SCSI disk
[Thu Jun 1 14:23:53 2017] sd 17:0:165:0: [sdfb] Attached SCSI disk
[Thu Jun 1 14:23:58 2017] sd 17:0:166:0: [sdfc] Attached SCSI disk
[Thu Jun 1 14:24:03 2017] sd 17:0:167:0: [sdfd] Attached SCSI disk
[Thu Jun 1 14:24:08 2017] sd 17:0:168:0: [sdfe] Attached SCSI disk
[Thu Jun 1 14:24:13 2017] sd 17:0:169:0: [sdff] Attached SCSI disk
[Thu Jun 1 14:24:19 2017] sd 17:0:170:0: [sdfg] Attached SCSI disk
[Thu Jun 1 14:24:23 2017] sd 17:0:171:0: [sdfh] Attached SCSI disk
[Thu Jun 1 14:24:28 2017] sd 17:0:172:0: [sdfi] Attached SCSI disk
[Thu Jun 1 14:24:33 2017] sd 17:0:173:0: [sdfj] Attached SCSI disk
...
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:24:03 2017] sd 17:0:167:0: [sdfd] Attached SCSI disk
[Thu Jun 1 14:24:08 2017] sd 17:0:168:0: [sdfe] Attached SCSI disk
[Thu Jun 1 14:24:13 2017] sd 17:0:169:0: [sdff] Attached SCSI disk
[Thu Jun 1 14:24:19 2017] sd 17:0:170:0: [sdfg] Attached SCSI disk
[Thu Jun 1 14:24:23 2017] sd 17:0:171:0: [sdfh] Attached SCSI disk
[Thu Jun 1 14:24:28 2017] sd 17:0:172:0: [sdfi] Attached SCSI disk
[Thu Jun 1 14:24:33 2017] sd 17:0:173:0: [sdfj] Attached SCSI disk
[Thu Jun 1 14:24:38 2017] sd 17:0:174:0: [sdfk] Attached SCSI disk
[Thu Jun 1 14:24:43 2017] sd 17:0:175:0: [sdfl] Attached SCSI disk
[Thu Jun 1 14:24:48 2017] sd 17:0:176:0: [sdfm] Attached SCSI disk
root@smb1p1:~#
root@smb1p1:~# date
Thu Jun 1 14:27:03 CDT 2017
root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
168
root@smb1p1:~#
Checkpoint #3:
=============
- After 34 minutes, the multipath -ll command shows multipath devices with only a single path and no redundancy.
root@smb1p1:~# multipath -ll > multipath.log.06012017.afterReboot
root@smb1p1:~# cat multipath.log.06012017.afterReboot |more
35000c50086a3ca97 dm-161 IBM-ESXS,ST10000NM0226 E
size=9.0T features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
`- 17:0:170:0 sdfg 130:32 active ready running
35000c50086bae8bf dm-144 IBM-ESXS,ST10000NM0226 E
size=9.0T features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
`- 17:0:152:0 sdep 129:16 active ready running
35000c50086baa42f dm-143 IBM-ESXS,ST10000NM0226 E
size=9.0T features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
`- 17:0:151:0 sdeo 129:0 active ready running
...
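The single-path check above can also be done mechanically instead of paging through the saved log. The following is a minimal sketch, assuming the `multipath -ll` layout shown in this report (a header line containing the `dm-N` name, followed by path lines containing `H:C:T:L sdX major:minor`). The function names and the second, two-path map in the sample are hypothetical and only there to show the contrast.

```python
import re

# Header line of a multipath map, e.g.
#   35000c50086a3ca97 dm-161 IBM-ESXS,ST10000NM0226 E
# An optional "(wwid)" after an alias is tolerated as well.
MAP_RE = re.compile(r"^(\S+)(?:\s+\(\S+\))?\s+(dm-\d+)\s")
# Path line, e.g.
#   `- 17:0:170:0 sdfg 130:32 active ready running
PATH_RE = re.compile(r"\d+:\d+:\d+:\d+\s+(sd[a-z]+)\s+\d+:\d+\s")


def paths_per_map(text):
    """Return {map name: [sd devices]} parsed from `multipath -ll` output."""
    maps = {}
    current = None
    for line in text.splitlines():
        header = MAP_RE.match(line)
        if header:
            current = header.group(1)
            maps[current] = []
            continue
        path = PATH_RE.search(line)
        if path and current is not None:
            maps[current].append(path.group(1))
    return maps


def single_path_maps(maps):
    """Names of maps that currently have exactly one path (no redundancy)."""
    return sorted(name for name, devs in maps.items() if len(devs) == 1)


# First map is copied from the log above; the second map is a made-up
# example of a map that already has both paths.
SAMPLE = """\
35000c50086a3ca97 dm-161 IBM-ESXS,ST10000NM0226 E
size=9.0T features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
`- 17:0:170:0 sdfg 130:32 active ready running
example-two-path-map dm-1 IBM-ESXS,ST10000NM0226 E
size=9.0T features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
|- 17:0:1:0 sdb 8:16 active ready running
`- 18:0:1:0 sdfn 130:144 active ready running
"""

if __name__ == "__main__":
    maps = paths_per_map(SAMPLE)
    print("single-path maps:", single_path_maps(maps))
```

To run it against a real capture, the contents of a file such as multipath.log.06012017.afterReboot could be passed to `paths_per_map()` in place of `SAMPLE`.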
Checkpoint #4:
==============
- After 43 minutes, the multipath -ll command shows some multipath devices with only a single path and no redundancy, and others with multiple paths and redundancy.
root@smb1p1:~# date
Thu Jun 1 14:43:00 CDT 2017
root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
252
root@smb1p1:~#
Checkpoint #5:
==============
- After 47 minutes, the multipath -ll command still shows some multipath devices with only a single path and no redundancy.
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | head
[Thu Jun 1 14:06:48 2017] sd 17:0:1:0: [sdb] Attached SCSI disk
[Thu Jun 1 14:06:51 2017] sd 17:0:2:0: [sdc] Attached SCSI disk
[Thu Jun 1 14:06:53 2017] sd 17:0:3:0: [sdd] Attached SCSI disk
[Thu Jun 1 14:06:57 2017] sd 17:0:4:0: [sde] Attached SCSI disk
[Thu Jun 1 14:07:00 2017] sd 17:0:5:0: [sdf] Attached SCSI disk
[Thu Jun 1 14:07:03 2017] sd 17:0:6:0: [sdg] Attached SCSI disk
[Thu Jun 1 14:07:05 2017] sd 17:0:7:0: [sdh] Attached SCSI disk
[Thu Jun 1 14:07:08 2017] sd 17:0:8:0: [sdi] Attached SCSI disk
[Thu Jun 1 14:07:11 2017] sd 17:0:9:0: [sdj] Attached SCSI disk
[Thu Jun 1 14:07:14 2017] sd 17:0:10:0: [sdk] Attached SCSI disk
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:46:15 2017] sd 18:0:112:0: [sdjo] Attached SCSI disk
[Thu Jun 1 14:46:20 2017] sd 18:0:113:0: [sdjp] Attached SCSI disk
[Thu Jun 1 14:46:25 2017] sd 18:0:114:0: [sdjq] Attached SCSI disk
[Thu Jun 1 14:46:31 2017] sd 18:0:115:0: [sdjr] Attached SCSI disk
[Thu Jun 1 14:46:36 2017] sd 18:0:116:0: [sdjs] Attached SCSI disk
[Thu Jun 1 14:46:41 2017] sd 18:0:117:0: [sdjt] Attached SCSI disk
[Thu Jun 1 14:46:46 2017] sd 18:0:118:0: [sdju] Attached SCSI disk
[Thu Jun 1 14:46:51 2017] sd 18:0:119:0: [sdjv] Attached SCSI disk
[Thu Jun 1 14:46:56 2017] sd 18:0:120:0: [sdjw] Attached SCSI disk
[Thu Jun 1 14:47:01 2017] sd 18:0:121:0: [sdjx] Attached SCSI disk
root@smb1p1:~#
root@smb1p1:~#
root@smb1p1:~# date
Thu Jun 1 14:47:20 CDT 2017
root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
288
root@smb1p1:~#
Checkpoint #6:
==============
- 51 minutes after system reboot, it looks like all disks are discovered and multipath is correctly built.
root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
336
root@smb1p1:~# date
Thu Jun 1 14:52:05 CDT 2017
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:50:47 2017] sd 18:0:167:0: [sdlp] Attached SCSI disk
[Thu Jun 1 14:50:52 2017] sd 18:0:168:0: [sdlq] Attached SCSI disk
[Thu Jun 1 14:50:57 2017] sd 18:0:169:0: [sdlr] Attached SCSI disk
[Thu Jun 1 14:51:02 2017] sd 18:0:170:0: [sdls] Attached SCSI disk
[Thu Jun 1 14:51:07 2017] sd 18:0:171:0: [sdlt] Attached SCSI disk
[Thu Jun 1 14:51:13 2017] sd 18:0:172:0: [sdlu] Attached SCSI disk
[Thu Jun 1 14:51:17 2017] sd 18:0:173:0: [sdlv] Attached SCSI disk
[Thu Jun 1 14:51:22 2017] sd 18:0:174:0: [sdlw] Attached SCSI disk
[Thu Jun 1 14:51:27 2017] sd 18:0:175:0: [sdlx] Attached SCSI disk
[Thu Jun 1 14:51:33 2017] sd 18:0:176:0: [sdly] Attached SCSI disk
root@smb1p1:~#
== Comment: #24 - Mauricio Faria De Oliveira - 2017-06-06 11:42:59 ==
Hi Paul,
Per your logs, yes, it's the slowness with the SES driver.
I'll ask Canonical to pick it up for 16.10 and 17.04 so it makes it into 16.04.2 and 16.04.3.
Thanks,
Mauricio
== Comment: #26 - Mauricio Faria De Oliveira <mauricfo@br.ibm.com> - 2017-06-06 12:06:32 ==
The patch applies cleanly in the master-next branch of ubuntu-zesty.git and ubuntu-yakkety.git.
Mirroring to Canonical to get a LP bug number, required in the submission process.
== Comment: #27 - Mauricio Faria De Oliveira <mauricfo@br.ibm.com> - 2017-06-06 12:07:58 ==
The commit is [1].
commit 75106523f39751390b5789b36ee1d213b3af1945
Author: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Date: Wed Apr 5 12:18:19 2017 -0300
scsi: ses: don't get power status of SES device slot on probe
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=75106523f39751390b5789b36ee1d213b3af1945 |
[Impact]
* The SES driver causes a long delay in disk discovery when
a large number of disks is present in the disk enclosure,
which increases with the number of disks attached.
* This delays the addition and visibility of the disk devices
to userspace, which among other things leaves multipath
devices without redundant paths until the disk discovery
finally finishes.
* The fix significantly shortens the time taken by the SES
driver to handle disk discovery, causing no extra delays,
by removing a superfluous SCSI command sent to the enclosure.
[Test Case]
* Load the module to access the enclosure and its disks; e.g.,
$ sudo modprobe mpt3sas
* Notice the interval between the discovery of each disk in dmesg; e.g.,
$ dmesg -T | grep 'Attached SCSI disk' | tail -n2
[Thu Jun 1 14:18:30 2017] sd 17:0:100:0: [sdcr] Attached SCSI disk
[Thu Jun 1 14:18:35 2017] sd 17:0:101:0: [sdcs] Attached SCSI disk
* With the fix, consecutive disks should be discovered within about a second of each other.
$ dmesg -T | grep 'Attached SCSI disk' | tail -n2
[Wed Jun 7 13:11:59 2017] sd 18:0:176:0: [sdly] Attached SCSI disk
[Wed Jun 7 13:11:59 2017] sd 18:0:175:0: [sdlx] Attached SCSI disk
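The interval check in the test case can be scripted rather than read off by eye. The following is a rough sketch, assuming the `dmesg -T` line format shown above and an English (C) locale for month names. The function names are illustrative, and the sample lines are taken from the pre-fix log in this report.

```python
import re
from datetime import datetime

# Matches lines such as:
#   [Thu Jun 1 14:18:30 2017] sd 17:0:100:0: [sdcr] Attached SCSI disk
# The weekday is skipped; the rest of the timestamp is captured.
LINE_RE = re.compile(
    r"^\[\w{3}\s+(\w{3}\s+\d+\s+[\d:]{8}\s+\d{4})\]\s+"
    r"sd\s+(\S+):\s+\[(\w+)\]\s+Attached SCSI disk"
)


def parse_attach_events(lines):
    """Return (timestamp, H:C:T:L, device) for each 'Attached SCSI disk' line."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        # Collapse repeated spaces (dmesg pads single-digit days).
        stamp = " ".join(m.group(1).split())
        # %b relies on English month abbreviations (assumption: C locale).
        when = datetime.strptime(stamp, "%b %d %H:%M:%S %Y")
        events.append((when, m.group(2), m.group(3)))
    return events


def attach_intervals(events):
    """Seconds between consecutive disk attach events, in time order."""
    ordered = sorted(events, key=lambda e: e[0])
    return [
        int((later[0] - earlier[0]).total_seconds())
        for earlier, later in zip(ordered, ordered[1:])
    ]


SAMPLE = """\
[Thu Jun 1 14:18:30 2017] sd 17:0:100:0: [sdcr] Attached SCSI disk
[Thu Jun 1 14:18:35 2017] sd 17:0:101:0: [sdcs] Attached SCSI disk
[Thu Jun 1 14:18:40 2017] sd 17:0:102:0: [sdct] Attached SCSI disk
[Thu Jun 1 14:18:44 2017] sd 17:0:103:0: [sdcu] Attached SCSI disk
[Thu Jun 1 14:18:54 2017] sd 17:0:105:0: [sdcv] Attached SCSI disk
"""

if __name__ == "__main__":
    gaps = attach_intervals(parse_attach_events(SAMPLE.splitlines()))
    print("intervals (s):", gaps, "max:", max(gaps))
```

Events are sorted by timestamp before differencing because, with the fix applied, disks may be reported out of target order within the same second, as in the post-fix output above.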
[Regression Potential]
* The power status of the disks in the enclosure is no longer
checked during probe time. However, the patch demonstrates that
the initial value was never used in any way. So, little regression
potential.
* Nonetheless, users of SES enclosures which verify the power status
of disks in the enclosure might _theoretically_ see a problem, iff
the fix has a problem (which has not been found yet).
[Other Info]
* None at this time.
Problem Description:
====================
This week, I went ahead and scaled up my test configuration to max configuration 2x5U84_Enclosures,_MaxCfg_168HDDs. This time, it hit a different issue. The issue is that some multipaths only have a single path and no redundancy. Others have multiple paths and redundancy.
Checkpoint #1:
==============
- system reboot around 2pm (14:00)
Checkpoint #2:
==============
- It took several minutes for the first disk to be detected.
root@smb1p1:~# multipath -ll|grep dm |wc -l
103
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:18:30 2017] sd 17:0:100:0: [sdcr] Attached SCSI disk
[Thu Jun 1 14:18:35 2017] sd 17:0:101:0: [sdcs] Attached SCSI disk
[Thu Jun 1 14:18:40 2017] sd 17:0:102:0: [sdct] Attached SCSI disk
[Thu Jun 1 14:18:44 2017] sd 17:0:103:0: [sdcu] Attached SCSI disk
[Thu Jun 1 14:18:54 2017] sd 17:0:105:0: [sdcv] Attached SCSI disk
[Thu Jun 1 14:18:59 2017] sd 17:0:106:0: [sdcw] Attached SCSI disk
[Thu Jun 1 14:19:04 2017] sd 17:0:107:0: [sdcx] Attached SCSI disk
[Thu Jun 1 14:19:09 2017] sd 17:0:108:0: [sdcy] Attached SCSI disk
[Thu Jun 1 14:19:14 2017] sd 17:0:109:0: [sdcz] Attached SCSI disk
[Thu Jun 1 14:19:19 2017] sd 17:0:110:0: [sdda] Attached SCSI disk
root@smb1p1:~#
...
root@smb1p1:~# multipath -ll|grep dm |wc -l
142
root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
[Thu Jun 1 14:21:54 2017] sd 17:0:141:0: [sdee] Attached SCSI disk
[Thu Jun 1 14:21:58 2017] sd 17:0:142:0: [sdef] Attached SCSI disk
[Thu Jun 1 14:22:04 2017] sd 17:0:143:0: [sdeg] Attached SCSI disk
[Thu Jun 1 14:22:08 2017] sd 17:0:144:0: [sdeh] Attached SCSI disk
[Thu Jun 1 14:22:14 2017] sd 17:0:145:0: [sdei] Attached SCSI disk
[Thu Jun 1 14:22:18 2017] sd 17:0:146:0: [sdej] Attached SCSI disk
[Thu Jun 1 14:22:24 2017] sd 17:0:147:0: [sdek] Attached SCSI disk
[Thu Jun 1 14:22:29 2017] sd 17:0:148:0: [sdel] Attached SCSI disk
[Thu Jun 1 14:22:34 2017] sd 17:0:149:0: [sdem] Attached SCSI disk
[Thu Jun 1 14:22:39 2017] sd 17:0:150:0: [sden] Attached SCSI disk
root@smb1p1:~#
...
- After 43 minutes, the multipath -ll command shows some multipath devices with only a single path and no redundancy, and others with multiple paths and redundancy.
root@smb1p1:~# date
Thu Jun 1 14:43:00 CDT 2017
root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
252
root@smb1p1:~#
...
- After 47 minutes, the multipath -ll command still shows some multipath devices with only a single path and no redundancy.
root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
288
root@smb1p1:~#
- 51 minutes after system reboot, it looks like all disks are discovered and multipath is correctly built.
root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
336
== Comment: #24 - Mauricio Faria De Oliveira - 2017-06-06 11:42:59 ==
Hi Paul,
Per your logs, yes, it's the slowness with the SES driver.
I'll ask Canonical to pick it up for 16.10 and 17.04 so it makes it into 16.04.2 and 16.04.3.
Thanks,
Mauricio
== Comment: #26 - Mauricio Faria De Oliveira <mauricfo@br.ibm.com> - 2017-06-06 12:06:32 ==
The patch applies cleanly in the master-next branch of ubuntu-zesty.git and ubuntu-yakkety.git.
Mirroring to Canonical to get a LP bug number, required in the submission process.
== Comment: #27 - Mauricio Faria De Oliveira <mauricfo@br.ibm.com> - 2017-06-06 12:07:58 ==
The commit is [1].
commit 75106523f39751390b5789b36ee1d213b3af1945
Author: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Date: Wed Apr 5 12:18:19 2017 -0300
scsi: ses: don't get power status of SES device slot on probe
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=75106523f39751390b5789b36ee1d213b3af1945 |
|
2017-06-07 20:22:20 |
Joseph Salisbury |
tags |
architecture-ppc64le bugnameltc-155269 severity-high targetmilestone-inin16043 |
architecture-ppc64le bugnameltc-155269 kernel-da-key severity-high targetmilestone-inin16043 |
|
2017-06-08 06:11:57 |
Frank Heimes |
bug task added |
|
ubuntu-power-systems |
|
2017-06-08 06:12:31 |
Frank Heimes |
ubuntu-power-systems: assignee |
|
Canonical Kernel Team (canonical-kernel-team) |
|
2017-06-08 08:07:50 |
Stefan Bader |
nominated for series |
|
Ubuntu Yakkety |
|
2017-06-08 08:07:50 |
Stefan Bader |
bug task added |
|
linux (Ubuntu Yakkety) |
|
2017-06-08 08:07:50 |
Stefan Bader |
nominated for series |
|
Ubuntu Zesty |
|
2017-06-08 08:07:50 |
Stefan Bader |
bug task added |
|
linux (Ubuntu Zesty) |
|
2017-06-10 00:30:18 |
bugproxy |
attachment added |
|
Attach multipath log file that has several failed paths with no redundancy. https://bugs.launchpad.net/bugs/1696445/+attachment/4893337/+files/multipath.log.05312017 |
|
2017-06-21 09:19:16 |
Stefan Bader |
linux (Ubuntu Zesty): status |
New |
Fix Committed |
|
2017-06-21 09:19:22 |
Stefan Bader |
linux (Ubuntu Yakkety): status |
New |
Fix Committed |
|
2017-06-21 09:19:27 |
Stefan Bader |
linux (Ubuntu Zesty): importance |
Undecided |
Medium |
|
2017-06-21 09:19:30 |
Stefan Bader |
linux (Ubuntu Yakkety): importance |
Undecided |
Medium |
|
2017-06-21 09:44:05 |
Frank Heimes |
ubuntu-power-systems: status |
New |
Fix Committed |
|
2017-07-10 08:22:31 |
Kleber Sacilotto de Souza |
tags |
architecture-ppc64le bugnameltc-155269 kernel-da-key severity-high targetmilestone-inin16043 |
architecture-ppc64le bugnameltc-155269 kernel-da-key severity-high targetmilestone-inin16043 verification-needed-yakkety |
|
2017-07-10 08:23:43 |
Kleber Sacilotto de Souza |
tags |
architecture-ppc64le bugnameltc-155269 kernel-da-key severity-high targetmilestone-inin16043 verification-needed-yakkety |
architecture-ppc64le bugnameltc-155269 kernel-da-key severity-high targetmilestone-inin16043 verification-needed-yakkety verification-needed-zesty |
|
2017-07-11 16:10:37 |
bugproxy |
tags |
architecture-ppc64le bugnameltc-155269 kernel-da-key severity-high targetmilestone-inin16043 verification-needed-yakkety verification-needed-zesty |
architecture-ppc64le bugnameltc-155269 kernel-da-key severity-high targetmilestone-inin16043 verification-done-yakkety verification-done-zesty |
|
2017-07-12 12:13:16 |
Launchpad Janitor |
linux (Ubuntu): status |
New |
Fix Released |
|
2017-07-17 11:38:54 |
Launchpad Janitor |
linux (Ubuntu Yakkety): status |
Fix Committed |
Fix Released |
|
2017-07-17 11:38:54 |
Launchpad Janitor |
cve linked |
|
2014-9900 |
|
2017-07-17 11:38:54 |
Launchpad Janitor |
cve linked |
|
2016-9755 |
|
2017-07-17 11:38:54 |
Launchpad Janitor |
cve linked |
|
2017-1000380 |
|
2017-07-17 11:38:54 |
Launchpad Janitor |
cve linked |
|
2017-5551 |
|
2017-07-17 11:38:54 |
Launchpad Janitor |
cve linked |
|
2017-5576 |
|
2017-07-17 11:38:54 |
Launchpad Janitor |
cve linked |
|
2017-7346 |
|
2017-07-17 11:38:54 |
Launchpad Janitor |
cve linked |
|
2017-7895 |
|
2017-07-17 11:38:54 |
Launchpad Janitor |
cve linked |
|
2017-8924 |
|
2017-07-17 11:38:54 |
Launchpad Janitor |
cve linked |
|
2017-8925 |
|
2017-07-17 11:38:54 |
Launchpad Janitor |
cve linked |
|
2017-9074 |
|
2017-07-17 11:38:54 |
Launchpad Janitor |
cve linked |
|
2017-9150 |
|
2017-07-17 11:38:54 |
Launchpad Janitor |
cve linked |
|
2017-9605 |
|
2017-07-17 11:38:54 |
Launchpad Janitor |
linux (Ubuntu Yakkety): status |
Fix Committed |
Fix Released |
|
2017-07-17 11:57:59 |
Launchpad Janitor |
linux (Ubuntu Zesty): status |
Fix Committed |
Fix Released |
|
2017-07-17 13:27:31 |
Frank Heimes |
ubuntu-power-systems: status |
Fix Committed |
Fix Released |
|
2017-07-21 15:06:50 |
Stefan Bader |
nominated for series |
|
Ubuntu Xenial |
|
2017-07-21 15:06:50 |
Stefan Bader |
bug task added |
|
linux (Ubuntu Xenial) |
|
2017-07-21 17:43:06 |
Joseph Salisbury |
linux (Ubuntu Xenial): importance |
Undecided |
Medium |
|
2017-07-21 17:43:09 |
Joseph Salisbury |
linux (Ubuntu Xenial): status |
New |
Triaged |
|
2017-08-07 11:42:36 |
Kleber Sacilotto de Souza |
linux (Ubuntu Xenial): status |
Triaged |
Fix Committed |
|
2017-08-16 16:33:42 |
Kleber Sacilotto de Souza |
tags |
architecture-ppc64le bugnameltc-155269 kernel-da-key severity-high targetmilestone-inin16043 verification-done-yakkety verification-done-zesty |
architecture-ppc64le bugnameltc-155269 kernel-da-key severity-high targetmilestone-inin16043 verification-done-yakkety verification-done-zesty verification-needed-xenial |
|
2017-08-28 10:14:02 |
Launchpad Janitor |
linux (Ubuntu Xenial): status |
Fix Committed |
Fix Released |
|
2017-08-28 10:14:02 |
Launchpad Janitor |
cve linked |
|
2015-7837 |
|
2017-08-28 10:14:02 |
Launchpad Janitor |
cve linked |
|
2017-1000111 |
|
2017-08-28 10:14:02 |
Launchpad Janitor |
cve linked |
|
2017-1000112 |
|
2017-08-28 10:14:02 |
Launchpad Janitor |
cve linked |
|
2017-7495 |
|