Slow formatting on SSDs in mdadm RAID10 with LVM and XFS

Bug #1882979 reported by David Andruczyk
Affects  Status      Importance  Assigned to  Milestone
MAAS     Incomplete  Undecided   Unassigned
curtin   New         Undecided   Unassigned

Bug Description

MAAS: 2.6.2-7841-ga10625be3-0ubuntu1~18.04.1

Building a machine with a RAID10 array of 12 4T SSDs, partitioned into two PVs for two separate LVM volume groups (750G and the remainder). The XFS formatting step blocks:

Running command ['mkfs.xfs', '-f', '-L', '', '-m', 'uuid=<uuid>', '/dev/<vg-name>/<lv-name>']

INFO: task md2_resync:12438 blocked for more than 120 seconds.
      Tainted: P O 5.4.0-37-generic #41-Ubuntu
....
INFO: task mkfs.xfs:13764 blocked for more than 120 seconds.
      Tainted: P O 5.4.0-37-generic #41-Ubuntu
....
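
If, as the iostat data below suggests, the stall is the discard pass that mkfs.xfs performs by default, a possible manual workaround (untested here; -K is a standard xfsprogs flag, and the device path is the same placeholder as above) would be to skip the discard:

$ sudo mkfs.xfs -f -K /dev/<vg-name>/<lv-name>   # -K: do not attempt to discard blocks at mkfs time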

Logging into the deploying instance shows that the mdadm array is resyncing at an abysmally slow speed of 5K/sec (not a typo):
$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : active raid1 sdn3[1] sdm2[0]
      5848064 blocks super 1.2 [2/2] [UU]
       resync=DELAYED

md0 : active raid1 sdn2[1] sdm1[0]
      228436992 blocks super 1.2 [2/2] [UU]
      [=========>...........] resync = 48.9% (111858496/228436992) finish=14.0min speed=138180K/sec
      bitmap: 2/2 pages [8KB], 65536KB chunk

md2 : active raid10 sdl2[11] sdk2[10] sdj2[9] sdi2[8] sdh2[7] sdg2[6] sdf2[5] sde2[4] sdd2[3] sdc2[2] sdb2[1] sda2[0]
      22503598080 blocks super 1.2 512K chunks 2 near-copies [12/12] [UUUUUUUUUUUU]
      [>....................] resync = 0.0% (6832000/22503598080) finish=65615567.6min speed=5K/sec
      bitmap: 168/168 pages [672KB], 65536KB chunk

unused devices: <none>
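
The stalled resync on md2 can also be confirmed from mdadm itself, e.g.:

$ sudo mdadm --detail /dev/md2   # shows array state and resync/rebuild progress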

Installing iostat (sudo apt install sysstat) shows a high discard rate on the drives backing md2:
avg-cpu: %user %nice %system %iowait %steal %idle
           0.04 0.00 0.43 2.90 0.00 96.62

Device tps kB_read/s kB_wrtn/s kB_dscd/s kB_read kB_wrtn kB_dscd
dm-0 0.00 0.00 0.00 0.00 0 0 0
dm-1 0.00 0.00 0.00 0.00 0 0 0
loop0 0.00 0.00 0.00 0.00 0 0 0
loop1 0.00 0.00 0.00 0.00 0 0 0
loop2 0.00 0.00 0.00 0.00 0 0 0
loop3 0.00 0.00 0.00 0.00 0 0 0
md0 0.00 0.00 0.00 0.00 0 0 0
md1 0.00 0.00 0.00 0.00 0 0 0
md2 0.00 0.00 0.00 0.00 0 0 0
sda 419.60 0.00 0.00 107417.60 0 0 537088
sdb 418.80 0.00 0.00 107212.80 0 0 536064
sdc 418.80 0.00 0.00 107212.80 0 0 536064
sdd 420.00 0.00 0.00 107520.00 0 0 537600
sde 420.80 0.00 0.00 107724.80 0 0 538624
sdf 419.60 0.00 0.00 107417.60 0 0 537088
sdg 419.60 0.00 0.00 107417.60 0 0 537088
sdh 419.60 0.00 0.00 107417.60 0 0 537088
sdi 419.20 0.00 0.00 107315.20 0 0 536576
sdj 420.00 0.00 0.00 107520.00 0 0 537600
sdk 419.00 0.00 0.00 107310.40 0 0 536552
sdl 419.60 0.00 0.00 107417.60 0 0 537088
sdm 248.40 123276.80 0.30 0.00 616384 1 0
sdn 248.40 0.00 123123.50 0.00 0 615617 0
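
The figures above come from an interval run of iostat; a sketch of the invocation, assuming a sysstat build recent enough to report the kB_dscd discard columns:

$ iostat -d -k 5   # per-device stats in kB, refreshed every 5 seconds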

The MAAS GUI will eventually time out and report "deployment failed"; however, the deploy WILL COMPLETE EVENTUALLY if left alone, usually taking as long as 2-3 HOURS. When the host reboots into the target OS (xenial), the RAID10 rebuild speed returns to normal (throttled at 200MB/sec by /proc/sys/dev/raid/speed_limit_max). If that value is increased, the array rebuilds at up to 1.3GB/sec, which is likely the bandwidth limit of the backplane-to-board connections.
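
For reference, the md resync throttles live under the dev.raid sysctl tree; a minimal sketch of checking and raising them (values are in KB/sec, and 1300000 is illustrative, matching the ~1.3GB/sec observed above):

$ sysctl dev.raid.speed_limit_min dev.raid.speed_limit_max
$ sudo sysctl -w dev.raid.speed_limit_max=1300000   # raise the cap from the 200000 KB/sec default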

This is on a host where I had issued a full blkdiscard of all SSDs from an ephemeral environment to provide "clean slates", which should give the fastest possible performance from the SSDs. If the drives had retained a configuration from a previous deployment, the install wouldn't even get this far; it would block on trying to remove the old setup (a separate bug has already been filed for this: https://bugs.launchpad.net/maas/+bug/1882964).
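
The wipe was along these lines (a sketch; the sd{a..n} range is an assumption matching the devices visible in the mdstat output above):

$ for d in /dev/sd{a..n}; do sudo blkdiscard "$d"; done   # full discard of every drive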

Revision history for this message
Alberto Donato (ack) wrote :

Adding curtin, as I'm not sure whether there's something MAAS could configure differently for this case.

Revision history for this message
Alberto Donato (ack) wrote :

Also, could you please attach the curtin config used for the deployment?
It's left in the /root dir during deployment.
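
For example (the filename is an assumption based on the usual MAAS layout, worth confirming):

$ sudo ls -l /root/
$ sudo cat /root/curtin-install-cfg.yaml   # assumed name; attach whatever config file is present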

Changed in maas:
status: New → Incomplete