[Ubuntu 22.04] Allow ESP installation on md-disks in subiquity

Bug #1961079 reported by Sujith Pandel
This bug affects 1 person
Affects             Status       Importance  Assigned to  Milestone
grub2 (Ubuntu)      Incomplete   Undecided   Unassigned
  Focal             New          Undecided   Unassigned
  Jammy             Incomplete   Undecided   Unassigned
subiquity (Ubuntu)  In Progress  Undecided   Dan Bungert
  Focal             Invalid      Undecided   Unassigned
  Jammy             Triaged      Undecided   Dan Bungert

Bug Description

The subiquity installer currently does not allow selecting an md-disk as the installation destination where /, /home and /boot/efi could be created and installed.
The request is to enable this feature so that we can install to and boot from the md-disk that our BIOS creates.

Revision history for this message
Michael Reed (mreed8855) wrote :

This bug is an updated version of an older issue. https://bugs.launchpad.net/bugs/1466150

Revision history for this message
Dan Bungert (dbungert) wrote :

So I see the linked bug about raid.

Can you clarify for me what md-disk refers to in this context? Search results are pointing me to MiniDisk and Markdown and those are obviously not correct.

Changed in grub2 (Ubuntu Jammy):
status: New → Incomplete
Revision history for this message
Sujith Pandel (sujithpandel) wrote :

Our BIOS uses mdadm and creates md-raid disks (RAID-1) with metadata 1.2 through Dell SW-RAID solutions such as the S140/S150.
We support installing to and booting from these md-raid disks.

https://raid.wiki.kernel.org/index.php/A_guide_to_mdadm
https://dl.dell.com/topicspdf/s150-users-guide_en-us.pdf

Revision history for this message
Sujith Pandel (sujithpandel) wrote :

Steps:
1. Set up a DellEMC 14G PE server with 2 onboard SATA disks.
2. Set up an md-raid (RAID1) on these SATA disks through the BIOS S140 controller in UEFI mode.
3. Start installing Ubuntu 22.04.
4. Observe that the user cannot select the md-raid disk as the boot disk. Installation on the md-raid disk cannot proceed.

Revision history for this message
Sujith Pandel (sujithpandel) wrote :

Description of the feature request:

The DellEMC PowerEdge RAID Controller (PERC) S130/S140 is a RAID solution for DellEMC PowerEdge systems, configured through the chipset SATA controller. It does not use any dedicated hardware for RAID operations and is set up by the system BIOS. Using the BIOS/UEFI configuration utility, the user can configure RAID on onboard SATA/NVMe drives, then install an OS on it and boot from it.

To make use of the S130/S140 solution, the user needs to change the SATA mode to RAID in the BIOS/UEFI configuration.
During S130/S140 configuration, the VDs (virtual disks) are created by the system BIOS, which writes metadata to the selected onboard SATA or NVMe drives. The metadata format used is MD version 1.2. The resulting VD is a Linux MDRAID VD.

This is the feature that we are requesting: install and boot support for Ubuntu 22.04 and future releases of Ubuntu 20.04.

Revision history for this message
Julian Andres Klode (juliank) wrote :

This is very unfortunate. So the UEFI system is able to read mdraid 1.2 too, and can access ESPs inside of it?

Most systems do not implement Linux's mdraid in their firmware, and hence putting the ESP inside the RAID would lead to unbootable systems. That is why we designed the solution where you can place an ESP on each disk outside the raid and then raid the partitions instead.

There are some caveats, in that systems can accidentally read ESPs in an mdraid 1.0 array because the metadata is only at the end, but in general exposing that would be a bad idea.

So I guess one could implement a quirk to allow you to place ESPs in there on this specific hardware, but creating such RAIDs and placing ESPs in them in general should not be possible.
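
For illustration, the difference between the metadata versions discussed here comes down to where the md superblock sits on each member device: 1.0 keeps it near the end, 1.1 at the very start, and 1.2 at 4 KiB from the start. The sketch below is a minimal example and not code from mdadm or subiquity. The function name `find_md_v1_superblock` is hypothetical, and it only checks the magic number and major version at those three locations in a raw image.

```python
import struct
from typing import Optional

MD_SB_MAGIC = 0xA92B4EFC  # md superblock magic, stored little-endian on disk


def find_md_v1_superblock(image: bytes) -> Optional[str]:
    """Return "1.0", "1.1" or "1.2" if a v1 md superblock is found, else None.

    Hypothetical helper for illustration only; it inspects a raw byte image
    of a member device rather than opening a block device.
    """
    size = len(image)
    # 1.0 sits at least 8 KiB before the end, rounded down to a 4 KiB boundary.
    end_offset = ((size - 8192) // 4096) * 4096 if size >= 8192 else -1
    candidates = [("1.1", 0), ("1.2", 4096), ("1.0", end_offset)]
    for version, offset in candidates:
        if offset < 0 or offset + 8 > size:
            continue
        # The superblock begins with two little-endian u32s: magic, major_version.
        magic, major = struct.unpack_from("<II", image, offset)
        if magic == MD_SB_MAGIC and major == 1:
            return version
    return None
```

With 1.0 metadata the start of a member device looks like an ordinary disk, which is why firmware can sometimes read an ESP there by accident. With 1.2 the superblock comes first and the array data starts after it, so firmware has to understand the format to find the ESP.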

Revision history for this message
Sujith Pandel (sujithpandel) wrote :

>So the UEFI system is able to read mdraid 1.2 too, and can access ESPs inside of it?

Yes, install and boot with mdraid 1.2 currently works for other OSes such as CentOS.

The Ubuntu subiquity installer seems to be blocking it for the ESP/boot device.
The legacy Ubuntu debian-installer allowed an md-raid disk to be selected as the installation destination (including the ESP); however, we faced a grub install error during installation, as in https://bugs.launchpad.net/ubuntu/+source/grub2/+bug/1776685 and https://bugs.launchpad.net/ubuntu/+source/grub-installer/+bug/1466150

Revision history for this message
Sujith Pandel (sujithpandel) wrote :

I believe the code changes have to go in these files:
https://github.com/canonical/subiquity/blob/main/subiquity/common/filesystem/boot.py
and
https://github.com/canonical/subiquity/blob/main/subiquity/common/filesystem/actions.py

Currently it allows only imsm containers, and hence our md-raid 1.2 disks aren't considered as boot drive candidates.

What kind of subiquity quirk do you propose to enable Linux md-raid 1.2 disks?
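
One possible shape for such a quirk, sketched under stated assumptions: this is not subiquity's actual code, and the `Raid` class, its fields, and `can_hold_esp` are hypothetical names invented for illustration. The idea is to keep the existing imsm rule and additionally accept metadata 1.2 arrays only when the platform is known to boot from them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Raid:
    """Hypothetical, simplified stand-in for an installer's RAID model."""
    name: str
    metadata: str                              # e.g. "1.2"
    container_metadata: Optional[str] = None   # e.g. "imsm" when inside a container


def can_hold_esp(raid: Raid, firmware_boots_mdraid: bool) -> bool:
    """Decide whether an ESP may be placed on this RAID device.

    firmware_boots_mdraid is an assumed, externally supplied flag (for
    example derived from a platform check); it is not detected here.
    """
    # Existing rule described in this thread: RAIDs in imsm containers are allowed.
    if raid.container_metadata == "imsm":
        return True
    # Proposed quirk: plain md-raid 1.2 only on firmware known to read it.
    return firmware_boots_mdraid and raid.metadata == "1.2"
```

Keeping the platform knowledge in a single boolean means the default behaviour on other hardware stays unchanged.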

Revision history for this message
Sujith Pandel (sujithpandel) wrote :

Also, we suspect that even after subiquity allows mdraid-disk installations, we might hit https://bugs.launchpad.net/ubuntu/+source/grub2/+bug/1776685 here as well. So we need to work on that too to get a complete working solution.

Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

You are correct that subiquity does not allow you to put the ESP on a metadata 1.2 MD RAID device. That's because, to the best of my knowledge, almost all firmware does not know about MD RAID formats and so would be unable to boot the system.

The question thus becomes, how can the installer know that putting the ESP on a metadata 1.2 MD RAID is going to work? I presume we can identify your firmware by looking at sysfs somehow or other?

I still wouldn't be amazingly happy about doing this because it is a general goal of Ubuntu's installers that you can transfer a disk between systems and still have it boot, but that's probably not so important for the server use case.

Revision history for this message
prabhakar pujeri (prabhakarpujeri) wrote :

Hi Michael,

I am checking internally whether we have any way to tell if 1.2 metadata is bootable or not.

However, all Dell PowerEdge systems support booting from 1.2 metadata, so we can filter on Dell for this.

Revision history for this message
prabhakar pujeri (prabhakarpujeri) wrote :

Hi Michael,

We had an internal discussion with our SW RAID team, and the metadata has no option to distinguish a boot disk from a data disk. The only way suggested is to filter on Dell servers, since all Dell servers support mdraid boot.

Changed in subiquity (Ubuntu):
milestone: none → ubuntu-22.04.2
Revision history for this message
Christian Ekstam (cekstam) wrote :

All servers utilizing Intel VROC (https://www.intel.com/content/www/us/en/software/virtual-raid-on-cpu-vroc.html) are unable to install Ubuntu due to this limitation.

MD RAID metadata and all these assembled RAID arrays usually contain a GPT partition table that the UEFI systems can read (possibly by using the copy on the end of the disk?) so it works fine to boot the systems once installed.

The current problem is that it's not possible to install with this limitation in subiquity (and possibly grub). This seems to have worked fine in for example 18.04 (per https://www.intel.com/content/dam/support/us/en/documents/memory-and-storage/ssd-software/VROC-Ubuntu-Setup-UserGuide-342787-US.pdf), and I would as such call it a regression.

Please put back space bar heating (https://xkcd.com/1172/).

Best,
Christian

Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

There is specific support for VROC in the installer, this is about installing on boring old md raid on systems that support booting from it (which is definitely not all systems). If you are unable to install on a VROC system please file a new bug with log files.

Revision history for this message
prabhakar pujeri (prabhakarpujeri) wrote :

This bug is not about VROC. Dell PowerEdge servers support booting from md raid, so we are requesting that mdraid block devices be unblocked in the installer.

But I also understand that you don't want to unblock this generically for all mdraid, since most systems do not support booting from it.

Is it possible to add a check that allows the installation if the machine is a Dell PowerEdge server?

Dan Bungert (dbungert)
Changed in subiquity (Ubuntu):
milestone: ubuntu-22.04.2 → none
Revision history for this message
Michael Reed (mreed8855) wrote :

A way to determine if a PowerEdge server is present is the following command:

dmidecode -t 1

$ sudo dmidecode -t 1
# dmidecode 3.3
Getting SMBIOS data from sysfs.
SMBIOS 3.3.0 present.

Handle 0x0100, DMI type 1, 27 bytes
System Information
 Manufacturer: Dell Inc.
 Product Name: PowerEdge R6525
 Version: Not Specified
 Serial Number: 243KTH3
 UUID: 4c4c4544-0034-3310-804b-b2c04f544833
 Wake-up Type: Power Switch
 SKU Number: SKU=NotProvided;ModelName=PowerEdge R6525
 Family: PowerEdge
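
The same fields are also exposed without root through sysfs, under /sys/class/dmi/id. A minimal sketch of such a check in Python follows; the helper names `read_dmi` and `looks_like_poweredge` are hypothetical, and matching on the "Dell" and "PowerEdge" prefixes is an assumption based on the output above, not a rule confirmed by Dell.

```python
from pathlib import Path

DMI_DIR = Path("/sys/class/dmi/id")


def read_dmi(field: str, dmi_dir: Path = DMI_DIR) -> str:
    """Read one DMI field from sysfs, returning "" if it is unavailable."""
    try:
        return (dmi_dir / field).read_text().strip()
    except OSError:
        return ""


def looks_like_poweredge(vendor: str, product: str) -> bool:
    """Assumed heuristic: vendor starts with "Dell", product with "PowerEdge"."""
    return vendor.strip().startswith("Dell") and product.strip().startswith("PowerEdge")


if __name__ == "__main__":
    vendor = read_dmi("sys_vendor")
    product = read_dmi("product_name")
    print(f"{vendor!r} {product!r} -> {looks_like_poweredge(vendor, product)}")
```

Separating the file read from the string comparison keeps the heuristic testable on machines that are not PowerEdge servers.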

Revision history for this message
Dan Bungert (dbungert) wrote :

Marking Focal target as invalid, we aren't expecting new release media.
Someone on Focal could still take the snap refresh and pick up the fix, when available, but there is nothing series specific about that.

Checking system-product-name is fine.

But is this really correct for every PowerEdge ever? https://en.wikipedia.org/wiki/PowerEdge says these things go back to 1996, presumably not all of them are valid? I'd like to avoid the sort of "subiquity doesn't boot on my really old hardware" type bug.

Changed in subiquity (Ubuntu Focal):
status: New → Invalid
Dan Bungert (dbungert)
Changed in subiquity (Ubuntu Jammy):
status: New → In Progress
assignee: nobody → Dan Bungert (dbungert)
Revision history for this message
Dan Bungert (dbungert) wrote :

I have uploaded a test build of Subiquity with this feature enabled.
I lack appropriate hardware to verify for myself, so please help test.

To do so, please boot
https://releases.ubuntu.com/jammy/ubuntu-22.04.2-live-server-amd64.iso
with kernel command line
subiquity-channel=beta/lp-1961079

At the "Installer update available" screen, allow the update to
proceed. Continue the install as normal. If done on a computer with a
matching system-product-name, the raid device should be selectable as
a permissible boot device.
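
To confirm from a shell in the live environment that the parameter actually reached the kernel, /proc/cmdline can be inspected. The snippet below is a small sketch, not how subiquity itself reads the option; the function name `subiquity_channel` is hypothetical.

```python
import shlex
from typing import Optional


def subiquity_channel(cmdline: str) -> Optional[str]:
    """Return the value of the first subiquity-channel=... token, if any."""
    for token in shlex.split(cmdline):
        key, sep, value = token.partition("=")
        if key == "subiquity-channel" and sep:
            return value
    return None


if __name__ == "__main__":
    # /proc/cmdline holds the command line the running kernel was booted with.
    with open("/proc/cmdline") as f:
        print(subiquity_channel(f.read()))
```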

Changed in subiquity (Ubuntu):
status: New → In Progress
assignee: nobody → Dan Bungert (dbungert)
Revision history for this message
Mr John Paul Cooper (johnpaul-cooper) wrote (last edit ):

Perhaps have it check that the Dell PowerEdge is of a new enough generation, that is, at or above the minimum required, and that it has a product which does RAID in this manner. If the machine is older than the minimum, where RAID is not done this way, then continue as it did before.

Revision history for this message
Vinay HM (vinay-hm) wrote (last edit ):

Hi dbungert,

To do so, please boot
https://releases.ubuntu.com/jammy/ubuntu-22.04.2-live-server-amd64.iso
with kernel command line
subiquity-channel=beta/lp-1961079

At the "Installer update available" screen, allow the update to
proceed. Continue the install as normal

--> I tried installing 22.04.2 with the "subiquity-channel=beta/lp-1961079" kernel parameter. But contrary to what you described above, I did not get any prompt to update the installer. It proceeds like a normal installation, and the disks are detected as normal nvme drives, not as md-raid disks.

Revision history for this message
Dan Bungert (dbungert) wrote :

Snap branches last 30 days. Unfortunately it's been longer than that, which means that the test run yesterday didn't have access to the beta build, so that's an invalid test.

Here's a new one. I have rebuilt the snap against current main branch. While I can't test the main functionality we're trying to merge I can at least confirm that the beta/lp-1961079 is functional. Please test with 22.04.3, before October-6 so we don't hit the same problem. Thank you.

Revision history for this message
Kumar Harshith Gowda K S (harsh-3125) wrote :

Hi dbungert,

We tried installing 22.04.3 with the "subiquity-channel=beta/lp-1961079" kernel parameter. beta/lp-1961079 is functional and the installer can be updated.

After the installer loaded, we were able to see the software RAID (md-raid) disk, but the drives that are part of the md-raid were also listed as individual drives, which shouldn't happen.

However, if we select the md-raid disk and proceed to OS installation, the installer reports an error with install_fail.crash before installation begins. Crash logs are attached.

Revision history for this message
Dan Bungert (dbungert) wrote :

We'll need the contents of /var/log/installer from the failed run to make further progress. Tarball please.

Changed in subiquity (Ubuntu Jammy):
status: In Progress → Incomplete
Revision history for this message
Kumar Harshith Gowda K S (harsh-3125) wrote (last edit ):

Attaching /var/log/installer logs.

Dan Bungert (dbungert)
Changed in subiquity (Ubuntu Jammy):
status: Incomplete → Triaged
Revision history for this message
Kumar Harshith Gowda K S (harsh-3125) wrote :

Hi dbungert,

Do we have any update?
Any findings from the log analysis?

Revision history for this message
Dan Bungert (dbungert) wrote (last edit ):

Hi Kumar,

Thanks for the previous set of logs; they revealed a regression which we believe is now fixed. I've been able to test other raid cases but not the one this bug is about, so a test build is again available with this feature rebased on top of the regression fix. Same plan as before, subiquity-channel=beta/lp-1961079.

Revision history for this message
Kumar Harshith Gowda K S (harsh-3125) wrote :

Hi dbungert,

For our Software RAID case, it's still failing.
Attaching installer logs.
