[Focal] Installation Fails with "Invalid dep_id" when using Intel VROC

Bug #1885191 reported by Pedro Principeza
This bug affects 10 people
Affects             Status         Importance   Assigned to   Milestone
subiquity (Ubuntu)  Fix Released   Undecided    Unassigned
Focal               Fix Released   Undecided    Unassigned
Impish              Fix Released   Undecided    Unassigned

Bug Description

[Description]
When installing Ubuntu 20.04 on a server that uses Intel VROC (Virtual RAID on CPU) [0], selecting the md-XXX device to partition causes the installer to fail and return the user to device selection.

The subiquity-debug logs under /var/log/installer show the following errors:

2020-05-29 17:31:15,172 INFO subiquity.core:477 saving crash report 'block probing crashed with ValueError' to /var/crash/1590773475.167309046.disk_probe_fail.crash
2020-05-29 17:54:59,844 INFO curtin:1319 Validating extracted storage config components
2020-05-29 17:54:59,871 ERROR root:39 finish: subiquity/Filesystem/_probe/probe_once: FAIL: Invalid dep_id (disk-md126) not in storage config
2020-05-29 17:54:59,871 ERROR block-discover:161 block probing failed restricted=False

The server in question is a SuperMicro SSG-2029P-ACR24H, but any server in which an Intel VROC device is used should reproduce this.

Also, updating to the latest Subiquity version before moving on with the installation process results in the same error.

Finally, using Bionic's live media (also Subiquity-based) results in the same block-discover failure.

[Impact]
Users with Intel VROC arrays are unable to install onto them with any Subiquity-based Ubuntu installer.

[Reproducer]
- On a server with Intel VROC support, configure a VROC array following these instructions (steps may vary from server to server) [1];
- Start a Subiquity-based Ubuntu installation (Focal or Bionic live);
- Proceed to the disk selection step, select the VROC md array, and observe the block-discover error on screen and in the Subiquity logs.

[Workaround]
Users attempting to install Focal must use Bionic's non-live (d-i based) installer to install 18.04 and then run do-release-upgrade to migrate to Focal. The d-i installer has the mdadm patches needed to support Intel VROC, but Subiquity and/or Curtin do not seem to know how to handle it.

I'm attaching a full bundle of /var/log/installer from a Focal installation failure.

[0] https://www.intel.com/content/www/us/en/support/articles/000024498/memory-and-storage/ssd-software.html
[1] https://www.intel.com/content/dam/support/us/en/documents/memory-and-storage/ssd-software/VROC-Ubuntu-Setup-UserGuide-342787-US.pdf

Revision history for this message
Rafael Leira Osuna (ralequi) wrote :

I can confirm the bug on our servers, and also that the workaround works.

I am also attaching the logs.

Maybe this is the most interesting part:

Traceback (most recent call last):
  File "/snap/subiquity/1966/lib/python3.6/site-packages/subiquity/controllers/filesystem.py", line 154, in _probe
    await asyncio.wait_for(self._probe_once_task.task, 15.0)
  File "/snap/subiquity/1966/usr/lib/python3.6/asyncio/tasks.py", line 358, in wait_for
    return fut.result()
  File "/snap/subiquity/1966/lib/python3.6/site-packages/subiquitycore/context.py", line 142, in decorated_async
    return await meth(self, **kw)
  File "/snap/subiquity/1966/lib/python3.6/site-packages/subiquity/controllers/filesystem.py", line 135, in _probe_once
    self.model.load_probe_data(storage)
  File "/snap/subiquity/1966/lib/python3.6/site-packages/subiquity/models/filesystem.py", line 1633, in load_probe_data
    self.reset()
  File "/snap/subiquity/1966/lib/python3.6/site-packages/subiquity/models/filesystem.py", line 1351, in reset
    self._probe_data)["storage"]["config"]
  File "/snap/subiquity/1966/lib/python3.6/site-packages/curtin/storage_config.py", line 1348, in extract_storage_config
    tree = get_config_tree(cfg.get('id'), final_config)
  File "/snap/subiquity/1966/lib/python3.6/site-packages/curtin/storage_config.py", line 306, in get_config_tree
    for dep in find_item_dependencies(item, sconfig):
  File "/snap/subiquity/1966/lib/python3.6/site-packages/curtin/storage_config.py", line 276, in find_item_dependencies
    _validate_dep_type(item_id, dep_key, dep, config)
  File "/snap/subiquity/1966/lib/python3.6/site-packages/curtin/storage_config.py", line 224, in _validate_dep_type
    'Invalid dep_id (%s) not in storage config' % dep_id)
ValueError: Invalid dep_id (disk-md127) not in storage config
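
To illustrate the kind of check that produces this error, here is a minimal sketch. It is not curtin's actual code: the function names, the DEP_KEYS tuple, and the example config are all hypothetical, and only the error message format is taken from the traceback above. The idea is that every item in a storage config may reference other items by id, and validation fails when a referenced id was never emitted.

```python
# Hypothetical, simplified model of a storage-config dependency check.
# This is NOT curtin's implementation; names and keys are assumptions.
DEP_KEYS = ("device", "volume")  # assumed reference keys for this sketch


def find_missing_deps(config):
    """Return (item_id, dep_id) pairs whose dep_id is not defined in config."""
    known_ids = {item["id"] for item in config}
    missing = []
    for item in config:
        for key in DEP_KEYS:
            dep_id = item.get(key)
            if dep_id is not None and dep_id not in known_ids:
                missing.append((item["id"], dep_id))
    return missing


def validate(config):
    """Raise ValueError on the first dangling reference found."""
    for _item_id, dep_id in find_missing_deps(config):
        raise ValueError("Invalid dep_id (%s) not in storage config" % dep_id)


# Made-up example: a partition that points at an md device that has no
# corresponding entry in the config.
example_config = [
    {"id": "disk-sda", "type": "disk"},
    {"id": "partition-md127p1", "type": "partition", "device": "disk-md127"},
]
```

Calling validate(example_config) raises a ValueError that names disk-md127, which matches the shape of the failure in the logs. Whether the real root cause is a missing entry or a differently named one is not something this sketch can tell.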

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in subiquity (Ubuntu):
status: New → Confirmed
tags: added: vroc
Revision history for this message
Anton Lindström (antonlindstrom) wrote :

Another workaround is to use the legacy server install image for 20.04:

http://cdimage.ubuntu.com/ubuntu-legacy-server/releases/20.04.1/release/

Revision history for this message
Pawel Baldysiak (pbaldysi) wrote :

Hi,
What is the user-visible result of this issue?
On my VROC platform I observe that the VROC RAID does not appear as an install target at all (using the live installer). Mdadm assembles the RAID, but it does not appear in the list of available devices.
With d-i it works fine.
Is the result the same in this bug, or is there a user-visible error here?
I'm asking because I would like to know whether I should file a separate LP bug or whether this is the same issue.

Thanks
Pawel

Revision history for this message
Pawel Baldysiak (pbaldysi) wrote :

Are there any updates here?

Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

Unfortunately not.

Jeff Lane  (bladernr)
tags: added: hwcert-server
Revision history for this message
Jeff Lane  (bladernr) wrote :

Quanxian, Pavel, can these servers with VROC enabled/configured be installed using MAAS (https://maas.io)?

For servers, while we do have an ISO installer in Subiquity, MAAS is the preferred method of managing and deploying DC hardware any time we do a deployment for customers. MAAS deployments do not use Subiquity, but rather use cloud-init to configure the machine once partitioning has been done and an OS image has been laid down on the storage devices.

So could you check this with MAAS and see whether the same or different issues occur? If so, we'll need a separate MAAS bug as well.

Revision history for this message
Jeff Lane  (bladernr) wrote :

Can we please target this for 21.10? This is being requested by several hardware partners.

Revision history for this message
Lee Azzarello (lee-rockingtiger) wrote :

I can confirm this same bug on a Supermicro 6019U-TN4RT with SATA disks on the VROC controller. Downgrading to the D-I based installer allowed me to use the MD device. Notably, launching a shell from the Subiquity installer and viewing the output of /proc/mdstat indicated the device is available and can be formatted with a filesystem. Not sure why this installer ignores it though.

I will also be installing an NVMe hardware key, which this controller requires to make a bootable RAID device on the PCIe lanes. I'll report back if this bug also affects this device type.
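
For anyone repeating the /proc/mdstat check from an installer shell, here is a small sketch that lists md devices and their state from mdstat-style text. The function name is hypothetical, and the sample text is illustrative only (written in the usual /proc/mdstat layout for an IMSM container plus a RAID1 volume); it is not output from any machine in this report.

```python
import re

# Matches lines such as "md126 : active raid1 sda[1] sdb[0]".
MD_LINE = re.compile(r"^(md\w+)\s*:\s*(\w+)")


def list_md_devices(mdstat_text):
    """Map each md device name to the first state word on its line."""
    devices = {}
    for line in mdstat_text.splitlines():
        match = MD_LINE.match(line)
        if match:
            devices[match.group(1)] = match.group(2)
    return devices


# Illustrative sample only, not taken from a real system in this bug.
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md126 : active raid1 sda[1] sdb[0]
      976759808 blocks super external:/md127/0 [2/2] [UU]

md127 : inactive sdb[1](S) sda[0](S)
      5552 blocks super external:imsm

unused devices: <none>
"""

print(list_md_devices(SAMPLE_MDSTAT))
```

On a live system the same function can be given the contents of /proc/mdstat, for example via open("/proc/mdstat").read(). An array that shows up here but not in the installer's device list points at the installer's probing rather than at mdadm.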

Revision history for this message
Jeff Lane  (bladernr) wrote :

This should be resolved by the work in progress for Subiquity.

Revision history for this message
Jeff Lane  (bladernr) wrote :

Can you please try the latest Impish images and see if the work to update the installer resolves this? (I'm not sure that work is complete and merged, to be honest, so please keep that in mind)

Changed in subiquity (Ubuntu Impish):
status: Confirmed → Incomplete
Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

Everything we have is in both the 20.04.3 release and the Impish dailies.

Changed in subiquity (Ubuntu Impish):
status: Incomplete → Fix Released
Changed in subiquity (Ubuntu Focal):
status: New → Fix Released