
hi1620-based ARM Servers are shown as "Unknown model"

Bug #1897946 reported by dann frazier
This bug affects 1 person
Affects      Status         Importance   Assigned to    Milestone
MAAS         Fix Released   Medium       Unassigned
2.9          Fix Released   Medium       Unassigned
kunpeng920   Fix Released   Undecided    Taihsiang Ho
lxd          New            Undecided    Unassigned

Bug Description

In the MAAS UI, ARM servers based on the Hi1620 ARM SoC appear as "Unknown model". This seems odd for an Ubuntu-certified platform that exposes a proper DMI table. I suspect MAAS may be scraping /proc/cpuinfo for this information, which isn't a standardized interface. That might make sense as a fallback for arm64 systems that do not expose DMI info, but almost all ARM servers do.

The following is from MAAS' commissioning info for one of these servers. Personally, I'd recommend using system-manufacturer and system-product-name - i.e. "Huawei XA320 V2":

-----BEGIN DMI KEYPAIRS-----
bios-vendor=Huawei Corp.
bios-version=0.95
bios-release-date=08/15/2019
system-manufacturer=Huawei
system-product-name=XA320 V2
system-version=To be filled by O.E.M.
system-serial-number=To be filled by O.E.M.
system-uuid=E1C5D866-0018-A26E-B211-D21DAC4F182F
baseboard-manufacturer=Huawei
baseboard-product-name=BC82HPNB
baseboard-version=V200R002C00
baseboard-serial-number=025SVU10K5000010
baseboard-asset-tag=To be filled by O.E.M.
chassis-manufacturer=Huawei
chassis-type=Main Server Chassis
chassis-version=To be filled by O.E.M.
chassis-serial-number=To be filled by O.E.M.
chassis-asset-tag=To be filled by O.E.M.
processor-family=ARM
ARM
processor-manufacturer=Hisilicon
Hisilicon
processor-version=Kunpeng 920-4826
Kunpeng 920-4826
processor-frequency=2600 MHz
2600 MHz
-----END DMI KEYPAIRS-----
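
For illustration only, here is a minimal Python sketch of how a label such as "Huawei XA320 V2" could be derived from a keypairs dump like the one above. The function names are hypothetical and this is not MAAS code. It assumes that a line without "=" is an additional value for the previous key (which appears to be how per-socket fields show up above).

```python
def parse_dmi_keypairs(text):
    """Parse a DMI KEYPAIRS dump into {key: [values]}.

    Assumption: a line without '=' is one more value for the previous key,
    as seen for the processor-* fields on multi-socket machines.
    """
    pairs = {}
    last_key = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("-----"):
            continue  # skip blank lines and BEGIN/END markers
        if "=" in line:
            key, value = line.split("=", 1)
            pairs[key] = [value]
            last_key = key
        elif last_key is not None:
            pairs[last_key].append(line)
    return pairs


def describe(pairs):
    """Return (system label, CPU label) built from the parsed keypairs."""
    def first(key):
        return pairs.get(key, [""])[0]

    system = f"{first('system-manufacturer')} {first('system-product-name')}".strip()
    return system, first("processor-version")


# Excerpt of the dump above, trimmed to the fields used here.
SAMPLE = """\
-----BEGIN DMI KEYPAIRS-----
system-manufacturer=Huawei
system-product-name=XA320 V2
processor-version=Kunpeng 920-4826
Kunpeng 920-4826
-----END DMI KEYPAIRS-----
"""

if __name__ == "__main__":
    print(describe(parse_dmi_keypairs(SAMPLE)))
```

A value that itself contains "=" on a continuation line would be misread as a new key, so this is only suitable for dumps shaped like the sample.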

dann frazier (dannf) wrote :

Actually, I see now that this is supposed to be a descriptor of the CPU, so processor-version might be a better field.

dann frazier (dannf) wrote :

Also, for a Cavium ThunderX2 Saber board, processor-version would be:

processor-version=Cavium ThunderX2(R) CPU CN9975 v2.1 @ 2.20GHz Cavium ThunderX2(R) CPU CN9975 v2.1 @ 2.20GHz

Currently MAAS shows this as a "ThunderX2 CN99XX".

Lee Trager (ltrager) wrote :

MAAS actually gets its hardware information from LXD. During commissioning MAAS uses a binary which is just the hardware information gathering portions of LXD. You can get identical information by running

curl -G --unix-socket "/var/snap/lxd/common/lxd/unix.socket" "lxd/1.0/resources" 2>/dev/null | jq

Please file a bug with LXD on github and link it to this bug. Once that is solved I can update the LXD sources we pull.

Changed in maas:
status: New → Triaged
importance: Undecided → Medium
milestone: none → 2.9.0b4
dann frazier (dannf) wrote :

Thanks @Lee. I'm confused though because - even though MAAS reports a Cavium Saber as a "ThunderX2 CN99XX" - I don't see the string "Thunder" at all in the lxd output on that platform:

$ curl -G --unix-socket "/var/snap/lxd/common/lxd/unix.socket" "lxd/1.0/resources" | grep -i thunder
  % Total % Received % Xferd Average Speed Time Time Time Current
                                 Dload Upload Total Spent Left Speed
100 32741 0 32741 0 0 115k 0 --:--:-- --:--:-- --:--:-- 115k

Lee Trager (ltrager) wrote :

Is this a long-running MAAS? About 2 years ago the decision was made that all Canonical products should gather hardware information from LXD. We implemented this for MAAS 2.7. The code[1] we use to parse LXD information may leave old values in place if LXD doesn't come up with anything.

We also collect FRUID data for Facebook Wedge switches[2]. I've never seen that detect anything on hardware besides Facebook Wedge but this could be a first. Does the commissioning script "maas-get-fruid-api-data" contain anything?

[1] https://git.launchpad.net/maas/tree/src/metadataserver/builtin_scripts/hooks.py#n412
[2] https://git.launchpad.net/maas/tree/src/metadataserver/builtin_scripts/hooks.py#n840
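
To make the "old values are left in place" behaviour concrete, here is a hedged Python sketch. It is not the code from hooks.py; the function name is hypothetical, and the `cpu.sockets[].name` layout is an assumption about the LXD resources data.

```python
def merge_cpu_model(stored_model, resources):
    """Return the CPU model to keep after a commissioning run.

    Sketch of the described behaviour: a value reported by LXD replaces
    the stored one, but an empty or missing value leaves the old one
    untouched.
    """
    sockets = (resources.get("cpu") or {}).get("sockets") or []
    names = [s.get("name") for s in sockets if s.get("name")]
    if names:
        return names[0]
    return stored_model


if __name__ == "__main__":
    # LXD reports nothing: the model stored before the upgrade survives.
    print(merge_cpu_model("ThunderX2 CN99XX", {"cpu": {"sockets": [{"name": None}]}}))
    # LXD reports a name: it replaces the stored model.
    print(merge_cpu_model("ThunderX2 CN99XX", {"cpu": {"sockets": [{"name": "Kunpeng 920-4826"}]}}))
```

Under these assumptions, a machine enlisted before 2.7 could keep showing a model string that LXD itself never reports.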

dann frazier (dannf) wrote : Re: [Bug 1897946] Re: hi1620-based ARM Servers are shown as "Unknown model"

On Wed, Sep 30, 2020 at 5:25 PM Lee Trager <email address hidden> wrote:
>
> Is this a long running MAAS? About 2 years ago the decision was made all
> Canonical products should gather hardware information from LXD. We
> implemented this for MAAS 2.7. The code[1] we use to parse LXD
> information may leave in old values if LXD doesn't come up with
> anything.

Yes, definitely installed pre-2.7. What fields from LXD does MAAS use?
How would I see what MAAS would select if the install was new -
re-commission? Or would I need to delete/re-enlist?

> We also collect FRUID data for Facebook Wedge switches[2]. I've never
> seen that detect anything on hardware besides Facebook Wedge but this
> could be a first. Does the commissioning script "maas-get-fruid-api-
> data" contain anything?

It does not.

  -dann

Lee Trager (ltrager) wrote :

This[1] is the code MAAS uses. To see the difference you would need to delete the machine and re-add it.

Most system information comes from

curl -G --unix-socket "/var/snap/lxd/common/lxd/unix.socket" "lxd/1.0/resources" 2>/dev/null | jq .metadata.system

CPU information comes from

curl -G --unix-socket "/var/snap/lxd/common/lxd/unix.socket" "lxd/1.0/resources" 2>/dev/null | jq .metadata.cpu

If the information coming from LXD isn't correct, it will have to be fixed in LXD.

[1] https://git.launchpad.net/maas/tree/src/metadataserver/builtin_scripts/hooks.py#n412
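
If the curl output has been saved to a file, the two jq filters above can be reproduced offline. The following Python sketch is an illustration only: the function name is hypothetical and the sample response is made up and heavily trimmed, using field names that appear elsewhere in this thread.

```python
import json


def system_and_cpu(raw_response):
    """Split a saved "lxd/1.0/resources" response into system and CPU parts.

    Mirrors the `jq .metadata.system` and `jq .metadata.cpu` filters.
    Either part is None if the response does not contain it.
    """
    metadata = json.loads(raw_response).get("metadata") or {}
    return metadata.get("system"), metadata.get("cpu")


# Made-up, trimmed response for illustration; real output has many more fields.
SAMPLE_RESPONSE = """
{"metadata": {"system": {"vendor": "Huawei", "product": "XA320 V2"},
              "cpu": {"sockets": [{"name": null, "vendor": null}]}}}
"""

if __name__ == "__main__":
    system, cpu = system_and_cpu(SAMPLE_RESPONSE)
    print("system:", system)
    print("cpu:", cpu)
```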

Ike Panhc (ikepanhc) wrote :

Just some more information: on an arm64 machine deployed by MAAS, lxd is not installed by default. I checked the bionic/focal daily MAAS arm64 images and /var/snap is empty. No snap installation is observed during commissioning.

Ike Panhc (ikepanhc) wrote :

/var/snap is also empty in the amd64 bionic squashfs image. Maybe I'm looking in the wrong direction.

Lee Trager (ltrager) wrote :

MAAS doesn't use the LXD Snap. MAAS has a small Go binary[1] which imports the resource libraries from LXD and outputs to stdout. During commissioning this binary is downloaded and run[2].

The data must come from LXD; if LXD doesn't provide it, there is no way for MAAS to get it.

[1] https://git.launchpad.net/maas/tree/src/machine-resources
[2] https://git.launchpad.net/maas/tree/src/metadataserver/builtin_scripts/commissioning_scripts/50-maas-01-commissioning

Lee Trager (ltrager)
Changed in maas:
milestone: 2.9.0b4 → 2.9.0b7
dann frazier (dannf) wrote :

fyi, I've collected lxd resource JSON from a number of ARM servers, and plan to see if there's an existing appropriate field I can recommend, or if we should flag a need for a new one.

Lee Trager (ltrager)
Changed in maas:
milestone: 2.9.0b7 → 2.9.x
dann frazier (dannf) wrote :
Changed in kunpeng920:
assignee: nobody → Taihsiang Ho (taihsiangho)
Taihsiang Ho (taihsiangho) wrote :

I checked the JSON files provided by Dann in comment #12 (https://bugs.launchpad.net/kunpeng920/+bug/1897946/comments/12). They show:

    - Platform vendor and product names are all good. All platform vendor and product names show correctly (all fields have values) and are consistent with the information in the JSON files, as fetched by "curl -G --unix-socket "/var/snap/lxd/common/lxd/unix.socket" "lxd/1.0/resources" 2>/dev/null | jq .metadata.system.vendor" and "curl -G --unix-socket "/var/snap/lxd/common/lxd/unix.socket" "lxd/1.0/resources" 2>/dev/null | jq .metadata.system.product".
    - For the CPUs we have no luck on any platform. The CPU name and CPU vendor name are missing on every one of them.

I checked the CPU information with the following commands:

    - to show cpu name: jq .metadata.cpu.sockets[0].name file.json
    - to show cpu vendor name: jq .metadata.cpu.sockets[0].vendor file.json

All of the platforms return "null" for both the CPU name and the CPU vendor name. I expected them to have values[1].

I guess our next step is to report the issue against LXD and keep MAAS "Triaged" as is. Thanks to Lee for the informative help.

[1] For instance, a similar example tried on my laptop will show:

$ curl -G --unix-socket "/var/snap/lxd/common/lxd/unix.socket" "lxd/1.0/resources" 2>/dev/null | jq .metadata.cpu

  <...skipped...>
  "sockets": [
    {
      "name": "Intel(R) Core(TM) i5-8265U CPU @ 1.60GHz",
      "vendor": "GenuineIntel",
  <...skipped...>
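
The per-file check described above can also be scripted. This is a minimal Python sketch under stated assumptions: the function name is hypothetical, the ARM sample is made up to mirror the "null" results reported here, and the laptop sample is trimmed to the two fields shown in the excerpt.

```python
import json


def missing_cpu_fields(resources):
    """Return (socket index, field) pairs whose value is null, empty or absent.

    Checks the same paths as the jq filters:
    .metadata.cpu.sockets[N].name and .metadata.cpu.sockets[N].vendor
    """
    metadata = resources.get("metadata") or {}
    sockets = (metadata.get("cpu") or {}).get("sockets") or []
    missing = []
    for index, socket in enumerate(sockets):
        for field in ("name", "vendor"):
            if not socket.get(field):
                missing.append((index, field))
    return missing


# Made-up sample mirroring the "null" results seen on the ARM servers.
ARM_SAMPLE = json.loads(
    '{"metadata": {"cpu": {"sockets": [{"name": null, "vendor": null}]}}}'
)
# Trimmed version of the laptop output shown above.
LAPTOP_SAMPLE = json.loads(
    '{"metadata": {"cpu": {"sockets": [{"name": '
    '"Intel(R) Core(TM) i5-8265U CPU @ 1.60GHz", "vendor": "GenuineIntel"}]}}}'
)

if __name__ == "__main__":
    print("arm:", missing_cpu_fields(ARM_SAMPLE))
    print("laptop:", missing_cpu_fields(LAPTOP_SAMPLE))
```

To run it against real data, load each saved file with `json.load` and pass the result to `missing_cpu_fields`.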

tags: added: tairadar
Taihsiang Ho (taihsiangho) wrote :

LXD issue ticket on github https://github.com/lxc/lxd/issues/8339

Taihsiang Ho (taihsiangho) wrote :

The fix from LXD is expected to land in lxd-4.11. See https://github.com/lxc/lxd/issues/8339 .

tags: removed: tairadar
Changed in maas:
status: Triaged → Invalid
Changed in kunpeng920:
status: New → Fix Committed
Taihsiang Ho (taihsiangho) wrote :

Please note lxd-4.11 is released today https://discuss.linuxcontainers.org/t/lxd-4-11-has-been-released/10135

Let's see if the latest lxd release will help.

Andrew Cloke (andrew-cloke) wrote :

We now need to wait for a MAAS release to pick up the latest LXD 4.11.

Changed in maas:
status: Invalid → Fix Committed
milestone: 2.9.2 → 2.4.3
milestone: 2.4.3 → 2.10-beta1
Changed in maas:
status: Fix Committed → Fix Released
Taihsiang Ho (taihsiangho) wrote :

MAAS 3.0-beta1 was released a few hours ago.

Here is our target MAAS snap status for us to keep an eye on the update timing for our lab:

```
tracking: 2.9/stable
refresh-date: 43 days ago, at 22:14 UTC
channels:
  2.9/stable: 2.9.2-9164-g.ac176b5c4 2021-02-17 (11851) 150MB -
  2.9/candidate: 2.9.2-9165-g.c3e7848d1 2021-03-22 (12555) 149MB -
  2.9/beta: ↑
  2.9/edge: 2.9.3~alpha1-9185-g.626d8924c 2021-03-30 (12691) 158MB -
  latest/stable: –
  latest/candidate: –
  latest/beta: –
  latest/edge: 3.0.0~beta1-9743-g.7dfe1138c 2021-03-31 (12770) 158MB -
  3.0/stable: –
  3.0/candidate: –
  3.0/beta: 3.0.0~beta1-9736-g.d152c0bd4 2021-03-31 (12731) 158MB -
  3.0/edge: 3.0.0~beta1-9745-g.2de066154 2021-03-31 (12779) 158MB -
  2.8/stable: 2.8.4-8597-g.05313b458 2021-03-03 (12118) 138MB -
  2.8/candidate: 2.8.5-8600-g.efb54078a 2021-03-29 (12665) 135MB -
  2.8/beta: ↑
  2.8/edge: 2.8.5-8600-g.efb54078a 2021-03-29 (12665) 135MB -
  2.7/stable: 2.7.3-8290-g.ebe2b9884 2020-08-21 (8724) 144MB -
  2.7/candidate: ↑
  2.7/beta: ↑
  2.7/edge: 2.7.3-8297-g.68a767295 2021-02-16 (11806) 143MB -
installed: 2.9.2-9164-g.ac176b5c4 (11851) 150MB -
```

Andrew Cloke (andrew-cloke) wrote :

Is there a plan to revise MAAS 2.9 to pick up lxd-4.11? If so, is there a target date?

Andrew Cloke (andrew-cloke) wrote :

From brief discussions with the MAAS team, the focus is currently on releasing MAAS 3.0, and then they will look at maintenance updates to older releases such as 2.9.

Taihsiang Ho (taihsiangho) wrote :

The current maas 2.9.2 (9165-g.c3e7848d1) we are using has not picked up the fix yet.

Andrew Cloke (andrew-cloke) wrote :

Leaving open while waiting for backport to 2.9.2.

dann frazier (dannf) wrote :

fyi, I found that this is now fixed in the 2.9.3-beta1 snap. After recommissioning a Hi1620-based x6000, MAAS now reports "CPU 96 cores Kunpeng 920-4826"

dann frazier (dannf)
Changed in kunpeng920:
status: Fix Committed → Fix Released
Taihsiang Ho (taihsiangho) wrote :

Cool! Thanks @Dann for the update!
