PXE boot loop never timed out with failed deployment

Bug #1918977 reported by Patricia Domingues
This bug affects 1 person
Affects: MAAS
Status: Invalid
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Note: This issue is specifically about the fact that MAAS never timed out the deployment. The reason for the deployment failure itself is being tracked separately in bug 1918978.

This issue occurs on an m400 cartridge (arm64) in an HP Moonshot 1500 chassis.
During a deployment attempt, the machine got stuck in a PXE boot loop, and MAAS never timed out and marked the deployment as failed, as would be expected.

```
2021-03-12T16:00:11.121058+00:00 maas maas.node: [info] ms10-34-mcdivittB0-kernel: Status transition from READY to ALLOCATED
2021-03-12T16:00:11.132363+00:00 maas maas.node: [info] ms10-34-mcdivittB0-kernel: allocated to user patriciasd
2021-03-12T16:01:04.275434+00:00 maas maas.node: [info] ms10-34-mcdivittB0-kernel: Status transition from READY to ALLOCATED
2021-03-12T16:01:04.285661+00:00 maas maas.node: [info] ms10-34-mcdivittB0-kernel: allocated to user patriciasd
2021-03-12T16:01:04.528627+00:00 maas maas.node: [info] ms10-34-mcdivittB0-kernel: Status transition from ALLOCATED to DEPLOYING
2021-03-12T16:01:08.336223+00:00 maas maas.power: [info] Changing power state (on) of node: ms10-34-mcdivittB0-kernel (7wbsag)
2021-03-12T16:01:22.716794+00:00 maas maas.power: [info] Changed power state (on) of node: ms10-34-mcdivittB0-kernel (7wbsag)
```

Since then, the system has been in a PXE boot loop:
```
rackd.log:2021-03-12 16:02:01 provisioningserver.rackdservices.tftp: [info] pxelinux.cfg/73F8F3BA-341C-581E-845B-73285A68CEBB requested by 10.229.49.234
rackd.log:2021-03-12 16:02:01 provisioningserver.rackdservices.tftp: [info] pxelinux.cfg/01-14-58-d0-58-93-92 requested by 10.229.49.234
rackd.log:2021-03-12 16:02:02 provisioningserver.rackdservices.tftp: [info] ubuntu/arm64/ga-16.04/xenial/stable/boot-initrd requested by 10.229.49.234
rackd.log:2021-03-12 16:02:19 provisioningserver.rackdservices.tftp: [info] ubuntu/arm64/ga-16.04/xenial/stable/boot-kernel requested by 10.229.49.234
```
...
Two hours later:
```
rackd.log:2021-03-12 18:15:56 provisioningserver.rackdservices.tftp: [info] pxelinux.cfg/01-14-58-d0-58-93-92 requested by 10.229.49.234
rackd.log:2021-03-12 18:15:56 provisioningserver.rackdservices.tftp: [info] ubuntu/arm64/ga-16.04/xenial/stable/boot-initrd requested by 10.229.49.234
rackd.log:2021-03-12 18:16:13 provisioningserver.rackdservices.tftp: [info] ubuntu/arm64/ga-16.04/xenial/stable/boot-kernel requested by 10.229.49.234
```
...
Four hours later:
```
2021-03-12 20:02:30 provisioningserver.rackdservices.tftp: [info] pxelinux.cfg/73F8F3BA-341C-581E-845B-73285A68CEBB requested by 10.229.49.234
2021-03-12 20:02:30 provisioningserver.rackdservices.tftp: [info] pxelinux.cfg/01-14-58-d0-58-93-92 requested by 10.229.49.234
2021-03-12 20:02:31 provisioningserver.rackdservices.tftp: [info] ubuntu/arm64/ga-16.04/xenial/stable/boot-initrd requested by 10.229.49.234
2021-03-12 20:02:47 provisioningserver.rackdservices.tftp: [info] ubuntu/arm64/ga-16.04/xenial/stable/boot-kernel requested by 10.229.49.234
```
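
The loop can be confirmed from the rack controller log by counting how often the same client fetches the kernel. The following is a minimal sketch, not part of MAAS: it assumes the TFTP log line format shown in the excerpts above, and the function names and thresholds (`looping_clients`, `min_requests`, `min_span_seconds`) are hypothetical.

```python
import re
from collections import defaultdict
from datetime import datetime

# Matches TFTP request lines in the format shown above, with or without
# the "rackd.log:" prefix that grep adds to its output.
LINE_RE = re.compile(
    r"^(?:rackd\.log:)?"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"provisioningserver\.rackdservices\.tftp: \[info\] "
    r"(?P<path>\S+) requested by (?P<ip>\S+)$"
)


def kernel_requests(lines):
    """Map each client IP to the sorted timestamps of its boot-kernel requests."""
    seen = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and m["path"].endswith("/boot-kernel"):
            seen[m["ip"]].append(datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S"))
    return {ip: sorted(stamps) for ip, stamps in seen.items()}


def looping_clients(lines, min_requests=3, min_span_seconds=3600):
    """Return {ip: (request_count, span_seconds)} for clients that keep re-booting.

    The thresholds are arbitrary illustration values, not MAAS settings.
    """
    result = {}
    for ip, stamps in kernel_requests(lines).items():
        span = (stamps[-1] - stamps[0]).total_seconds()
        if len(stamps) >= min_requests and span >= min_span_seconds:
            result[ip] = (len(stamps), span)
    return result


# Lines copied from the log excerpts in this report.
SAMPLE = [
    "rackd.log:2021-03-12 16:02:02 provisioningserver.rackdservices.tftp: [info] ubuntu/arm64/ga-16.04/xenial/stable/boot-initrd requested by 10.229.49.234",
    "rackd.log:2021-03-12 16:02:19 provisioningserver.rackdservices.tftp: [info] ubuntu/arm64/ga-16.04/xenial/stable/boot-kernel requested by 10.229.49.234",
    "rackd.log:2021-03-12 18:16:13 provisioningserver.rackdservices.tftp: [info] ubuntu/arm64/ga-16.04/xenial/stable/boot-kernel requested by 10.229.49.234",
    "2021-03-12 20:02:47 provisioningserver.rackdservices.tftp: [info] ubuntu/arm64/ga-16.04/xenial/stable/boot-kernel requested by 10.229.49.234",
]

if __name__ == "__main__":
    for ip, (count, span) in looping_clients(SAMPLE).items():
        print(f"{ip}: {count} kernel requests over {span / 3600:.1f} hours")
```

On a real system the lines would come from the rack controller log file rather than from `SAMPLE`; only the excerpted requests are included here, so the count understates how often the machine actually rebooted.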

In the MAAS UI:
```
  Fri, 12 Mar. 2021 20:04:57 Performing PXE boot
  Fri, 12 Mar. 2021 20:04:27 Performing PXE boot
  Fri, 12 Mar. 2021 20:03:59 Performing PXE boot
  Fri, 12 Mar. 2021 20:03:29 Performing PXE boot
...
  Fri, 12 Mar. 2021 18:20:57 Performing PXE boot
  Fri, 12 Mar. 2021 18:20:27 Performing PXE boot
  Fri, 12 Mar. 2021 18:19:58 Performing PXE boot
  Fri, 12 Mar. 2021 18:19:28 Performing PXE boot
```

`MAAS version: 2.9.2 (9165-g.b5dc1fd6c)`

Patricia Domingues (patriciasd) wrote :

Serial log of the system in a PXE boot loop.

Patricia Domingues (patriciasd) wrote :

It seems related to the wrong subarchitecture configuration after the system was recommissioned (bug 1918978).

dann frazier (dannf)
description: updated
Alberto Donato (ack) wrote :

I think MAAS is doing the right thing here: the machine keeps trying to PXE boot, and MAAS just reports that this is happening, as if the deployment kept restarting.

It's a bit odd that the machine keeps retrying to boot indefinitely, but the real cause of the issue, as mentioned, is lp:1918978.
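
To picture why a deployment that "keeps restarting" might never hit a timeout, here is a toy model. It is not MAAS code, and this report does not confirm how MAAS tracks the deployment timeout. The sketch assumes, purely for illustration, that every PXE boot event pushes the deadline forward. The function name, the 40-minute timeout, and the 30-second event spacing (taken loosely from the UI event list above) are all hypothetical.

```python
def first_timeout(start, events, timeout, horizon, reset_on_event):
    """Return the time (seconds) at which the timeout fires, or None.

    start          -- time the deployment began
    events         -- times of observed PXE boot events
    timeout        -- allowed seconds before the deployment is marked failed
    horizon        -- end of the observation window
    reset_on_event -- assumption under test: does each event restart the timer?
    """
    deadline = start + timeout
    for t in sorted(events):
        if t >= deadline:
            break  # the timer expired before this event arrived
        if reset_on_event:
            deadline = t + timeout  # each boot attempt pushes the deadline out
    return deadline if deadline <= horizon else None


FOUR_HOURS = 4 * 3600
TIMEOUT = 40 * 60  # hypothetical value, chosen only for the example
pxe_events = range(30, FOUR_HOURS + 1, 30)  # one event every 30 seconds

# Timer restarted by every event: never fires within the four hours observed.
print(first_timeout(0, pxe_events, TIMEOUT, FOUR_HOURS, reset_on_event=True))

# Timer fixed at deployment start: fires after 2400 seconds.
print(first_timeout(0, pxe_events, TIMEOUT, FOUR_HOURS, reset_on_event=False))
```

Under the resetting assumption, a machine that reboots more often than the timeout interval can keep the deployment "alive" indefinitely, which would match the behaviour described in this report.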

Changed in maas:
status: New → Invalid