controller subnet and IP address change not reflected in maas
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
MAAS | Triaged | Medium | Unassigned |
Bug Description
MAAS snap version: 2.9.2 (9165-g.c3e7848d1)
After the subnet and IP addresses were changed in a MAAS deployment, one of the controllers still shows its old IP address in MAAS, while the interface on the node shows the new IP address.
Subnet 10.80.0.0/24 was changed to 10.81.0.0/24.
regiond and rackd logs are attached.
15: br305: <BROADCAST,
link/ether 7a:d2:17:84:20:90 brd ff:ff:ff:ff:ff:ff
inet 10.81.0.12/24 brd 10.81.0.255 scope global br305
valid_lft forever preferred_lft forever
inet6 fe80::ec1f:
valid_lft forever preferred_lft forever
maas root rack-controller read output:
"system_id": "rpmpsr",
"ip_addresses": [
]
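A quick way to confirm the mismatch is to check the addresses MAAS reports for a controller against the old subnet. The following is a minimal sketch, assuming the controller JSON has the `system_id` and `ip_addresses` fields shown above; the helper name `stale_addresses` and the sample addresses are hypothetical placeholders, not values taken from the affected deployment.

```python
import ipaddress
import json

# Subnets from this report: the old range and the range it was changed to.
OLD_SUBNET = ipaddress.ip_network("10.80.0.0/24")
NEW_SUBNET = ipaddress.ip_network("10.81.0.0/24")


def stale_addresses(controller, old_subnet=OLD_SUBNET):
    """Return the controller's reported addresses that still fall in old_subnet."""
    stale = []
    for raw in controller.get("ip_addresses", []):
        addr = ipaddress.ip_address(raw)
        # Compare only addresses of the same IP version as the subnet.
        if addr.version == old_subnet.version and addr in old_subnet:
            stale.append(raw)
    return stale


# Hypothetical sample data: the real "ip_addresses" list is not shown above.
sample = json.loads(
    '{"system_id": "rpmpsr", "ip_addresses": ["10.80.0.12", "10.81.0.12"]}'
)
print(sample["system_id"], stale_addresses(sample))
```

Any address this returns is one that MAAS still records from the old range even though the node itself has moved to the new subnet.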
Chain of events:
- maas running on 3 infra nodes as part of an edge deployment
- a LXD cluster was manually created on those same 3 nodes
- the LXD cluster was added to maas as a vm-host (a vip is used to have a single vm-host representing the LXD cluster in maas)
- VMs were created on that vm-host without issues and a ceph pool was then added to the LXD cluster which was visible also in maas in the vm-host details
- a subnet range change was then required for VLAN 305, i.e. a change from 10.80.0.0/24 to 10.81.0.0/24. This subnet was part of fabric 0 and was already assigned to a space. Each infra node had an interface on the subnet, as did the already created VMs.
- the change was applied via netplan on each infra node, and the subnet was updated in the maas gui
- the already deployed VMs were deleted via the gui as they had to be re-deployed to pick up the new subnet
- when checking the IP addresses of the controllers in maas, infra2 still has the old IP address
- 3 VMs remained in an error state; trying to delete them reported an error contacting the vm-host (however, the VMs were correctly deleted from the LXD cluster)
- trying to refresh the vm-host did not work; it ran for a long time and nothing happened
- trying to delete the vm-host resulted in the same behavior (both via GUI and CLI)
- trying to restart the maas snap did not help
- after a day, tried again to delete the vm-host and this time it worked, which also removed the VMs in error state. However, the logs still showed connection attempts to the LXD cluster VIP (and a ceph pool error linked to that).
- trying to add the LXD cluster back did not succeed; it runs for a long time and nothing happens
- the LXD cluster looks functional, LXD CLI and LXD API access looks to be working fine.
Added maas cli output for rack-controllers.