Adding overlapping subnets in fabric breaks deployments
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
MAAS | Triaged | Medium | Unassigned |
MAAS documentation | Fix Released | High | Bill Wear |
Bug Description
Here is how to reproduce the issue:
1) MAAS 3.1 (snap) with an existing 10.1.8.0/22 subnet configured and DHCP provided by MAAS. The version currently installed on all region and rack controllers is 3.1.0-10901-
2) Performed a test commissioning and deployment prior to the experiment. Everything worked.
3) Added a new subnet 10.1.10.0/24 via WebUI to fabric-0 which already includes an existing overlapping subnet 10.1.8.0/22. MAAS did not stop me from adding the overlapping network.
4) Tested deploying an already commissioned machine:
- Edited the network interface and put it under the 10.1.10.0/24 as well as 10.1.8.0/22 subnets.
- Tried DHCP and static IP addresses.
- Tried Focal (20.04) and Groovy (20.10)
- In all scenarios the machine performed PXE boot then went into a boot loop causing the deployments to fail.
5) Removed the overlapping subnet and re-tested deployments; they still failed.
6) Rebooted all region (2) and rack (2) controllers.
7) Tested deployments again and they started working again.
Suggested Solution: Do not allow users to add overlapping subnets. This should be possible by validating each new subnet against the existing ones when it is created.
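As a rough illustration of the kind of check suggested here, the sketch below uses Python's standard `ipaddress` module to reject a new CIDR that overlaps an existing one. It is not MAAS code: the function names `find_overlaps` and `validate_new_subnet` are hypothetical, and the assumption is that the existing subnets of the fabric are available as a list of CIDR strings.

```python
import ipaddress


def find_overlaps(new_cidr, existing_cidrs):
    """Return the existing CIDRs that overlap new_cidr (hypothetical helper)."""
    new_net = ipaddress.ip_network(new_cidr)
    return [
        cidr
        for cidr in existing_cidrs
        if ipaddress.ip_network(cidr).overlaps(new_net)
    ]


def validate_new_subnet(new_cidr, existing_cidrs):
    """Raise ValueError if new_cidr overlaps any existing subnet."""
    clashes = find_overlaps(new_cidr, existing_cidrs)
    if clashes:
        raise ValueError(
            f"{new_cidr} overlaps existing subnet(s): {', '.join(clashes)}"
        )


if __name__ == "__main__":
    existing = ["10.1.8.0/22"]  # the subnet already present in this report
    try:
        validate_new_subnet("10.1.10.0/24", existing)
    except ValueError as err:
        print(f"rejected: {err}")
```

With the subnets from this report, 10.1.10.0/24 is rejected because 10.1.8.0/22 spans 10.1.8.0 through 10.1.11.255, while a subnet such as 10.1.12.0/24 would pass.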
Changed in maas-offline-docs:
status: Triaged → Fix Committed
status: Fix Committed → Fix Released
summary:
- [3.1] Adding overlapping subnets in fabric breaks deployments
+ Adding overlapping subnets in fabric breaks deployments
Triaging because I have already seen this weirdness. Subnets aren't intended to overlap. The IP range of one subnet should be unique compared to every other subnet on the same segment. This is mainly because routers can't reliably determine which subnet should get a packet destined for one of the overlapping addresses. That might be what's gumming up the rack controller in this instance, dunno.
That said, I'm not sure whether MAAS should prevent you from doing it; that is, I'm not sure whether it's a doc/troubleshooting bug or a code bug. Either way, we should talk about it.
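To make the ambiguity concrete, here is a small sketch (again plain Python with the standard `ipaddress` module, not anything taken from MAAS) that lists which configured subnets contain a given address. The helper name `matching_subnets` and the sample addresses are assumptions chosen for illustration.

```python
import ipaddress


def matching_subnets(address, cidrs):
    """Return every CIDR in cidrs that contains address (hypothetical helper)."""
    addr = ipaddress.ip_address(address)
    return [cidr for cidr in cidrs if addr in ipaddress.ip_network(cidr)]


if __name__ == "__main__":
    subnets = ["10.1.8.0/22", "10.1.10.0/24"]  # the two subnets from this report
    for ip in ("10.1.9.25", "10.1.10.25"):
        print(ip, "->", matching_subnets(ip, subnets))
```

An address such as 10.1.10.25 falls inside both subnets, so anything that has to pick a single subnet for it has two candidates, whereas 10.1.9.25 matches only 10.1.8.0/22.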