IPv6: All hosts remain offline after booting off the controller-0
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
StarlingX | Fix Released | Critical | Yue Tao | | |
Bug Description
Brief Description
-----------------
Controller-1 and the worker nodes remained offline after being PXE-booted and installed from controller-0. This is seen on a multi-node system w/ IPv6.
For some reason, the host's mgmt. address is configured with an incorrect CIDR prefix length:
10: vlan133@ens801f0: <BROADCAST,
It should have been inet6 fd01:1::4/64.
Severity
--------
Critical
Steps to Reproduce
------------------
On a multi-node system w/ IPv6 config:
- Install controller-0
- Install/Configure the rest of the nodes
Expected Behavior
------------------
All nodes should install successfully and go to an online state
Actual Behavior
----------------
Controller-1 and worker nodes install, but remain in an offline state
Reproducibility
---------------
100% Reproducible on multi-node systems w/ IPv6 config
System Configuration
-------
multi-node systems w/ IPv6 config
Branch/Pull Time/Commit
-------
stx master: Feb 1, 2021
Last Pass
---------
stx master: January 18, 2021
Note: There were build failures due to CENGN corruption from Jan 21 to Jan 28, hence the gap between the last pass and when the issue was discovered.
Timestamp/Logs
--------------
Issue is reproducible
Test Activity
-------------
Sanity
Workaround
----------
None
CVE References
- 2016-10739
- 2017-6519
- 2018-10360
- 2018-1116
- 2018-1122
- 2018-12404
- 2018-1312
- 2018-13139
- 2018-14348
- 2018-14498
- 2018-15473
- 2018-17199
- 2018-18384
- 2018-19519
- 2018-4700
- 2018-5741
- 2018-5742
- 2018-5743
- 2018-8905
- 2019-0220
- 2019-10160
- 2019-10218
- 2019-11068
- 2019-11745
- 2019-12735
- 2019-13232
- 2019-13734
- 2019-16056
- 2019-17006
- 2019-18634
- 2019-3813
- 2019-3880
- 2019-5482
- 2019-6470
- 2019-6477
- 2019-9636
- 2019-9924
- 2019-9948
- 2020-0549
- 2020-10772
- 2020-10878
- 2020-12049
- 2020-12663
- 2020-13817
- 2020-15705
- 2020-15707
- 2020-5208
- 2020-6851
- 2020-8112
- 2020-8617
- 2021-26937
- 2021-3156
Adding investigation by Don Penney:
The nodes are installing fine, and the kickstarts are generating initial network-scripts with BOOTPROTO=dhcp. On the initial boot of the node post-install, the interface configuration is then done via DHCP.
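As a rough aid for confirming the observation above on an installed node, the sketch below parses the text of a network-scripts file and reports whether it requests DHCP. The sample file content and the helper names `parse_ifcfg` and `uses_dhcp` are hypothetical and only for illustration; they are not taken from the kickstarts or from the affected system.

```python
def parse_ifcfg(text: str) -> dict:
    """Parse KEY=VALUE lines from ifcfg-style text into a dict."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def uses_dhcp(text: str) -> bool:
    """Return True if the ifcfg-style text sets BOOTPROTO to dhcp."""
    return parse_ifcfg(text).get("BOOTPROTO", "").lower() == "dhcp"


# Hypothetical file content, assumed for illustration only.
SAMPLE = """\
# generated by kickstart (assumed example)
DEVICE=vlan133
BOOTPROTO=dhcp
ONBOOT=yes
"""

if __name__ == "__main__":
    print(uses_dhcp(SAMPLE))
```

On a real node the text would come from the relevant file under the network-scripts directory rather than from `SAMPLE`.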
The DHCP package was recently upversioned, on Jan 22, from 4.2.5-68 to 4.2.5-82:
https://review.opendev.org/c/starlingx/integ/+/771752
https://review.opendev.org/c/starlingx/tools/+/771744 (dependency update)
Perhaps something in this update is causing the DHCP-assigned address to be configured with a /128 prefix.
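To illustrate why a /128 prefix would matter, here is a minimal sketch using Python's standard `ipaddress` module. It assumes the node received `fd01:1::4/128` instead of the expected `fd01:1::4/64`; the peer address `fd01:1::2` and the helper `on_link` are hypothetical and not taken from the logs. With a /128 the interface's prefix covers only the host's own address, so no other management address is treated as on-link, which is one plausible reason the nodes never come online.

```python
import ipaddress


def on_link(local_iface: str, peer: str) -> bool:
    """Return True if peer lies inside the prefix configured on the local interface."""
    iface = ipaddress.ip_interface(local_iface)
    return ipaddress.ip_address(peer) in iface.network


EXPECTED = "fd01:1::4/64"   # address and prefix the report says the host should have
OBSERVED = "fd01:1::4/128"  # assumption: same address, but with the suspected /128 prefix
PEER = "fd01:1::2"          # hypothetical management address of another host

if __name__ == "__main__":
    print(on_link(EXPECTED, PEER))  # peer is inside fd01:1::/64
    print(on_link(OBSERVED, PEER))  # peer is outside fd01:1::4/128
```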