sunbeam launch failed with port binding error
nova-compute log:
Jun 13 21:55:37 dev openstack-hypervisor.nova-compute[538724]: nova.exception.PortBindingFailed: Binding failed for port 56e88a88-6832-4783-b91d-ad8b5e0a7103, please check neutron logs for more information.
neutron server log:
2023-06-14 16:10:42.899 56 ERROR neutron.plugins.ml2.managers [req-8c0a8453-0749-4bc4-a0da-246022c66180 req-a9bf51ab-535a-4c24-ae17-f23ef8bb9304 63a4f9b6c11c42a8b627cf71d279bfc8 79e7828473a74ad28fc28040aa2e0338 - - 0795e7024e5e49f1a957a669cc4552f6 0795e7024e5e49f1a957a669cc4552f6] Failed to bind port 0452926d-f609-4211-a608-ffc36403499b on host dev.internal.cloudapp.net for vnic_type normal using segments [{'id': '5f6ba483-42f1-4104-9b97-d610cdbfe78f', 'network_type': 'geneve', 'physical_network': None, 'segmentation_id': 1184, 'network_id': '3c87e071-e1af-4ba9-b446-675e4eff92ae'}]
Saw the following warnings:
2023-06-13 21:55:36.443 54 WARNING neutron.plugins.ml2.drivers.ovn.mech_driver.mech_driver [req-9174f292-1e12-4e6c-a26f-38f3a3d9d83d req-00b772fb-a896-4d55-b0e7-5c18539d99d5 63a4f9b6c11c42a8b627cf71d279bfc8 79e7828473a74ad28fc28040aa2e0338 - - 0795e7024e5e49f1a957a669cc4552f6 0795e7024e5e49f1a957a669cc4552f6] Refusing to bind port 56e88a88-6832-4783-b91d-ad8b5e0a7103 due to no OVN chassis for host: dev.internal.cloudapp.net
The warning above shows that no OVN chassis is registered for the host dev.internal.cloudapp.net. It looks like nova-compute and ovn-controller see different hostnames.
More information:
python3 -c "import socket; print(socket.getfqdn())"
dev.internal.cloudapp.net
azureuser@dev:~$ openstack hypervisor list
+--------------------------------------+---------------------------------------------------------+-----------------+--------------+-------+
| ID                                   | Hypervisor Hostname                                     | Hypervisor Type | Host IP      | State |
+--------------------------------------+---------------------------------------------------------+-----------------+--------------+-------+
| 2c41448c-ab3d-454c-a4b8-a822ad522ab5 | dev.3pmbhi1rcrau3nnvk2nd1bwztb.ax.internal.cloudapp.net | QEMU            | x.x.x.x      | up    |
+--------------------------------------+---------------------------------------------------------+-----------------+--------------+-------+
azureuser@dev:~$ hostname -f
dev.3pmbhi1rcrau3nnvk2nd1bwztb.ax.internal.cloudapp.net
azureuser@dev:~$ sunbeam cluster list
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┓
┃ Node                      ┃ Status ┃ Control ┃ Compute ┃ Storage ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━┩
│ dev.internal.cloudapp.net │ up     │ x       │ x       │         │
└───────────────────────────┴────────┴─────────┴─────────┴─────────┘
azureuser@dev:~$ sudo snap get openstack-hypervisor node
Key              Value
node.fqdn        dev.internal.cloudapp.net
node.ip-address  x.x.x.x
getfqdn() --> returns dev.internal.cloudapp.net
hostname -f --> returns dev.3pmbhi1rcrau3nnvk2nd1bwztb.ax.internal.cloudapp.net
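To illustrate why the two names above matter, here is a minimal sketch of the failure mode. It is not neutron's actual code: `can_bind` is a hypothetical stand-in for the exact-match host lookup implied by the "no OVN chassis for host" warning, and the chassis name is an assumption (I have not confirmed which name ovn-controller registered).

```python
def can_bind(binding_host: str, chassis_hostnames: set[str]) -> bool:
    """Hypothetical stand-in for the mechanism driver's check.

    Binding is refused unless a chassis is registered under exactly
    the host name sent in the port binding request.
    """
    return binding_host in chassis_hostnames


# Host name from the neutron warning (matches socket.getfqdn()).
binding_host = "dev.internal.cloudapp.net"

# ASSUMPTION: the chassis registered itself under the `hostname -f` name.
chassis_hostnames = {"dev.3pmbhi1rcrau3nnvk2nd1bwztb.ax.internal.cloudapp.net"}

print(can_bind(binding_host, chassis_hostnames))  # prints False
```

Under that assumption, any difference between the two names is enough to make the lookup miss, even though both refer to the same machine.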
I also see the following message, where <HOSTNAME> and <NODENAME> differ:
Jun 13 21:47:47 dev nova-compute[538724]: 2023-06-13 21:47:47.330 538724 INFO nova.compute.resource_tracker [None req-5eff8636-991c-401f-9d96-e2c8b29144f0 - - - - - -] Compute node record created for dev.internal.cloudapp.net:dev.3pmbhi1rcrau3nnvk2nd1bwztb.ax.internal.cloudapp.net with uuid: 2c41448c-ab3d-454c-a4b8-a822ad522ab5
Sunbeam should do more checks on the FQDN it uses to avoid this kind of situation.
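As a rough idea of what such a check could look like, the sketch below compares the FQDN from `socket.getfqdn()` with the output of `hostname -f` and reports a mismatch. The function names `fqdn_candidates` and `is_consistent` are hypothetical and are not part of sunbeam. A real check might need to cover more sources.

```python
import socket
import subprocess


def fqdn_candidates() -> dict[str, str]:
    """Collect the FQDN as reported by two different lookups."""
    names = {"socket.getfqdn()": socket.getfqdn()}
    try:
        result = subprocess.run(
            ["hostname", "-f"], capture_output=True, text=True, check=True
        )
        names["hostname -f"] = result.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        # `hostname -f` is missing or failed; leave it out of the comparison.
        pass
    return names


def is_consistent(names: dict[str, str]) -> bool:
    """True when every lookup returned the same name."""
    return len(set(names.values())) <= 1


if __name__ == "__main__":
    names = fqdn_candidates()
    for source, name in names.items():
        print(f"{source}: {name}")
    if not is_consistent(names):
        print("WARNING: host name lookups disagree")
```

On the affected machine this would have flagged the disagreement between dev.internal.cloudapp.net and dev.3pmbhi1rcrau3nnvk2nd1bwztb.ax.internal.cloudapp.net before any instance was launched.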
The snap explicitly sets the internal host in nova, but not in neutron.
So nova gets socket.getfqdn(), whereas neutron defaults to socket.gethostname().