sunbeam launch failed with port binding errors

Bug #2023931 reported by Hemanth Nakkina
This bug affects 1 person
Affects: OpenStack Snap
Status: Incomplete
Importance: Medium
Assigned to: Hemanth Nakkina
Milestone: (none)

Bug Description

sunbeam launch failed with a port binding error.

nova-compute log:
Jun 13 21:55:37 dev openstack-hypervisor.nova-compute[538724]: nova.exception.PortBindingFailed: Binding failed for port 56e88a88-6832-4783-b91d-ad8b5e0a7103, please check neutron logs for more information.

neutron server log:
2023-06-14 16:10:42.899 56 ERROR neutron.plugins.ml2.managers [req-8c0a8453-0749-4bc4-a0da-246022c66180 req-a9bf51ab-535a-4c24-ae17-f23ef8bb9304 63a4f9b6c11c42a8b627cf71d279bfc8 79e7828473a74ad28fc28040aa2e0338 - - 0795e7024e5e49f1a957a669cc4552f6 0795e7024e5e49f1a957a669cc4552f6] Failed to bind port 0452926d-f609-4211-a608-ffc36403499b on host dev.internal.cloudapp.net for vnic_type normal using segments [{'id': '5f6ba483-42f1-4104-9b97-d610cdbfe78f', 'network_type': 'geneve', 'physical_network': None, 'segmentation_id': 1184, 'network_id': '3c87e071-e1af-4ba9-b446-675e4eff92ae'}]

Saw the following warnings:
2023-06-13 21:55:36.443 54 WARNING neutron.plugins.ml2.drivers.ovn.mech_driver.mech_driver [req-9174f292-1e12-4e6c-a26f-38f3a3d9d83d req-00b772fb-a896-4d55-b0e7-5c18539d99d5 63a4f9b6c11c42a8b627cf71d279bfc8 79e7828473a74ad28fc28040aa2e0338 - - 0795e7024e5e49f1a957a669cc4552f6 0795e7024e5e49f1a957a669cc4552f6] Refusing to bind port 56e88a88-6832-4783-b91d-ad8b5e0a7103 due to no OVN chassis for host: dev.internal.cloudapp.net

The warning above clearly shows that no OVN chassis exists for the host. It seems there is a difference between the hostname as perceived by nova-compute and the one perceived by ovn-controller.

More information:

python3 -c "import socket; print(socket.getfqdn())"
dev.internal.cloudapp.net

azureuser@dev:~$ openstack hypervisor list
+--------------------------------------+---------------------------------------------------------+-----------------+--------------+-------+
| ID | Hypervisor Hostname | Hypervisor Type | Host IP | State |
+--------------------------------------+---------------------------------------------------------+-----------------+--------------+-------+
| 2c41448c-ab3d-454c-a4b8-a822ad522ab5 | dev.3pmbhi1rcrau3nnvk2nd1bwztb.ax.internal.cloudapp.net | QEMU | x.x.x.x | up |
+--------------------------------------+---------------------------------------------------------+-----------------+--------------+-------+

azureuser@dev:~$ hostname -f
dev.3pmbhi1rcrau3nnvk2nd1bwztb.ax.internal.cloudapp.net

azureuser@dev:~$ sunbeam cluster list
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┓
┃ Node ┃ Status ┃ Control ┃ Compute ┃ Storage ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━┩
│ dev.internal.cloudapp.net │ up │ x │ x │ │
└───────────────────────────┴────────┴─────────┴─────────┴─────────┘

azureuser@dev:~$ sudo snap get openstack-hypervisor node
Key Value
node.fqdn dev.internal.cloudapp.net
node.ip-address x.x.x.x

getfqdn() --> returns dev.internal.cloudapp.net
hostname -f --> returns dev.3pmbhi1rcrau3nnvk2nd1bwztb.ax.internal.cloudapp.net
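
For reference, a minimal sketch comparing the two lookups from Python (the notes on which component consumes which value are my reading of the behaviour above, not verified against the code):

python3 - <<'EOF'
import socket

# System hostname as configured on the machine; `hostname -f` starts from
# this and canonicalises it via the resolver.
print("gethostname():", socket.gethostname())

# Resolver-canonicalised FQDN; this matches the node.fqdn value that
# sunbeam stored (dev.internal.cloudapp.net above).
print("getfqdn():    ", socket.getfqdn())
EOF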

And I see the following message, where <HOSTNAME> and <NODENAME> differ:
Jun 13 21:47:47 dev nova-compute[538724]: 2023-06-13 21:47:47.330 538724 INFO nova.compute.resource_tracker [None req-5eff8636-991c-401f-9d96-e2c8b29144f0 - - - - - -] Compute node record created for dev.internal.cloudapp.net:dev.3pmbhi1rcrau3nnvk2nd1bwztb.ax.internal.cloudapp.net with uuid: 2c41448c-ab3d-454c-a4b8-a822ad522ab5

The FQDN detection used in sunbeam should do some more checks to avoid this kind of situation.
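
Something along the lines of the following sketch would catch it early (illustrative only; the function name and where it would hook into sunbeam are assumptions, not existing code):

import socket
import subprocess

def check_fqdn_consistency() -> None:
    """Refuse to proceed if Python's FQDN and the system's `hostname -f`
    disagree, since nova, libvirt and OVN may otherwise end up registering
    the node under different names."""
    py_fqdn = socket.getfqdn()
    sys_fqdn = subprocess.run(
        ["hostname", "-f"], capture_output=True, text=True, check=True
    ).stdout.strip()
    if py_fqdn != sys_fqdn:
        raise RuntimeError(
            f"Hostname mismatch: socket.getfqdn() returned {py_fqdn!r} but "
            f"`hostname -f` returned {sys_fqdn!r}; fix name resolution "
            "before bootstrapping."
        )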

Revision history for this message
James Page (james-page) wrote :

The snap explicitly sets the internal host in nova, but not in neutron.

So nova gets socket.getfqdn() whereas neutron defaults to socket.gethostname().

Revision history for this message
James Page (james-page) wrote :

Actually that's foobar - OVN is set to socket.getfqdn(), as is the host configuration in nova.conf.

So they should match.

Revision history for this message
James Page (james-page) wrote :

Can you check what the host key is set to in:

/var/snap/openstack-hypervisor/common/etc/nova/nova.conf

please

Changed in snap-openstack:
status: New → Incomplete
importance: Undecided → Medium
Revision history for this message
Hemanth Nakkina (hemanth-n) wrote :

nova.conf has the following configuration for host key
host = dev.internal.cloudapp.net

From the log below, in <host>:<nodename> the host name is populated from the host key in the conf (set via sunbeam), while the nodename (hypervisor_hostname) is populated from libvirt host info (libvirt uses gethostname, I guess). That is where the discrepancy comes from.

Jun 13 21:47:47 dev nova-compute[538724]: 2023-06-13 21:47:47.330 538724 INFO nova.compute.resource_tracker [None req-5eff8636-991c-401f-9d96-e2c8b29144f0 - - - - - -] Compute node record created for dev.internal.cloudapp.net:dev.3pmbhi1rcrau3nnvk2nd1bwztb.ax.internal.cloudapp.net with uuid: 2c41448c-ab3d-454c-a4b8-a822ad522ab5
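
A quick way to see both halves of that record from the node itself (a sketch only; it assumes the nova.conf path mentioned above, and that the libvirt-reported nodename tracks the system hostname, per the guess above):

sudo python3 - <<'EOF'
import socket

NOVA_CONF = "/var/snap/openstack-hypervisor/common/etc/nova/nova.conf"

# <host>: the value sunbeam wrote into nova.conf
with open(NOVA_CONF) as conf:
    conf_host = next(
        (line.split("=", 1)[1].strip()
         for line in conf
         if line.split("=", 1)[0].strip() == "host"),
        "<not set>",
    )
print("nova.conf host (<host>):", conf_host)

# <nodename>: assumed to follow the hostname/FQDN the system reports
print("gethostname():          ", socket.gethostname())
print("socket.getfqdn():       ", socket.getfqdn())
EOF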

Revision history for this message
Hemanth Nakkina (hemanth-n) wrote :
Changed in snap-openstack:
assignee: nobody → Hemanth Nakkina (hemanth-n)
Revision history for this message
nikhil kshirsagar (nkshirsagar) wrote :

I'm running into a very similar issue.

ubuntu@crustle:~$ source demo-openrc
ubuntu@crustle:~$ python3 -c "import socket; print(socket.getfqdn())"
crustle.segmaas.1ss
ubuntu@crustle:~$ openstack hypervisor list
HttpException: 403: Client Error for url: http://10.20.21.11/openstack-nova/v2.1/os-hypervisors/detail, Policy doesn't allow os_compute_api:os-hypervisors:list-detail to be performed.
ubuntu@crustle:~$ hostname -f
crustle.segmaas.1ss
ubuntu@crustle:~$ sunbeam cluster list
┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┓
┃ Node ┃ Status ┃ Control ┃ Compute ┃ Storage ┃
┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━┩
│ crustle.segmaas.1ss │ up │ x │ x │ │
└─────────────────────┴────────┴─────────┴─────────┴─────────┘
ubuntu@crustle:~$ sudo snap get openstack-hypervisor node
Key Value
node.fqdn crustle.segmaas.1ss
node.ip-address 10.230.57.128
ubuntu@crustle:~$

Attaching full nova-compute logs from journalctl. Nova logs are at https://pastebin.canonical.com/p/3WvYC3dX9T/ , some logs from the charm container at https://pastebin.canonical.com/p/wBtCTxFJ5k/ , and neutron-server logs at https://pastebin.canonical.com/p/6PnfTKPG5P/

https://pastebin.canonical.com/p/xdk4knBGj2/ has the relevant journalctl logs from when VM creation was attempted using a command like:

openstack server create --flavor m1.small --image "ubuntu" testvm-nikhil

Or even with this approach:

ubuntu@crustle:~$ sunbeam cluster list
┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┓
┃ Node ┃ Status ┃ Control ┃ Compute ┃ Storage ┃
┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━┩
│ crustle.segmaas.1ss │ up │ x │ x │ │
└─────────────────────┴────────┴─────────┴─────────┴─────────┘
ubuntu@crustle:~$ sunbeam launch ubuntu --name test
Launching an OpenStack instance ...
⠦ Creating the OpenStack instance ... Instance creation request failed: Server:4dee2c40-13a6-4b21-8ffc-867d4ed02a77 transitioned to failure state ERROR
Error: Unable to request new instance. Please run `sunbeam configure` first.
ubuntu@crustle:~$ sunbeam openrc > admin-openrc
ubuntu@crustle:~$ sunbeam configure --accept-defaults --openrc admin-openrc
Writing openrc to admin-openrc ... done
ubuntu@crustle:~$ sunbeam launch ubuntu --name test
Launching an OpenStack instance ...
Found sunbeam key in OpenStack!
⠸ Creating the OpenStack instance ... Instance creation request failed: Server:c0f5a495-1af3-40b4-b583-dbdc27fc5393 transitioned to failure state ERROR
Error: Unable to request new instance. Please run `sunbeam configure` first.
ubuntu@crustle:~$ sunbeam configure
Local or remote access to VMs [local/remote] (local):
CIDR of OpenStack external network - arbitrary but must not be in use (10.20.20.0/24):
Populate OpenStack cloud with demo user, default images, flavors etc [y/n] (y):
Username to use for access to OpenStack (demo):
Password to use for access to OpenStack (T3********):
Network range to use for project network (192.168.122.0/24):
List of nameservers guests should use for DNS resolution (10.230.56.2):
Enable ping and SSH access to instances? [y/n]...


Revision history for this message
nikhil kshirsagar (nkshirsagar) wrote :