DPDK instances are failing to start: Failed to bind socket to /run/libvirt-vhost-user/vhu3ba44fdc-7c: No such file or directory

Bug #1943863 reported by Vladimir Grevtsev
Affects                       Status        Importance  Assigned to
OpenStack Nova Compute Charm  Invalid       Undecided   Unassigned
charm-layer-ovn               Fix Released  High        Liam Young
charm-ovn-chassis             Fix Released  High        Unassigned

Bug Description

== Env
focal/ussuri + ovn, latest stable charms
juju status: https://paste.ubuntu.com/p/2725tV47ym/
Hardware: Huawei CH121 V5 with MZ532 4*25GE mezzanine card (PCIe 3.0 x16) NICs, plus a manually installed PMD for DPDK enablement (librte-pmd-hinic20.0 package)

== Problem description

DPDK instances can't be launched after a fresh deployment (focal/ussuri + OVN, latest stable charms); they fail with the error below:

$ os server show dpdk-test-instance -f yaml
OS-DCF:diskConfig: MANUAL
OS-EXT-AZ:availability_zone: ''
OS-EXT-SRV-ATTR:host: null
OS-EXT-SRV-ATTR:hypervisor_hostname: null
OS-EXT-SRV-ATTR:instance_name: instance-00000218
OS-EXT-STS:power_state: NOSTATE
OS-EXT-STS:task_state: null
OS-EXT-STS:vm_state: error
OS-SRV-USG:launched_at: null
OS-SRV-USG:terminated_at: null
accessIPv4: ''
accessIPv6: ''
addresses: ''
config_drive: 'True'
created: '2021-09-15T18:51:00Z'
fault:
  code: 500
  created: '2021-09-15T18:52:01Z'
  details: "Traceback (most recent call last):\n File \"/usr/lib/python3/dist-packages/nova/conductor/manager.py\"\
    , line 651, in build_instances\n scheduler_utils.populate_retry(\n File \"\
    /usr/lib/python3/dist-packages/nova/scheduler/utils.py\", line 919, in populate_retry\n\
    \ raise exception.MaxRetriesExceeded(reason=msg)\nnova.exception.MaxRetriesExceeded:\
    \ Exceeded maximum number of retries. Exceeded max scheduling attempts 3 for instance\
    \ 1bb2d1b7-e2e9-4d76-a346-a9b06ff22c73. Last exception: internal error: process\
    \ exited while connecting to monitor: 2021-09-15T18:51:53.485265Z qemu-system-x86_64:\
    \ -chardev socket,id=charnet0,path=/run/libvirt-vhost-user/vhu3ba44fdc-7c,server:\
    \ Failed to bind socket to /run/libvirt-vhost-user/vhu3ba44fdc-7c: No such file\
    \ or directory\n"
  message: 'Exceeded maximum number of retries. Exceeded max scheduling attempts 3
    for instance 1bb2d1b7-e2e9-4d76-a346-a9b06ff22c73. Last exception: internal error:
    process exited while connecting to monitor: 2021-09-15T18:51:53.485265Z qemu-system-x86_64:
    -chardev '
flavor: m1.medium.project.dpdk (4f452aa3-2b2c-4f2e-8465-5e3c2d8ec3f1)
hostId: ''
id: 1bb2d1b7-e2e9-4d76-a346-a9b06ff22c73
image: auto-sync/ubuntu-bionic-18.04-amd64-server-20210907-disk1.img (3851450e-e73d-489b-a356-33650690ed7a)
key_name: ubuntu-keypair
name: dpdk-test-instance
project_id: cdade870811447a89e2f0199373a0d95
properties: ''
status: ERROR
updated: '2021-09-15T18:52:01Z'
user_id: 13a0e7862c6641eeaaebbde1ae096f9e
volumes_attached: ''

For the record, a "generic" instances (e.g non-DPDK/non-SRIOV) are scheduling/starting without any issues.

== Steps to reproduce

openstack network create --external --provider-network-type vlan --provider-segment xxx --provider-physical-network dpdkfabric ext_net_dpdk
openstack subnet create --allocation-pool start=<redacted>,end=<redacted> --network ext_net_dpdk --subnet-range <redacted>/23 --gateway <redacted> --no-dhcp ext_net_dpdk_subnet

openstack aggregate create --zone nova dpdk
openstack aggregate set --property dpdk=true dpdk

openstack aggregate add host dpdk <fqdn>

openstack aggregate show dpdk --max-width=80

openstack flavor set --property aggregate_instance_extra_specs:dpdk=true --property hw:mem_page_size=large m1.medium.dpdk

openstack server create --config-drive true --network ext_net_dpdk --key-name ubuntu-keypair --image focal --flavor m1.medium.dpdk dpdk-test-instance
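As a sanity check before launching, it is worth confirming that the flavor carries the expected extra specs and that the hypervisor actually has hugepages available (standard commands; run the second one on the compute host):

$ openstack flavor show m1.medium.dpdk -c properties
# grep Huge /proc/meminfo    # HugePages_Total/HugePages_Free should be non-zero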

== Analysis
[before redeployment] nova-compute log : https://pastebin.canonical.com/p/FgPYNb3bPj/
[fresh deployment] juju crashdump: https://drive.google.com/file/d/1W_w3CAUq4ggp4alDnpCk08mSaCL6Uaxk/view?usp=sharing

<on hypervisor>

# ovs-vsctl get open_vswitch . other_config
{dpdk-extra="--pci-whitelist 0000:3e:00.0 --pci-whitelist 0000:40:00.0", dpdk-init="true", dpdk-lcore-mask="0x1000001", dpdk-socket-mem="4096,4096"}
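For reference, this other_config is what the charm renders; set by hand, the equivalent would be something like the following (standard ovs-vsctl syntax, with the PCI addresses taken from the output above):

# ovs-vsctl set Open_vSwitch . other_config:dpdk-init=true
# ovs-vsctl set Open_vSwitch . other_config:dpdk-lcore-mask=0x1000001
# ovs-vsctl set Open_vSwitch . other_config:dpdk-socket-mem="4096,4096"
# ovs-vsctl set Open_vSwitch . other_config:dpdk-extra="--pci-whitelist 0000:3e:00.0 --pci-whitelist 0000:40:00.0"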

# cat /etc/tmpfiles.d/nova-ovs-vhost-user.conf
# Create libvirt writeable directory for vhost-user sockets
d /run/libvirt-vhost-user 0770 libvirt-qemu kvm - -
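That entry asks systemd to create /run/libvirt-vhost-user with mode 0770, owner libvirt-qemu and group kvm. The catch is that systemd-tmpfiles-setup.service only processes tmpfiles.d at boot; if the file is dropped in afterwards (presumably what the charm does here) and nothing runs systemd-tmpfiles --create, the directory never appears until the next reboot. A quick way to check:

# systemctl status systemd-tmpfiles-setup.service   # one-shot unit, runs at boot
# ls -ld /run/libvirt-vhost-user                    # absent until --create is run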

In fact, none of the compute hosts have that directory: https://paste.ubuntu.com/p/XJRFypbMQf/ (however, the error from this issue doesn't appear on non-DPDK hosts).

After running the command below, the missing /run/libvirt-vhost-user directory appeared and the VM could be scheduled and started. However, although it started, it wasn't reachable over the network.

# systemd-tmpfiles --create
# stat /run/libvirt-vhost-user
  File: /run/libvirt-vhost-user
  Size: 40 Blocks: 0 IO Block: 4096 directory
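Since tmpfiles.d entries are (re)applied at every boot, the manual systemd-tmpfiles --create should only be needed once per already-running host. To confirm the directory matches the tmpfiles line:

# stat -c '%a %U:%G %n' /run/libvirt-vhost-user   # expected: 770 libvirt-qemu:kvm /run/libvirt-vhost-user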

Revision history for this message
Vladimir Grevtsev (vlgrevtsev) wrote :

+ field-critical, as one of the core cloud functionalities is affected and there's no known workaround yet.

description: updated
Revision history for this message
Vladimir Grevtsev (vlgrevtsev) wrote :

ovs-vsctl show; sudo ovs-appctl bond/show dpdk-bond0: https://pastebin.canonical.com/p/jM6gzp2MX8/

Revision history for this message
Vladimir Grevtsev (vlgrevtsev) wrote :

Tried downgrading openvswitch debs: https://paste.ubuntu.com/p/fk2pfQyxDK/

After that, the original issue isn't there anymore (e.g. the DPDK instance can be scheduled). However, it still lacks network connectivity.
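For reference, the downgrade boils down to something like the following (a sketch based on the paste above; the exact package set may differ per host):

# apt-get install --allow-downgrades \
    openvswitch-common=2.13.3-0ubuntu0.20.04.1 \
    openvswitch-switch=2.13.3-0ubuntu0.20.04.1 \
    openvswitch-switch-dpdk=2.13.3-0ubuntu0.20.04.1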

What I've spotted is that after the VM creation, the OVS service is restarted (I guess it shouldn't do that?):

# service ovs-vswitchd status
● ovs-vswitchd.service - Open vSwitch Forwarding Unit
     Loaded: loaded (/lib/systemd/system/ovs-vswitchd.service; static; vendor preset: enabled)
     Active: activating (start) since Fri 2021-09-17 12:18:52 UTC; 4s ago
Cntrl PID: 936872 (ovs-ctl)
      Tasks: 16 (limit: 37276)
     Memory: 159.6M
     CGroup: /system.slice/ovs-vswitchd.service
             ├─936872 /bin/sh /usr/share/openvswitch/scripts/ovs-ctl --no-ovsdb-server --no-monitor --system-id=random --no-record-hostname start
             ├─936911 ovs-vswitchd unix:/var/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfile:info --mlockall --no-chdir --log-file=/var/log/openv>
             └─936912 ovs-vswitchd unix:/var/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfile:info --mlockall --no-chdir --log-file=/var/log/openv>
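Rather than catching the unit mid-restart in service status, the journal shows the restart history directly (standard journalctl usage; the time window is arbitrary):

# journalctl -u ovs-vswitchd --since=-30min --no-pager | grep -Ei 'start|stop'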

It might also be a red herring, but another suspicious error is present in the logs:

2021-09-17T12:18:56.458Z|00077|netdev_linux|INFO|ioctl(SIOCGIFINDEX) on genev_sys_6081 device failed: No such device
2021-09-17T12:18:56.458Z|00078|bridge|INFO|bridge br-int: added interface ovn-u0400s-8 on port 12
2021-09-17T12:18:56.462Z|00079|netdev_linux|INFO|ioctl(SIOCGIFINDEX) on genev_sys_6081 device failed: No such device
2021-09-17T12:18:56.463Z|00080|bridge|INFO|bridge br-int: added interface ovn-u0400s-6 on port 10
2021-09-17T12:18:56.467Z|00081|netdev_linux|INFO|ioctl(SIOCGIFINDEX) on genev_sys_6081 device failed: No such device
2021-09-17T12:18:56.467Z|00082|bridge|INFO|bridge br-int: added interface ovn-u0400s-7 on port 11
2021-09-17T12:18:56.471Z|00083|netdev_linux|INFO|ioctl(SIOCGIFINDEX) on genev_sys_6081 device failed: No such device
2021-09-17T12:18:56.479Z|00084|dpif_netdev|INFO|PMD thread on numa_id: 0, core id: 21 created.
2021-09-17T12:18:56.485Z|00085|dpif_netdev|INFO|PMD thread on numa_id: 1, core id: 88 created.
2021-09-17T12:18:56.485Z|00086|dpif_netdev|INFO|There are 1 pmd threads on numa node 0
2021-09-17T12:18:56.485Z|00087|dpif_netdev|INFO|There are 1 pmd threads on numa node 1
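The PMD thread and rxq placement those log lines describe can also be inspected on demand (standard ovs-appctl commands):

# ovs-appctl dpif-netdev/pmd-rxq-show
# ovs-appctl dpif-netdev/pmd-stats-show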

Removed the VM, cleared the logs, re-created the VM:

echo '' > /var/log/openvswitch/ovs-vswitchd.log
openstack server create --config-drive true --network ext_net_dpdk --key-name ubuntu-keypair --image auto-sync/ubuntu-focal-20.04-amd64-server-20210907-disk1.img --flavor m1.medium.dpdk dpdk-test-instance

it's restarting again...

# service ovs-vswitchd status
● ovs-vswitchd.service - Open vSwitch Forwarding Unit
     Loaded: loaded (/lib/systemd/system/ovs-vswitchd.service; static; vendor preset: enabled)
     Active: activating (start) since Fri 2021-09-17 12:24:40 UTC; 3s ago

but the patch interface and DPDK bonds are present: https://paste.ubuntu.com/p/FN5GbbhmBS/
ovs-vswitchd.log: https://paste.ubuntu.com/p/XTZT8HCQPz/ (capture start: log cleanup, capture end: VM is in ACTIVE state)

Revision history for this message
Corey Bryant (corey.bryant) wrote :

I've asked Vladimir if they can deploy on the same machines without DPDK to see if we can rule DPDK in or out.

Revision history for this message
Corey Bryant (corey.bryant) wrote :

The difference between openvswitch 2.13.3-0ubuntu0.20.04.1 and 2.13.3-0ubuntu0.20.04.2 is:
https://launchpadlibrarian.net/557518135/openvswitch_2.13.3-0ubuntu0.20.04.1_2.13.3-0ubuntu0.20.04.2.diff.gz

In case it's useful, the dates of openvswitch publishing were:
2.13.3-0ubuntu0.20.04.1 was published on 2021-05-16.
2.13.3-0ubuntu0.20.04.2 was published on 2021-09-08.
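To check which of the two builds a given hypervisor is actually running (standard apt command):

# apt policy openvswitch-switch openvswitch-switch-dpdk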

Revision history for this message
Vladimir Grevtsev (vlgrevtsev) wrote :

As requested by Corey, we have redeployed a single DPDK-enabled host with DPDK disabled (making it in effect a "generic" compute node) and can confirm that instances are reachable over the provider network when DPDK is not enabled.

Revision history for this message
Liam Young (gnuoy) wrote :

The crashdump provided seems to cover the period before the `systemd-tmpfiles --create` "fix" was applied. Could you provide a crashdump that includes logs covering the launch of a DPDK guest? If possible, please include the UUID of the guest and the hypervisor that is hosting it.
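For reference, crashdumps are normally gathered with the juju-crashdump tool (a sketch; the installation route is one option among several):

$ pip install juju-crashdump   # also available via other channels
$ juju-crashdump               # collects logs from every unit in the current model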

Revision history for this message
Vladimir Grevtsev (vlgrevtsev) wrote :

We did some more investigation, and it looks like we have a reproducer only when using a specific compute host (which was previously converted to a non-DPDK host). We will redeploy this host with DPDK again and see if it makes any difference.

Reducing to field-high, as almost all of the hosts are now operating normally without any DPDK-related issues.

Revision history for this message
Vladimir Grevtsev (vlgrevtsev) wrote :

> We will redeploy this host with DPDK again and see, if it makes any difference.

Redeployment completed and we can confirm two things:

1) the original problem is gone everywhere and cannot be reproduced anymore
2) the networking-related problem is localized to a specific hardware node. We will open a separate bug to follow up on that.

Given the above, I'll set this bug to Invalid and reopen it if the original issue reproduces again. Thanks all for your time on this!

Changed in charm-nova-compute:
status: New → Invalid
Liam Young (gnuoy)
Changed in neutron (Ubuntu):
status: New → Invalid
Liam Young (gnuoy)
no longer affects: neutron
no longer affects: neutron (Ubuntu)
Changed in charm-layer-ovn:
status: New → Confirmed
importance: Undecided → High
assignee: nobody → Liam Young (gnuoy)
Revision history for this message
Nobuto Murata (nobuto) wrote :

charm-ovn-chassis requires a rebuild to pull the charm-layer-ovn change.
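Once the rebuilt charm is published to the stable channel, existing deployments would pick it up with a normal charm upgrade (a sketch; the application name ovn-chassis is deployment-specific):

$ juju upgrade-charm ovn-chassis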

Changed in charm-layer-ovn:
status: Confirmed → Fix Committed
Changed in charm-layer-ovn:
milestone: none → 21.10
Revision history for this message
Billy Olsen (billy-olsen) wrote :

Rebuilds of the charms happen as part of the release process, which should pick this up. Marking the charm as Fix Committed as well.

Changed in charm-ovn-chassis:
status: New → Fix Committed
importance: Undecided → High
milestone: none → 21.10
Changed in charm-layer-ovn:
status: Fix Committed → Fix Released
Changed in charm-ovn-chassis:
status: Fix Committed → Fix Released