PCI leaks when multiple detach operations are performed in parallel

Bug #2033247 reported by Amit Gupta
Affects: OpenStack Compute (nova)
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

Description
===========
We are using the OpenStack Yoga release and need to attach/detach ports to a VM dynamically. We are observing PCI leaks while performing multiple detach operations simultaneously.

PCI devices start leaking when one of the OpenStack tables, "instance_extra", is exhausted (the 'pci_requests' column exceeds its maximum size). Once this table is exhausted, OpenStack is no longer able to attach/detach a port and it starts leaking PCI devices, because it cannot perform any action due to the exception below. This table is used by OpenStack to record all historical "PCIRequest" records for all interfaces attached to a VM.

DBDataError (pymysql.err.DataError) (1406, "Data too long for column 'pci_requests' at row 1")
[SQL: UPDATE instance_extra SET updated_at=%(updated_at)s, device_metadata=%(device_metadata)s, numa_topology=%(numa_topology)s, pci_requests=%(pci_requests)s, flavor=%(flavor)s WHERE instance_extra.deleted = %(deleted_1)s AND instance_extra.instance_uuid = %(instance_uuid_1)s]
[parameters: {'updated_at': datetime.datetime(2023, 8, 3, 14, 39, 56, 116791), 'device_metadata': '{"nova_object.name": "InstanceDeviceMetadata", "nova_object.namespace": "nova", "nova_object.version": "1.0", "nova_object.data": {"devices": [{"nova ... (6168 characters truncated) ... :00:14.0"}, "nova_object.changes": ["address"]}}, "nova_object.changes": ["bus", "vf_trusted", "mac", "vlan"]}]}, "nova_object.changes": ["devices"]}', 'numa_topology': '{"nova_object.name": "InstanceNUMATopology", "nova_object.namespace": "nova", "nova_object.version": "1.3", "nova_object.data": {"cells": [{"nova_obj ... (946 characters truncated) ... nges": ["id", "cpu_pinning_raw", "cpuset_reserved"]}], "emulator_threads_policy": null}, "nova_object.changes": ["emulator_threads_policy", "cells"]}', 'pci_requests': '[{"count": 1, "spec": [{"physical_network": "sriov3", "remote_managed": "False"}], "alias_name": null, "is_new": false, "numa_policy": null, "request ... (65464 characters truncated) ... "is_new": false, "numa_policy": null, "request_id": "2d5ef4bd-d499-4e62-a617-75ed4535c930", "requester_id": "f4fabf3b-ccc1-4117-bb30-de53a9a55d66"}]', 'flavor': '{"cur": {"nova_object.name": "Flavor", "nova_object.namespace": "nova", "nova_object.version": "1.2", "nova_object.data": {"id": 71, "name": "SOLTEST ... (481 characters truncated) ... "2023-07-04T05:45:02Z", "updated_at": null, "deleted_at": null, "deleted": false}, "nova_object.changes": ["extra_specs"]}, "old": null, "new": null}', 'deleted_1': 0, 'instance_uuid_1': 'dd5d0568-1aad-47ed-8418-78a6c75363cc'}]

I validated the count of "PCIRequest" records stored in the 'pci_requests' field of the "instance_extra" table and found that 260 records were stored, which is roughly equal to the number of attach operations performed on this node before we started seeing the PCI leak, as reported by Mohit in his mail below. This also indicates that the "PciRequest" record for ports created via the operator is not cleaned up even after the port is detached/deleted.
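For reference, this is roughly how such a count can be checked directly against the cell database; a minimal sketch assuming pymysql, with placeholder host/credentials and the instance UUID taken from the DBDataError above:

# Sketch: count the PCIRequest entries accumulated in the pci_requests column
# of instance_extra for one instance. Host/credentials are placeholders; the
# UUID is the instance that appears in the error message above.
import json

import pymysql

conn = pymysql.connect(host="controller", user="nova",
                       password="secret", database="nova")
try:
    with conn.cursor() as cur:
        cur.execute(
            "SELECT pci_requests FROM instance_extra "
            "WHERE instance_uuid = %s AND deleted = 0",
            ("dd5d0568-1aad-47ed-8418-78a6c75363cc",),
        )
        row = cur.fetchone()
        if row and row[0]:
            requests = json.loads(row[0])  # the column holds a JSON list of PCIRequest dicts
            print("PCIRequest entries:", len(requests))
            print("serialized size (bytes):", len(row[0]))
finally:
    conn.close()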

I suspected that this is another issue in OpenStack when it handles parallel detach requests received from the operator, and I ran an exercise to prove it. In my test case I had one pod with 2 SR-IOV vNICs and performed attach/detach operations in a loop; we hit the PCI leak issue after multiple attach/detach iterations. My hunch was that OpenStack responds to a detach request immediately while the detachment of that interface has not yet completed in the backend, and that the nova service is unable to handle another detach operation for SR-IOV ports while one is still in progress, which leads to the "instance_extra" table backing up. There was barely any gap between two successive detach requests sent to OpenStack, because the operator sends the next detach request immediately after the first one.

To prove this, we introduced a delay of 10 seconds in our code to serialize the detach operations and avoid the possibility that a detach operation from a previous request is still pending in the OpenStack services. With that change I was able to successfully execute the following test cases:
• 600 attach/detach operations for a single pod with 2 SR-IOV vNICs
• 400 attach/detach operations for 4 pods, each with 2 SR-IOV vNICs
• 320 attach/detach operations for 8 pods, each with 2 SR-IOV vNICs

In total we performed ~1300 vNIC attach/detach operations and I do not see any leak with these changes; the PCI pool is completely available after that many attach/detach operations.
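For illustration, a minimal sketch of the serialization idea using python-novaclient (our operator implements the equivalent logic in its own code); the auth URL and credentials are placeholders, and the 10-second delay mirrors the value used in the workaround:

# Sketch of the workaround: detach the ports one at a time and wait for each
# detach to actually finish before sending the next request. Assumes
# python-novaclient; auth_url/credentials below are placeholders.
import time

from keystoneauth1 import loading, session
from novaclient import client as nova_client

loader = loading.get_plugin_loader("password")
auth = loader.load_from_options(
    auth_url="http://controller:5000/v3",   # placeholder
    username="admin", password="secret",     # placeholders
    project_name="admin",
    user_domain_name="Default",
    project_domain_name="Default",
)
nova = nova_client.Client("2.1", session=session.Session(auth=auth))


def detach_serially(server_id, port_ids, delay=10, timeout=120):
    for port_id in port_ids:
        nova.servers.interface_detach(server_id, port_id)
        # Poll until the interface is really gone before moving on.
        deadline = time.time() + timeout
        while time.time() < deadline:
            attached = {i.port_id for i in nova.servers.interface_list(server_id)}
            if port_id not in attached:
                break
            time.sleep(1)
        time.sleep(delay)  # extra settle time between successive detaches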

This proves that OpenStack is not able to handle simultaneous detach operations in the Yoga release.

Steps to reproduce
==================
- Attach 2+ SR-IOV ports to a VM.
- Send the detach requests for those ports in parallel.
- Repeat this iteration multiple times and monitor the "instance_extra" table (see the sketch below).
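A condensed sketch of that reproduction loop, assuming python-novaclient and the `nova` client constructed as in the previous sketch; the server and port UUIDs are placeholders for a VM with SR-IOV ports:

# Sketch of the reproduction loop: attach two SR-IOV ports to a VM, then fire
# the two detach requests back to back with no gap, and repeat while watching
# the instance_extra table. `nova` is the client built in the previous sketch;
# the server and port UUIDs below are placeholders.
import threading
import time

SERVER_ID = "<server-uuid>"
PORT_IDS = ["<sriov-port-uuid-1>", "<sriov-port-uuid-2>"]

for iteration in range(600):
    for port_id in PORT_IDS:
        nova.servers.interface_attach(SERVER_ID, port_id, None, None)
    time.sleep(30)  # give the attachments time to complete

    # Send both detach requests essentially simultaneously.
    detachers = [
        threading.Thread(target=nova.servers.interface_detach,
                         args=(SERVER_ID, port_id))
        for port_id in PORT_IDS
    ]
    for t in detachers:
        t.start()
    for t in detachers:
        t.join()
    time.sleep(30)
    # After each iteration, count the entries in instance_extra.pci_requests
    # (see the pymysql sketch above); with parallel detaches the count keeps growing.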

Expected result
===============
- The "instance_extra" table shall not have stale PCIRequest entries.

Actual result
=============

1) The port is detached but the "instance_extra" table still has stale PCIRequest entries.

Environment
===========
1. OpenStack version: Yoga

  rpm -qa | grep nova
  python3-novaclient-17.7.0-1.el8.noarch
  openstack-nova-conductor-25.2.0-1.el8.noarch
  python3-nova-25.2.0-1.el8.noarch
  openstack-nova-common-25.2.0-1.el8.noarch
  openstack-nova-scheduler-25.2.0-1.el8.noarch
  openstack-nova-api-25.2.0-1.el8.noarch
  openstack-nova-novncproxy-25.2.0-1.el8.noarch

2. Which hypervisor did you use?

  Libvirt + KVM

  What's the version of that?

  libvirt-7.6.0-6.el8.x86_64
  qemu-kvm-6.0.0-33.el8.x86_64

3. Which storage type did you use?

  This issue is storage independent.

4. Which networking type did you use?
   Neutron + openvswitch + sriovnicswitch
