archive_deleted_rows archives pci_devices records as residue because of 'instance_uuid'
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Fix Released | Medium | melanie witt |
Queens | In Progress | Undecided | melanie witt |
Rocky | In Progress | Undecided | melanie witt |
Stein | Fix Released | Undecided | melanie witt |
Train | Fix Released | Undecided | melanie witt |
Ussuri | Fix Released | Undecided | melanie witt |
Victoria | Fix Released | Undecided | melanie witt |
Bug Description
This is based on a bug reported downstream [1] where after a random amount of time, update_
"traceback": [
"Traceback (most recent call last):",
" File \"/usr/
" rt.update_
" File \"/usr/
" self._update_
" File \"/usr/
" return f(*args, **kwargs)",
" File \"/usr/
" self._update(
" File \"/usr/
" self.pci_
" File \"/usr/
" dev.save()",
" File \"/usr/
" ctxt, self, fn.__name__, args, kwargs)",
" File \"/usr/
" objmethod=
" File \"/usr/
" retry=self.retry)",
" File \"/usr/
" timeout=timeout, retry=retry)",
" File \"/usr/
" retry=retry)",
" File \"/usr/
" raise result",
"RemoteError: Remote error: DBError (pymysql.
Here ^ we see an attempt to insert a nearly empty (NULL fields) record into the pci_devices table. Inspection of the code shows that this can occur if we fail to look up the pci_devices record we want and then try to create a new one [2]:
@pick_context_
def pci_device_
query = model_query(
if query.update(
device = models.PciDevice()
return query.one()
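The update-then-create fallback described above can be illustrated with a minimal, self-contained sketch. This is not nova's code: it uses an in-memory SQLite database instead of MySQL, and the schema, the `make_db` helper, and the `update_or_create` function are all hypothetical names chosen for illustration. The point is only that when the row to update is missing, the fallback insert carries just the fields being updated, so required columns end up NULL and the database rejects the row.

```python
import sqlite3


def make_db():
    """Create an in-memory DB with a simplified, hypothetical pci_devices table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pci_devices ("
        " id INTEGER PRIMARY KEY,"
        " compute_node_id INTEGER NOT NULL,"
        " address TEXT NOT NULL,"
        " vendor_id TEXT NOT NULL,"
        " status TEXT NOT NULL,"
        " instance_uuid TEXT)"
    )
    return conn


def update_or_create(conn, node_id, address, values):
    """Update the matching row; if none matched, insert a new one.

    Hypothetical stand-in for the pattern in the report: the insert only
    carries the lookup keys plus the fields being updated.
    """
    assignments = ", ".join(f"{col} = ?" for col in values)
    cur = conn.execute(
        f"UPDATE pci_devices SET {assignments} "
        "WHERE compute_node_id = ? AND address = ?",
        [*values.values(), node_id, address],
    )
    if cur.rowcount == 0:
        # No row matched, so fall back to creating one. Columns not present
        # in `values` (vendor_id here) are left NULL.
        row = {"compute_node_id": node_id, "address": address, **values}
        cols = ", ".join(row)
        marks = ", ".join("?" for _ in row)
        conn.execute(
            f"INSERT INTO pci_devices ({cols}) VALUES ({marks})",
            list(row.values()),
        )
```

With the row present, freeing the device is a plain update. Once the row has been removed (as the archive did in this bug), the same call falls through to the insert and, in this sketch, raises `sqlite3.IntegrityError` because `vendor_id` is NULL.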
Turns out what was happening was when a request came in to delete an instance that had allocated a PCI device, if the archive_
So after the pci_devices record was swept away, we tried to update the resource tracker as part of the _complete_deletion method in the compute manager and that failed because we could not locate the pci_devices record to free the PCI device (null out the instance_uuid field).
What we need to do here is not to treat the pci_devices table records as instance residue. The records in pci_devices are not tied to instance lifecycles at all and they are managed independently by the PCI trackers.
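A rough sketch of that idea, again against in-memory SQLite rather than nova's real schema or archive code: the sweep looks for every table with an instance_uuid column but skips an explicit exclusion set. The names `NOT_INSTANCE_RESIDUE`, `tables_with_instance_uuid`, and `sweep_residue` are hypothetical, and the sweep simply deletes rows where a real archive would move them to shadow tables.

```python
import sqlite3

# Assumption for this sketch: tables that have an instance_uuid column but
# whose rows are managed independently of the instance lifecycle.
NOT_INSTANCE_RESIDUE = {"pci_devices"}


def tables_with_instance_uuid(conn):
    """Return the names of all tables that have an instance_uuid column."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
    ]
    result = []
    for name in names:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt, pk).
        columns = [row[1] for row in conn.execute(f"PRAGMA table_info({name})")]
        if "instance_uuid" in columns:
            result.append(name)
    return result


def sweep_residue(conn, archived_uuids):
    """Remove rows tied to archived instances, skipping excluded tables.

    Returns a dict mapping each swept table to the number of rows removed.
    """
    uuids = list(archived_uuids)
    removed = {}
    if not uuids:
        return removed
    marks = ", ".join("?" for _ in uuids)
    for table in tables_with_instance_uuid(conn):
        if table in NOT_INSTANCE_RESIDUE:
            continue  # owned by the PCI tracker, not by the instance
        cur = conn.execute(
            f"DELETE FROM {table} WHERE instance_uuid IN ({marks})", uuids
        )
        removed[table] = cur.rowcount
    return removed
```

With this exclusion in place, a pci_devices row that still carries the deleted instance's UUID survives the sweep, so the later update that nulls out instance_uuid can still find it.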
[1] https:/
[2] https:/
Fix proposed to branch: master
Review: https://review.opendev.org/757656