We hit this in Queens CI: during a resize, when conductor goes to swap the instance's allocations over to the migration uuid, the allocations for the instance on the source node are not found, which should never happen:
http://logs.openstack.org/08/516708/3/gate/legacy-tempest-dsvm-py35/4d8d6a3/logs/screen-n-super-cond.txt.gz#_Oct_31_23_18_04_391235
Oct 31 23:18:04.391235 ubuntu-xenial-rax-iad-0000635032 nova-conductor[22376]: ERROR nova.conductor.tasks.migrate [None req-2bd7178b-307e-4342-a156-f9645b7f75a5 tempest-MigrationsAdminTest-834796780 tempest-MigrationsAdminTest-834796780] [instance: 87329b8a-0fa5-4467-b69e-6c43d4633f54] Unable to find existing allocations for instance
Oct 31 23:18:04.421480 ubuntu-xenial-rax-iad-0000635032 nova-conductor[22376]: WARNING nova.scheduler.utils [None req-2bd7178b-307e-4342-a156-f9645b7f75a5 tempest-MigrationsAdminTest-834796780 tempest-MigrationsAdminTest-834796780] Failed to compute_task_migrate_server: Instance 87329b8a-0fa5-4467-b69e-6c43d4633f54 is unacceptable: Instance has no source node allocation: nova.exception.InstanceUnacceptable: Instance 87329b8a-0fa5-4467-b69e-6c43d4633f54 is unacceptable: Instance has no source node allocation
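For context, that error comes from the conductor-side migrate task, which looks up the instance's allocations against the source compute node before swapping them over to the migration uuid. A minimal sketch of that check, assuming a Queens-era SchedulerReportClient (names approximate; not the verbatim upstream code):

    from nova import exception

    def check_source_allocations(reportclient, source_node_uuid, instance_uuid):
        # reportclient is assumed to be a
        # nova.scheduler.client.report.SchedulerReportClient; the method
        # name here is approximate.
        allocs = reportclient.get_allocations_for_consumer_by_provider(
            source_node_uuid, instance_uuid)
        if not allocs:
            # This is the "Instance has no source node allocation" failure
            # above: with no source allocations there is nothing to move to
            # the migration uuid, so the resize is aborted.
            raise exception.InstanceUnacceptable(
                instance_id=instance_uuid,
                reason='Instance has no source node allocation')
        return allocs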
Looking in Placement for that instance, its allocations are created by the scheduler here:
http://logs.openstack.org/08/516708/3/gate/legacy-tempest-dsvm-py35/4d8d6a3/logs/screen-placement-api.txt.gz#_Oct_31_23_18_00_805083
Oct 31 23:18:00.637846 ubuntu-xenial-rax-iad-0000635032 <email address hidden>[15777]: DEBUG nova.api.openstack.placement.requestlog [None req-d285a5a5-704b-4f46-98d5-f6e5ae7f6b4f service placement] Starting request: 104.239.175.193 "PUT /placement/allocations/87329b8a-0fa5-4467-b69e-6c43d4633f54" {{(pid=15780) __call__ /opt/stack/new/nova/nova/api/openstack/placement/requestlog.py:38}}
Oct 31 23:18:00.684152 ubuntu-xenial-rax-iad-0000635032 <email address hidden>[15777]: DEBUG nova.api.openstack.placement.handlers.allocation [None req-d285a5a5-704b-4f46-98d5-f6e5ae7f6b4f service placement] Successfully wrote allocations AllocationList[Allocation(consumer_id=87329b8a-0fa5-4467-b69e-6c43d4633f54,id=23,project_id='d349ceecbff744a2943294be5bb7e427',resource_class='MEMORY_MB',resource_provider=ResourceProvider(ec18595d-7007-47c9-bf13-38c8cf8a8bb0),used=64,user_id='d44d813424704df8996b7d77840283c9'), Allocation(consumer_id=87329b8a-0fa5-4467-b69e-6c43d4633f54,id=24,project_id='d349ceecbff744a2943294be5bb7e427',resource_class='VCPU',resource_provider=ResourceProvider(ec18595d-7007-47c9-bf13-38c8cf8a8bb0),used=1,user_id='d44d813424704df8996b7d77840283c9')] {{(pid=15780) _set_allocations /opt/stack/new/nova/nova/api/openstack/placement/handlers/allocation.py:249}}
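For reference, the request body behind that PUT looks roughly like the following, using the pre-1.12 microversion list format that was current at the time (values pulled from the log line above; an illustration, not a capture of the actual request):

    # Approximate body of
    # PUT /placement/allocations/87329b8a-0fa5-4467-b69e-6c43d4633f54
    alloc_request = {
        'allocations': [
            {
                'resource_provider': {
                    'uuid': 'ec18595d-7007-47c9-bf13-38c8cf8a8bb0',
                },
                'resources': {'MEMORY_MB': 64, 'VCPU': 1},
            },
        ],
        'project_id': 'd349ceecbff744a2943294be5bb7e427',
        'user_id': 'd44d813424704df8996b7d77840283c9',
    }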
And shortly after that we see them deleted:
http://logs.openstack.org/08/516708/3/gate/legacy-tempest-dsvm-py35/4d8d6a3/logs/screen-placement-api.txt.gz#_Oct_31_23_18_00_805083
Oct 31 23:18:00.805083 ubuntu-xenial-rax-iad-0000635032 <email address hidden>[15777]: DEBUG nova.api.openstack.placement.requestlog [None req-1fc99e13-67ca-4180-9a7f-d3a01a219b15 service placement] Starting request: 104.239.175.193 "DELETE /placement/allocations/87329b8a-0fa5-4467-b69e-6c43d4633f54" {{(pid=15780) __call__ /opt/stack/new/nova/nova/api/openstack/placement/requestlog.py:38}}
Oct 31 23:18:00.814342 ubuntu-xenial-rax-iad-0000635032 <email address hidden>[15777]: DEBUG nova.api.openstack.placement.handlers.allocation [None req-1fc99e13-67ca-4180-9a7f-d3a01a219b15 service placement] Successfully deleted allocations AllocationList[Allocation(consumer_id=87329b8a-0fa5-4467-b69e-6c43d4633f54,id=<?>,project_id=<?>,resource_class='MEMORY_MB',resource_provider=ResourceProvider(ec18595d-7007-47c9-bf13-38c8cf8a8bb0),used=64,user_id=<?>), Allocation(consumer_id=87329b8a-0fa5-4467-b69e-6c43d4633f54,id=<?>,project_id=<?>,resource_class='VCPU',resource_provider=ResourceProvider(ec18595d-7007-47c9-bf13-38c8cf8a8bb0),used=1,user_id=<?>)] {{(pid=15780) delete_allocations /opt/stack/new/nova/nova/api/openstack/placement/handlers/allocation.py:307}}
Looking at the compute service logs, what appears to be happening is that the update_available_resource periodic task on the compute host runs in the window between the scheduler creating the allocations against the compute node and the instance being created in the cell database:
http://logs.openstack.org/08/516708/3/gate/legacy-tempest-dsvm-py35/4d8d6a3/logs/screen-n-cpu.txt.gz#_Oct_31_23_18_00_165850
http://logs.openstack.org/08/516708/3/gate/legacy-tempest-dsvm-py35/4d8d6a3/logs/screen-n-cpu.txt.gz#_Oct_31_23_18_00_780729
Oct 31 23:18:00.780729 ubuntu-xenial-rax-iad-0000635032 nova-compute[23251]: DEBUG nova.compute.resource_tracker [None req-9743bef5-318d-4e7e-a71b-26c62cd0af2d tempest-MigrationsAdminTest-834796780 tempest-MigrationsAdminTest-834796780] Instance 87329b8a-0fa5-4467-b69e-6c43d4633f54 has been deleted (perhaps locally). Deleting allocations that remained for this instance against this compute host: {'resources': {'VCPU': 1, 'MEMORY_MB': 64}}. {{(pid=23251) _remove_deleted_instances_allocations /opt/stack/new/nova/nova/compute/resource_tracker.py:1180}}
Oct 31 23:18:00.817854 ubuntu-xenial-rax-iad-0000635032 nova-compute[23251]: INFO nova.scheduler.client.report [None req-9743bef5-318d-4e7e-a71b-26c62cd0af2d tempest-MigrationsAdminTest-834796780 tempest-MigrationsAdminTest-834796780] Deleted allocation for instance 87329b8a-0fa5-4467-b69e-6c43d4633f54
That's because of this code, which gets an InstanceNotFound and assumes the instance was deleted:
https://github.com/openstack/nova/blob/f974e3c3566f379211d7fdc790d07b5680925584/nova/compute/resource_tracker.py#L1169-L1181
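Paraphrasing that block (a sketch, not the verbatim source): for each allocation against this compute node whose consumer isn't a tracked instance, the resource tracker looks the consumer up in the cell database and treats InstanceNotFound as proof the instance was deleted:

    # Sketch of the linked logic; see the GitHub link above for the real code.
    for consumer_uuid, alloc in allocations.items():
        try:
            objects.Instance.get_by_uuid(context, consumer_uuid)
        except exception.InstanceNotFound:
            # Here is the race: during the initial build the instance does
            # not exist in the cell database yet, so a scheduled-but-not-yet-
            # created instance is indistinguishable from a deleted one.
            LOG.debug('Instance %s has been deleted (perhaps locally). '
                      'Deleting allocations that remained for this instance '
                      'against this compute host: %s.', consumer_uuid, alloc)
            self.reportclient.delete_allocation_for_instance(consumer_uuid)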
The compute service shouldn't have access to the nova_api database, so we can't check whether the instance is still represented as a build request; we'll have to rework this deleted-instance logic in the resource tracker to fix this race.
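One possible direction, as a hypothetical sketch only (not a committed fix): make the resource tracker conservative and refuse to delete allocations for a consumer it merely cannot find, leaving cleanup to a path that can positively confirm deletion:

    # Hypothetical alternative for the block sketched above; NOT the actual
    # upstream fix.
    try:
        objects.Instance.get_by_uuid(context, consumer_uuid)
    except exception.InstanceNotFound:
        # The consumer may be an instance that was scheduled but not yet
        # created in this cell, so do NOT delete its allocations here; let
        # a path that can positively confirm deletion clean them up.
        LOG.debug('Consumer %s not found in the cell database; leaving its '
                  'allocations in place to avoid racing with instance '
                  'creation.', consumer_uuid)
        continue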
http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22Unable%20to%20find%20existing%20allocations%20for%20instance%5C%22%20AND%20tags%3A%5C%22screen-n-super-cond.txt%5C%22&from=7d