What Oleg described in https://bugs.launchpad.net/mos/+bug/1579037/comments/8 sounds like a race condition in Nova. I checked the code: we intentionally do not deallocate the port when an instance is rescheduled, and at the same time no additional allocation should be performed on another compute node if the original allocation succeeded. In the logs we clearly see that both nova-compute services allocated a port in Neutron, which eventually caused the test case to fail, as it asserts on the number of ports (1 != 2).
That being said, I think it's a valid bug, but the user impact is moderate: you might end up with allocated but unused ports in Neutron if an instance failed to boot on one compute node and was rescheduled.
Again, this seems to be a race condition, as the check we perform in the Nova code should have yielded false and skipped the additional allocation. I checked the pass rate of this test case both upstream and downstream: it looks like we only had 2 failures out of 92 runs (roughly 2%), so this is clearly not a blocker for 9.0.
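To illustrate the kind of check-then-allocate race suspected here, below is a minimal sketch. It is not the Nova code: FakeNeutron, allocate_if_needed and the instance and host names are hypothetical, and the "stale view of existing ports" is only an assumption about how the check could misfire, not something confirmed from the logs.

```python
# Hypothetical sketch only: none of these names come from Nova or Neutron.


class FakeNeutron:
    """In-memory stand-in for a port service."""

    def __init__(self):
        self.ports = []  # list of (instance_id, host) tuples

    def list_ports(self, instance_id):
        return [p for p in self.ports if p[0] == instance_id]

    def create_port(self, instance_id, host):
        self.ports.append((instance_id, host))


def allocate_if_needed(neutron, instance_id, host, known_ports):
    """Allocate a port only if the caller's view shows none.

    known_ports is the caller's view of existing ports, which may be
    stale (assumption: this is where the race window would be).
    Returns True if a port was created, False if allocation was skipped.
    """
    if known_ports:
        return False
    neutron.create_port(instance_id, host)
    return True


# Racy interleaving: both hosts snapshot the port list before either
# allocation is visible, so both pass the check.
neutron = FakeNeutron()
view_a = neutron.list_ports("vm-1")
view_b = neutron.list_ports("vm-1")
allocate_if_needed(neutron, "vm-1", "compute-a", view_a)
allocate_if_needed(neutron, "vm-1", "compute-b", view_b)  # stale view
port_count = len(neutron.list_ports("vm-1"))

# Expected interleaving: the second host sees the first allocation
# and skips its own.
neutron_ok = FakeNeutron()
allocate_if_needed(neutron_ok, "vm-2", "compute-a",
                   neutron_ok.list_ports("vm-2"))
created_again = allocate_if_needed(neutron_ok, "vm-2", "compute-b",
                                   neutron_ok.list_ports("vm-2"))
```

In the racy interleaving the instance ends up with two ports, which matches the 1 != 2 assertion failure; in the expected interleaving the second call is skipped and only one port exists.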
I suggest we downgrade the importance to Medium and continue to work on this in 10.0.