STX-Openstack: Failed to activate binding for port for live migration
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | In Progress | Undecided | Thales Elero Cervi |
Bug Description
Brief Description
-----------------
When live migrating a VM booted from an image with the hw_cpu_policy property set, the migration intermittently fails with a "Failed to activate binding for port..." error and the VM is left in ERROR state.
Severity
--------
Minor: System/Feature is usable with minor issue
Steps to Reproduce
------------------
1. Create an image with the hw_cpu_policy property set
2. Boot a VM with said image and any flavor
3. Live migrate this VM (a CLI sketch of these steps is given below)
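A minimal CLI sketch of these steps (hedged: image, flavor and network names are placeholders, and the dedicated CPU policy is assumed from the test case name "2-dedicated-image-volume"):

  # Assumed image property, based on the test case name
  openstack image set --property hw_cpu_policy=dedicated <image>

  # Boot a VM from that image (any flavor)
  openstack server create --image <image> --flavor <flavor> \
    --network <tenant-net> --key-name keypair-tenant1 <vm-name>

  # Live migrate it and check the result (older clients use '--live <host>' instead of '--live-migration')
  openstack server migrate --live-migration <vm-name>
  openstack server show <vm-name> -c status -c fault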
Expected Behavior
------------------
VM is live migrated
Actual Behavior
----------------
VM failed to live migrate with "Failed to activate binding for port..." error
Reproducibility
---------------
Intermittent - Passed on retest
System Configuration
--------------------
Bare metal AIO-DX
Branch/Pull Time/Commit
-----------------------
https:/
Last Pass
---------
Last tested on CentOS: https:/
Timestamp/Logs
--------------
+----------------------+------------------------------------------+
| Field                | Value                                    |
+----------------------+------------------------------------------+
| OS-DCF:diskConfig    | MANUAL                                   |
| OS-EXT-…             | …                                        |
| OS-EXT-…             | …                                        |
| OS-EXT-…             | …                                        |
| OS-EXT-…             | …                                        |
| OS-EXT-…             | …                                        |
| OS-EXT-…             | …                                        |
| OS-EXT-STS:vm_state  | error                                    |
| OS-SRV-…             | …                                        |
| OS-SRV-…             | …                                        |
| accessIPv4           |                                          |
| accessIPv6           |                                          |
| addresses            | tenant1-…                                |
| config_drive         |                                          |
| created              | 2023-03-…                                |
| fault                | {'code': 500, 'created': '2023-03-…      |
| flavor               | cpu_pol (51dbd000-…)                     |
| hostId               | 8df03ede866f186…                         |
| id                   | c8598b68-…                               |
| image                | N/A (booted from volume)                 |
| key_name             | keypair-tenant1                          |
| name                 | tenant1-…                                |
| project_id           | 4a5485e632da44d…                         |
| properties           |                                          |
| security_groups      | name='default'                           |
|                      | name='default'                           |
| status               | ERROR                                    |
| updated              | 2023-03-…                                |
| user_id              | c61a798d6178409…                         |
| volumes_attached     | id='0d2ef2e5-…                           |
+----------------------+------------------------------------------+
+----+--------------+-------------+-----------+----------------+--------------+-----------+--------+---------------+------------+------------+------------+------------+------+------------+---------+
| Id | UUID         | Source Node | Dest Node | Source Compute | Dest Compute | Dest Host | Status | Instance UUID | Old Flavor | New Flavor | Created At | Updated At | Type | Project ID | User ID |
+----+--------------+-------------+-----------+----------------+--------------+-----------+--------+---------------+------------+------------+------------+------------+------+------------+---------+
| 11 | 85db3a9a-…   | …           | …         | …              | …            | …         | …      | …             | …          | …          | …          | …          | …    | …          | …       |
+----+--------------+-------------+-----------+----------------+--------------+-----------+--------+---------------+------------+------------+------------+------------+------+------------+---------+
Test Activity
-------------
Regression Testing
Workaround
----------
Retry the live migration
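If the retry is attempted while the server is still in ERROR state (as in the log above), it may first need its state reset. A hedged sketch, assuming admin credentials and a placeholder server name:

  # Reset the stuck server back to ACTIVE before retrying (admin-only action)
  openstack server set --state active <server>
  # Retry the live migration (older clients use '--live <host>' instead of '--live-migration')
  openstack server migrate --live-migration <server>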
description: updated
Changed in starlingx:
assignee: nobody → Thales Elero Cervi (tcervi)
status: New → In Progress
As per this LP description, the issue is intermittent. Indeed, this VM migration error message, 'Failed to activate binding for port 5d7670a1-f18d-4b82-9c2a-1ec1dee6a835 and host controller-1.', is usually related to intermittent service problems on the target host (e.g. network agents, hypervisor).
I executed the case_8_test_cpu_pol_vm_actions[2-dedicated-image-volume] test 20 times in a loop and got no errors. Also, this LP description states that the scenario passed when retested (executed alone).
All this to say that the migration issue looks to me like a side effect of something else that left controller-1 in a bad state. I know that the Regression Suite (which executes this case_8_test_cpu_pol_vm_actions) runs after the Sanity Suite, and case_8_test_cpu_pol_vm_actions seems to be the first Test Case that tries a VM migration. Also, the Sanity Suite's last Test Case is "test_lock_with_vms", and after this test it might be the case that controller-1 takes too long to reestablish its services after the unlock, or (even worse) that the services fail to be reestablished at all.
I will dive deeper into it and update this LP accordingly.
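For reference, a quick way to check whether controller-1's services recovered after the Sanity lock/unlock would be something along these lines (a sketch only; it assumes the usual STX-Openstack deployment, with the application pods in the 'openstack' namespace and controller-1 as the migration target):

  # Platform-side view of host and alarm state
  system host-list
  fm alarm-list

  # Neutron agents and the nova-compute service on the target host
  openstack network agent list --host controller-1
  openstack compute service list --host controller-1

  # STX-Openstack pods scheduled on controller-1 that are not Running
  kubectl get pods -n openstack -o wide | grep controller-1 | grep -v Running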