Live migrations failing due to remote host identification change
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | New | Undecided | Unassigned |
OpenStack Nova Cloud Controller Charm | In Progress | Undecided | Edward Hope-Morley |
Bug Description
I've encountered a cloud where, for some reason (maybe a redeploy of a compute; I'm not sure), I'm hitting this error in nova-compute.log on the source node for an instance migration:
2022-04-22 10:21:17.419 3776 ERROR nova.virt.
@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the RSA key sent by the remote host is
SHA256:<REDACTED FINGERPRINT>.
Please contact your system administrator.
Add correct host key in /root/.
Offending RSA key in /root/.
remove with:
ssh-keygen -f "/root/
RSA host key for <REDACTED IP> has changed and you have requested strict checking.
Host key verification failed.: Connection reset by peer: libvirt.
This interferes with instance migration.
There is a workaround:
* From the source node, manually SSH to the destination node as both the root and nova users.
* Manually clear the offending known_hosts entries reported by the SSH command.
* Verify that once cleared, the root and nova users are able to successfully connect via SSH.
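For clouds with many compute nodes, the clearing step above could be scripted. The following is only a minimal sketch, not something the charm provides: the function names, the example address `192.0.2.10`, and the `/path/to/...` placeholders are all hypothetical, and the real known_hosts paths are whichever ones the SSH error reports for the root and nova users. It relies on `ssh-keygen -R <host> -f <file>`, which removes every key recorded for that host from the given file, and it defaults to a dry run that only prints the commands.

```python
import subprocess


def build_cleanup_commands(dest_host, known_hosts_files):
    """Build one `ssh-keygen -R` invocation per known_hosts file."""
    return [
        ["ssh-keygen", "-R", dest_host, "-f", str(path)]
        for path in known_hosts_files
    ]


def clear_stale_keys(dest_host, known_hosts_files, dry_run=True):
    """Print the commands (dry run) or execute them; return the command list."""
    commands = build_cleanup_commands(dest_host, known_hosts_files)
    for cmd in commands:
        if dry_run:
            # Show what would be run without touching any file.
            print(" ".join(cmd))
        else:
            # Needs write access to the known_hosts file being edited.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Placeholder values: substitute the destination host and the
    # known_hosts paths named in the SSH error for each user.
    clear_stale_keys(
        "192.0.2.10",
        ["/path/to/root/known_hosts", "/path/to/nova/known_hosts"],
    )
```

This covers only the second step. Each user still has to reconnect afterwards so that the new host key is accepted and the connection can be verified.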
Obviously, this is cumbersome in the case of clouds with high numbers of compute nodes. It'd be better if the charm was able to avoid this issue.
Changed in charm-nova-cloud-controller:
assignee: nobody → Edward Hope-Morley (hopem)
Nova-cc has an action to redo all of the host keys when redeploying etc. Check out the "clear-unit-knownhost-cache" action. Also check whether hostname caching is on (config "cache-known-hosts=true"). If this is set to true (the default), then changes in hosts or DNS resolution will result in stale information on the nova-compute units.
If it's neither of those things, then we have a bug.