We ran into this issue as well and, as you correctly stated, it's a result of libvirtd becoming unresponsive. So far we have not been able to resolve the issue at its core, but I can offer you a workaround.
It's a simple script that calls virsh help (which blocks indefinitely when the libvirtd process is unresponsive), waits 30 seconds for a response, and restarts libvirtd if the timeout elapses. Run it out of cron every 5 minutes or at whatever interval you're comfortable with.
https://github.com/metacloud/openstack-tools/blob/master/libvirt/check_fix_libvirtd
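In case the link is unavailable, here is a rough Python sketch of the same idea. It is not the linked script, just an illustration of the probe-then-restart logic described above. The 30 second timeout and the virsh help probe come from the description; the restart command (systemctl restart libvirtd) and the function names are my assumptions, so adjust them for your distro and init system.

```python
#!/usr/bin/env python3
"""Watchdog sketch: restart libvirtd when `virsh help` does not return in time."""
import subprocess

PROBE_CMD = ["virsh", "help"]                       # probe described in the post
RESTART_CMD = ["systemctl", "restart", "libvirtd"]  # assumption: systemd host
TIMEOUT_SECONDS = 30                                # timeout described in the post


def probe_hangs(probe_cmd, timeout):
    """Return True if probe_cmd does not exit within `timeout` seconds."""
    try:
        # subprocess.run kills the child itself once the timeout expires.
        subprocess.run(probe_cmd, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, timeout=timeout)
    except subprocess.TimeoutExpired:
        return True
    except OSError:
        # The probe could not be started at all (e.g. virsh is not installed);
        # that is not a hang, so do not restart anything.
        return False
    return False


def check_and_fix(probe_cmd=PROBE_CMD, restart_cmd=RESTART_CMD,
                  timeout=TIMEOUT_SECONDS):
    """Restart the daemon only if the probe hangs. Return True if restarted."""
    if not probe_hangs(probe_cmd, timeout):
        return False
    subprocess.run(restart_cmd, check=False)
    return True


if __name__ == "__main__":
    if check_and_fix():
        print("libvirtd was unresponsive and has been restarted")
```

Saved under a path of your choosing (say /usr/local/bin/check_libvirtd.py, a made-up name), a root crontab entry such as `*/5 * * * * /usr/local/bin/check_libvirtd.py` would run it every 5 minutes.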
By no means is this a fix, but it'll mitigate the issue for you operationally for now. In every situation where the script has triggered, nova-compute has recovered as you would expect, without needing a restart.
Hope that helps.