Comment 35 for bug 1643911

davidchen (synodavidchen) wrote :

@ChristianEhrhardt -

After a few days of stress testing, the issue cannot be reproduced with the script I provided.

So I fell back to a simpler script, the one I used at the beginning to reproduce the issue in my environment. It just starts/stops instances and sleeps (without any VM status check).

After switching to this script, the issue is reproducible (though it takes a few days in my test) with libvirt-bin (1.3.1-1ubuntu10.8), qemu (1:2.5+dfsg-5ubuntu10.11), and 20 cirros guests.

The actual instance behavior is more complex with the simple script, because the OpenStack API sometimes returns a conflict error when the VM state does not match the requested action:

Cannot 'stop' instance 971fe132-55de-4ff7-b1e8-c556390964c1 while it is in vm_state stopped (HTTP 409)
Cannot 'start' instance 01eae3e9-6330-4594-a1dd-dcdbf3a4d392 while it is in vm_state active (HTTP 409)

So some instances may be started and stopped at the same time, because a command may be issued before the previous one has taken effect, and some such pattern seems to trigger the libvirt crash.
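
For readers who want to try something similar, here is a minimal sketch of a start/stop loop of the kind described above. It is not the actual script from my environment. It assumes the `openstack` command-line client is installed and authenticated; the names `run_cli` and `stress`, and the choice to count 409 responses, are purely illustrative. Like the simple script, it does not check VM status before issuing the next action.

```python
import subprocess
import time


def run_cli(action, instance_id):
    """Run `openstack server start|stop <id>` and return (exit code, stderr).

    Assumption: the `openstack` CLI is on PATH and credentials are loaded.
    """
    proc = subprocess.run(
        ["openstack", "server", action, instance_id],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stderr


def stress(instance_ids, cycles, sleep_seconds, run=run_cli, sleep=time.sleep):
    """Blindly stop and start every instance, sleeping between phases.

    No VM status check is made, so a command may be rejected with
    HTTP 409 if the previous one has not taken effect yet.
    Returns the number of failed calls whose stderr mentions HTTP 409.
    """
    conflicts = 0
    for _ in range(cycles):
        for action in ("stop", "start"):
            for instance_id in instance_ids:
                code, err = run(action, instance_id)
                if code != 0 and "HTTP 409" in err:
                    conflicts += 1
            # Fixed sleep instead of waiting for the VMs to change state.
            sleep(sleep_seconds)
    return conflicts
```

Calling `stress(ids, cycles=1000, sleep_seconds=30)` with a list of instance UUIDs would run the loop; the numbers here are placeholders, not the values I used.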

However, because the behavior is very complex, I'm not sure whether the issue is reproducible in other environments.

For example, if the sleep time is long enough for all VMs to finish their tasks, then there will be no conflict errors, and the script will behave like the script I provided, which cannot reproduce the issue. On the other hand, if the sleep time is too short or there are too many guests, the OpenStack environment may not be able to handle the request rate, which may lead to other client or server errors.