Comment 17 for bug 1457404

Roman Podoliaka (rpodolyaka) wrote :

We added more introspection while reproducing the problem on a clean environment.

Right after deployment completes, running an OSTF test creates the very first instance. At that point nova-network lazily creates the bridge (br100) and spawns a dnsmasq daemon.

In the dnsmasq logs we can see that the instance gets an IP address correctly: http://paste.openstack.org/show/239858/
tcpdump logs: http://paste.openstack.org/show/239857/

The problem we can see in the dnsmasq logs is that the DHCPRELEASE packet sent by nova-network on behalf of the instance is missing, which means dnsmasq still thinks the lease is in use when it should have been released. If nova-network then allocates the same IP address to the next booted instance, that instance won't get an IP address.
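For anyone who wants to check their own dnsmasq logs for the same symptom, here is a minimal sketch that lists leases that were acknowledged but never released. It assumes the usual dnsmasq log shape of `DHCPACK(<iface>) <ip> <mac> ...` and `DHCPRELEASE(<iface>) <ip> <mac>`; the function name and the sample lines are hypothetical and not taken from the pastes above.

```python
import re

# Assumed dnsmasq log shape (not copied from the pastes):
#   "... DHCPACK(br100) 10.0.0.2 fa:16:3e:00:00:01 host-1"
#   "... DHCPRELEASE(br100) 10.0.0.2 fa:16:3e:00:00:01"
EVENT_RE = re.compile(
    r"(DHCPACK|DHCPRELEASE)\((?P<iface>[^)]+)\)\s+"
    r"(?P<ip>\d+\.\d+\.\d+\.\d+)\s+(?P<mac>[0-9a-fA-F:]{17})"
)


def find_unreleased_leases(log_lines):
    """Return {ip: mac} for leases that were ACKed and never released."""
    active = {}
    for line in log_lines:
        match = EVENT_RE.search(line)
        if match is None:
            continue  # not a lease event we care about
        event = match.group(1)
        ip = match.group("ip")
        mac = match.group("mac").lower()
        if event == "DHCPACK":
            active[ip] = mac
        elif active.get(ip) == mac:
            # DHCPRELEASE from the same client that holds the lease
            del active[ip]
    return active


if __name__ == "__main__":
    # Hypothetical sample lines, for illustration only.
    sample = [
        "dnsmasq-dhcp[1]: DHCPACK(br100) 10.0.0.2 fa:16:3e:00:00:01 host-1",
        "dnsmasq-dhcp[1]: DHCPACK(br100) 10.0.0.3 fa:16:3e:00:00:02 host-2",
        "dnsmasq-dhcp[1]: DHCPRELEASE(br100) 10.0.0.3 fa:16:3e:00:00:02",
    ]
    print(find_unreleased_leases(sample))
```

Any IP that shows up in the result after its instance has been deleted is a candidate for the stale lease described above.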

The curious thing is that, according to the tcpdump logs, the DHCPRELEASE was actually sent correctly (it is the last packet sent from 10.0.0.1, with a length of 548 bytes). Tracing dnsmasq system calls also shows that dnsmasq saw the message but for some reason ignored it: http://paste.openstack.org/show/239864/
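To confirm that a captured packet really is a DHCPRELEASE, the message type can be read straight from the UDP payload. The sketch below follows the standard BOOTP/DHCP layout (236-byte fixed header, 4-byte magic cookie, then options, with option 53 carrying the message type and 7 meaning DHCPRELEASE). The function name and the synthetic payload are assumptions made for illustration, not data from the capture above.

```python
DHCP_MAGIC_COOKIE = b"\x63\x82\x53\x63"
BOOTP_HEADER_LEN = 236   # fixed BOOTP fields before the magic cookie
OPT_PAD = 0
OPT_MESSAGE_TYPE = 53
OPT_END = 255
DHCPRELEASE = 7


def dhcp_message_type(payload):
    """Return the DHCP message type (option 53) of a UDP payload, or None."""
    options_start = BOOTP_HEADER_LEN + len(DHCP_MAGIC_COOKIE)
    if len(payload) < options_start:
        return None
    if payload[BOOTP_HEADER_LEN:options_start] != DHCP_MAGIC_COOKIE:
        return None  # plain BOOTP or not a DHCP packet at all

    i = options_start
    while i < len(payload):
        code = payload[i]
        if code == OPT_PAD:      # single-byte padding, no length field
            i += 1
            continue
        if code == OPT_END:
            break
        if i + 1 >= len(payload):
            break                # truncated option header
        length = payload[i + 1]
        value = payload[i + 2:i + 2 + length]
        if code == OPT_MESSAGE_TYPE and len(value) == 1:
            return value[0]
        i += 2 + length
    return None


if __name__ == "__main__":
    # Synthetic 548-byte payload: zeroed header, cookie, option 53 = 7, end.
    packet = bytes(BOOTP_HEADER_LEN) + DHCP_MAGIC_COOKIE
    packet += bytes([OPT_MESSAGE_TYPE, 1, DHCPRELEASE, OPT_END])
    packet += bytes(548 - len(packet))
    print(dhcp_message_type(packet) == DHCPRELEASE)
```

The 548-byte size used in the synthetic payload matches a 236-byte header plus a 312-byte options area including the cookie, which is consistent with the length reported by tcpdump.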

Subsequent DHCPRELEASE packets are handled correctly. It looks like we ran into an edge case with a newly spawned dnsmasq daemon: after that first failure, we can't reproduce the problem on an existing environment.