Can't SSH into new VM, if DHCP agent's iptables-restore call fails
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
networking-calico | New | Undecided | Unassigned |
Bug Description
Failed to SSH into 10.28.0.8:
10.28.0.8 is the IP just allocated to a newly created VM.
Observed during a test last night, with
- networking-calico 1.4.1
- Liberty OpenStack on CentOS 7.3.
VM was 'drb9q9', created at 01:03:10.
DHCP agent log for the relevant compute node shows an iptables-restore failure at 01:03:13:
2017-03-03 01:03:13,324 [23437552] (ERROR) utils:
Command: ['sudo', 'iptables-restore', '-n']
Exit code: 1
Stdin: # Generated by iptables_manager
*filter
-D FORWARD 1
-I FORWARD 3 -j felix-FORWARD
-I INPUT 1 -j calico-
-D INPUT 3
-D OUTPUT 1
-I OUTPUT 3 -j felix-OUTPUT
COMMIT
# Completed by iptables_manager
# Generated by iptables_manager
*nat
-I OUTPUT 1 -j calico-
-D OUTPUT 3
-D POSTROUTING 1
-I POSTROUTING 3 -j felix-POSTROUTING
-I PREROUTING 1 -j calico-
-D PREROUTING 3
COMMIT
# Completed by iptables_manager
# Generated by iptables_manager
*raw
-I OUTPUT 1 -j calico-
-D OUTPUT 3
-I PREROUTING 1 -j calico-
-D PREROUTING 3
COMMIT
# Completed by iptables_manager
Stdout:
Stderr: iptables-restore: line 9 failed
2017-03-03 01:03:13,325 [23437552] (ERROR) iptables_manager: IPTablesManager
5. -I INPUT 1 -j calico-
6. -D INPUT 3
7. -D OUTPUT 1
8. -I OUTPUT 3 -j felix-OUTPUT
9. COMMIT
10. # Completed by iptables_manager
11. # Generated by iptables_manager
12. *nat
13. -I OUTPUT 1 -j calico-
14. -D OUTPUT 3
2017-03-03 01:03:13,325 [23437552] (ERROR) agent: Unable to restart dhcp for 4e0cb6bb-
2017-03-03 01:03:13.325 16177 ERROR neutron.
syslog shows no dnsmasq restart between 00:56:38 and 01:27:09. So it looks like the iptables-restore failure caused the DHCP agent to abandon the rest of the setup needed to serve the new VM's IP, including restarting dnsmasq.
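To see which rule iptables-restore rejected, the line number in Stderr can be mapped back onto the Stdin payload. The following is a minimal sketch, assuming the stderr format shown in the log above (`iptables-restore: line N failed`). The function names `failed_line_number` and `context_lines` are hypothetical and are not part of neutron or networking-calico; the numbering is chosen only to reproduce the excerpt in the log.

```python
import re

# First table of the Stdin payload, copied from the DHCP agent log above.
# The truncated "calico-" chain names are left exactly as logged.
STDIN = """# Generated by iptables_manager
*filter
-D FORWARD 1
-I FORWARD 3 -j felix-FORWARD
-I INPUT 1 -j calico-
-D INPUT 3
-D OUTPUT 1
-I OUTPUT 3 -j felix-OUTPUT
COMMIT
# Completed by iptables_manager
"""


def failed_line_number(stderr):
    """Return the 1-based line number reported by iptables-restore, or None."""
    match = re.search(r"iptables-restore: line (\d+) failed", stderr)
    return int(match.group(1)) if match else None


def context_lines(stdin_text, line_no, radius=5):
    """Return numbered payload lines around line_no (1-based numbering)."""
    lines = stdin_text.splitlines()
    start = max(line_no - radius, 0)  # zero-based index of first line shown
    end = line_no + radius            # slice end; clipped by the list length
    return ["%d. %s" % (idx, text)
            for idx, text in enumerate(lines[start:end], start + 1)]


if __name__ == "__main__":
    line_no = failed_line_number("iptables-restore: line 9 failed")
    for entry in context_lines(STDIN, line_no):
        print(entry)
```

For the payload above, line 9 is the `COMMIT` of the `*filter` table. That suggests the table was rejected as a whole when it was committed, so the offending rule is probably one of lines 3 to 8 rather than the `COMMIT` line itself.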
Actually I believe this issue has already been addressed in networking-calico master by https://review.openstack.org/#/c/433702/.
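The failure mode described above, where one failing step aborts everything that should follow it, can be illustrated with a small self-contained sketch. It is not the code from the linked review and does not claim to show how that change works. All names in it (`FirewallError`, `apply_firewall`, `setup_dhcp`, `tolerate_firewall_errors`) are hypothetical, and the "steps" are stand-ins for the real iptables and dnsmasq operations.

```python
class FirewallError(Exception):
    """Hypothetical stand-in for a failed iptables-restore call."""


def apply_firewall(fail):
    # Simulates the iptables-restore step; raises when told to fail.
    if fail:
        raise FirewallError("iptables-restore: line 9 failed")


def setup_dhcp(fail_firewall, tolerate_firewall_errors):
    """Run the firewall step, then the dnsmasq step; return the steps done."""
    steps = []
    try:
        apply_firewall(fail_firewall)
        steps.append("firewall")
    except FirewallError:
        if not tolerate_firewall_errors:
            # Behaviour matching the bug: the error propagates and the
            # dnsmasq step below never runs.
            raise
        steps.append("firewall-skipped")
    # Stand-in for restarting dnsmasq so that the new VM's IP is served.
    steps.append("dnsmasq")
    return steps


if __name__ == "__main__":
    print(setup_dhcp(fail_firewall=True, tolerate_firewall_errors=True))
    try:
        setup_dhcp(fail_firewall=True, tolerate_firewall_errors=False)
    except FirewallError as exc:
        print("setup aborted before dnsmasq:", exc)
```

Whether tolerating the firewall error is acceptable depends on whether the DHCP agent's own iptables rules are actually needed in the deployment; the sketch only shows how an unhandled error in an early step prevents the later dnsmasq step from running.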