Failed to SSH into 10.28.0.8: 10.28.0.8 is the IP just allocated to a newly created VM.
Observed during a test last night, with - master networking-calico code - Liberty OpenStack on CentOS 7.3.
VM was 'drb9q9', created at 01:03:10.
DHCP agent log for the relevant compute node shows an iptables-restore failure at 01:03:13:
2017-03-03 01:03:13,324 [23437552] (ERROR) utils: Command: ['sudo', 'iptables-restore', '-n'] Exit code: 1 Stdin: # Generated by iptables_manager *filter -D FORWARD 1 -I FORWARD 3 -j felix-FORWARD -I INPUT 1 -j calico-dhcp-agen-INPUT -D INPUT 3 -D OUTPUT 1 -I OUTPUT 3 -j felix-OUTPUT COMMIT # Completed by iptables_manager # Generated by iptables_manager *nat -I OUTPUT 1 -j calico-dhcp-agen-OUTPUT -D OUTPUT 3 -D POSTROUTING 1 -I POSTROUTING 3 -j felix-POSTROUTING -I PREROUTING 1 -j calico-dhcp-agen-PREROUTING -D PREROUTING 3 COMMIT # Completed by iptables_manager # Generated by iptables_manager *raw -I OUTPUT 1 -j calico-dhcp-agen-OUTPUT -D OUTPUT 3 -I PREROUTING 1 -j calico-dhcp-agen-PREROUTING -D PREROUTING 3 COMMIT # Completed by iptables_manager
Stdout: Stderr: iptables-restore: line 9 failed
2017-03-03 01:03:13,325 [23437552] (ERROR) iptables_manager: IPTablesManager.apply failed to apply the following set of iptables rules: 5. -I INPUT 1 -j calico-dhcp-agen-INPUT 6. -D INPUT 3 7. -D OUTPUT 1 8. -I OUTPUT 3 -j felix-OUTPUT 9. COMMIT 10. # Completed by iptables_manager 11. # Generated by iptables_manager 12. *nat 13. -I OUTPUT 1 -j calico-dhcp-agen-OUTPUT 14. -D OUTPUT 3 2017-03-03 01:03:13,325 [23437552] (ERROR) agent: Unable to restart dhcp for 4e0cb6bb-bbf1-48e9-90d1-a4afa34e2fc6. 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent Traceback (most recent call last): 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/neutron/agent/dhcp/agent.py", line 115, in call_driver 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent getattr(driver, action)(**action_kwargs) 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/dhcp.py", line 138, in restart 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent self.enable() 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/dhcp.py", line 206, in enable 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent interface_name = self.device_manager.setup(self.network) 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/dhcp.py", line 1233, in setup 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent self.fill_dhcp_udp_checksums(namespace=network.namespace) 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/dhcp.py", line 1299, in fill_dhcp_udp_checksums 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent iptables_mgr.apply() 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/iptables_manager.py", line 423, in apply 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent return self._apply() 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/iptables_manager.py", line 431, in _apply 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent return self._apply_synchronized() 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/iptables_manager.py", line 508, in _apply_synchronized 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent '\n'.join(log_lines)) 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 195, in __exit__ 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent six.reraise(self.type_, self.value, self.tb) 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/iptables_manager.py", line 487, in _apply_synchronized 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent run_as_root=True) 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py", line 159, in execute 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent raise RuntimeError(m) 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent RuntimeError: 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent Command: ['sudo', 'iptables-restore', '-n'] 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent Exit code: 1 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent Stdin: # Generated by iptables_manager 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent *filter 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent -D FORWARD 1 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent -I FORWARD 3 -j felix-FORWARD 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent -I INPUT 1 -j calico-dhcp-agen-INPUT 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent -D INPUT 3 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent -D OUTPUT 1 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent -I OUTPUT 3 -j felix-OUTPUT 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent COMMIT 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent # Completed by iptables_manager 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent # Generated by iptables_manager 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent *nat 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent -I OUTPUT 1 -j calico-dhcp-agen-OUTPUT 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent -D OUTPUT 3 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent -D POSTROUTING 1 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent -I POSTROUTING 3 -j felix-POSTROUTING 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent -I PREROUTING 1 -j calico-dhcp-agen-PREROUTING 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent -D PREROUTING 3 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent COMMIT 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent # Completed by iptables_manager 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent # Generated by iptables_manager 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent *raw 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent -I OUTPUT 1 -j calico-dhcp-agen-OUTPUT 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent -D OUTPUT 3 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent -I PREROUTING 1 -j calico-dhcp-agen-PREROUTING 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent -D PREROUTING 3 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent COMMIT 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent # Completed by iptables_manager 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent Stdout: 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent Stderr: iptables-restore: line 9 failed 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent 2017-03-03 01:03:13.325 16177 ERROR neutron.agent.dhcp.agent
syslog shows that there was no dnsmasq restart between 00:56:38 and 01:27:09. So it looks like the iptables-restore failure caused the DHCP agent to bail completely on the setup that is needed to support a new VM IP.
Failed to SSH into 10.28.0.8:
10.28.0.8 is the IP just allocated to a newly created VM.
Observed during a test last night, with
- master networking-calico code
- Liberty OpenStack on CentOS 7.3.
VM was 'drb9q9', created at 01:03:10.
DHCP agent log for the relevant compute node shows an iptables-restore failure at 01:03:13:
2017-03-03 01:03:13,324 [23437552] (ERROR) utils: dhcp-agen- INPUT dhcp-agen- OUTPUT dhcp-agen- PREROUTING dhcp-agen- OUTPUT dhcp-agen- PREROUTING
Command: ['sudo', 'iptables-restore', '-n']
Exit code: 1
Stdin: # Generated by iptables_manager
*filter
-D FORWARD 1
-I FORWARD 3 -j felix-FORWARD
-I INPUT 1 -j calico-
-D INPUT 3
-D OUTPUT 1
-I OUTPUT 3 -j felix-OUTPUT
COMMIT
# Completed by iptables_manager
# Generated by iptables_manager
*nat
-I OUTPUT 1 -j calico-
-D OUTPUT 3
-D POSTROUTING 1
-I POSTROUTING 3 -j felix-POSTROUTING
-I PREROUTING 1 -j calico-
-D PREROUTING 3
COMMIT
# Completed by iptables_manager
# Generated by iptables_manager
*raw
-I OUTPUT 1 -j calico-
-D OUTPUT 3
-I PREROUTING 1 -j calico-
-D PREROUTING 3
COMMIT
# Completed by iptables_manager
Stdout:
Stderr: iptables-restore: line 9 failed
2017-03-03 01:03:13,325 [23437552] (ERROR) iptables_manager: IPTablesManager .apply failed to apply the following set of iptables rules: dhcp-agen- INPUT dhcp-agen- OUTPUT bbf1-48e9- 90d1-a4afa34e2f c6. agent.dhcp. agent Traceback (most recent call last): agent.dhcp. agent File "/usr/lib/ python2. 7/site- packages/ neutron/ agent/dhcp/ agent.py" , line 115, in call_driver agent.dhcp. agent getattr(driver, action) (**action_ kwargs) agent.dhcp. agent File "/usr/lib/ python2. 7/site- packages/ neutron/ agent/linux/ dhcp.py" , line 138, in restart agent.dhcp. agent self.enable() agent.dhcp. agent File "/usr/lib/ python2. 7/site- packages/ neutron/ agent/linux/ dhcp.py" , line 206, in enable agent.dhcp. agent interface_name = self.device_ manager. setup(self. network) agent.dhcp. agent File "/usr/lib/ python2. 7/site- packages/ neutron/ agent/linux/ dhcp.py" , line 1233, in setup agent.dhcp. agent self.fill_ dhcp_udp_ checksums( namespace= network. namespace) agent.dhcp. agent File "/usr/lib/ python2. 7/site- packages/ neutron/ agent/linux/ dhcp.py" , line 1299, in fill_dhcp_ udp_checksums agent.dhcp. agent iptables_ mgr.apply( ) agent.dhcp. agent File "/usr/lib/ python2. 7/site- packages/ neutron/ agent/linux/ iptables_ manager. py", line 423, in apply agent.dhcp. agent return self._apply() agent.dhcp. agent File "/usr/lib/ python2. 7/site- packages/ neutron/ agent/linux/ iptables_ manager. py", line 431, in _apply agent.dhcp. agent return self._apply_ synchronized( ) agent.dhcp. agent File "/usr/lib/ python2. 7/site- packages/ neutron/ agent/linux/ iptables_ manager. py", line 508, in _apply_synchronized agent.dhcp. agent '\n'.join( log_lines) ) agent.dhcp. agent File "/usr/lib/ python2. 7/site- packages/ oslo_utils/ excutils. py", line 195, in __exit__ agent.dhcp. agent six.reraise( self.type_ , self.value, self.tb) agent.dhcp. agent File "/usr/lib/ python2. 7/site- packages/ neutron/ agent/linux/ iptables_ manager. py", line 487, in _apply_synchronized agent.dhcp. agent run_as_root=True) agent.dhcp. agent File "/usr/lib/ python2. 7/site- packages/ neutron/ agent/linux/ utils.py" , line 159, in execute agent.dhcp. agent raise RuntimeError(m) agent.dhcp. agent RuntimeError: agent.dhcp. agent Command: ['sudo', 'iptables-restore', '-n'] agent.dhcp. agent Exit code: 1 agent.dhcp. agent Stdin: # Generated by iptables_manager agent.dhcp. agent *filter agent.dhcp. agent -D FORWARD 1 agent.dhcp. agent -I FORWARD 3 -j felix-FORWARD agent.dhcp. agent -I INPUT 1 -j calico- dhcp-agen- INPUT agent.dhcp. agent -D INPUT 3 agent.dhcp. agent -D OUTPUT 1 agent.dhcp. agent -I OUTPUT 3 -j felix-OUTPUT agent.dhcp. agent COMMIT agent.dhcp. agent # Completed by iptables_manager agent.dhcp. agent # Generated by iptables_manager agent.dhcp. agent *nat agent.dhcp. agent -I OUTPUT 1 -j calico- dhcp-agen- OUTPUT agent.dhcp. agent -D OUTPUT 3 agent.dhcp. agent -D POSTROUTING 1 agent.dhcp. agent -I POSTROUTING 3 -j felix-POSTROUTING agent.dhcp. agent -I PREROUTING 1 -j calico- dhcp-agen- PREROUTING agent.dhcp. agent -D PREROUTING 3 agent.dhcp. agent COMMIT agent.dhcp. agent # Completed by iptables_manager agent.dhcp. agent # Generated by iptables_manager agent.dhcp. agent *raw agent.dhcp. agent -I OUTPUT 1 -j calico- dhcp-agen- OUTPUT agent.dhcp. agent -D OUTPUT 3 agent.dhcp. agent -I PREROUTING 1 -j calico- dhcp-agen- PREROUTING agent.dhcp. agent -D PREROUTING 3 agent.dhcp. agent COMMIT agent.dhcp. agent # Completed by iptables_manager agent.dhcp. agent agent.dhcp. agent Stdout: agent.dhcp. agent Stderr: iptables-restore: line 9 failed agent.dhcp. agent agent.dhcp. agent
5. -I INPUT 1 -j calico-
6. -D INPUT 3
7. -D OUTPUT 1
8. -I OUTPUT 3 -j felix-OUTPUT
9. COMMIT
10. # Completed by iptables_manager
11. # Generated by iptables_manager
12. *nat
13. -I OUTPUT 1 -j calico-
14. -D OUTPUT 3
2017-03-03 01:03:13,325 [23437552] (ERROR) agent: Unable to restart dhcp for 4e0cb6bb-
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
2017-03-03 01:03:13.325 16177 ERROR neutron.
syslog shows that there was no dnsmasq restart between 00:56:38 and 01:27:09. So it looks like the iptables-restore failure caused the DHCP agent to bail completely on the setup that is needed to support a new VM IP.