Iptables jump to float-snat chain goes missing.

Bug #1218040 reported by Carl Baldwin
Affects: neutron
Status: Fix Released
Importance: High
Assigned to: Carl Baldwin
Milestone: 2013.2

Bug Description

We recently lost SNAT from our floating IPs. The reason for this seems to be that a jump from the snat chain to the float-snat chain goes missing when a router is processed.

For example, I have a devstack with two VMs. The floating IPs are 172.24.4.227 and 172.24.4.228. The router's default SNAT address is 172.24.4.226. When I ping from one VM to the other's floating IP, the packet's source address is translated to the router's default SNAT address instead of the sender's floating IP. This is the output of tcpdump on the router's internal interface.

19:13:42.552877 IP 10.0.0.3 > 172.24.4.228: ICMP echo request, id 16641, seq 5, length 64
19:13:42.552903 IP 172.24.4.226 > 10.0.0.4: ICMP echo request, id 16641, seq 5, length 64
19:13:42.553221 IP 10.0.0.4 > 172.24.4.226: ICMP echo reply, id 16641, seq 5, length 64
19:13:42.553230 IP 172.24.4.228 > 10.0.0.3: ICMP echo reply, id 16641, seq 5, length 64

I expect to see this instead:

19:18:06.046647 IP 10.0.0.3 > 172.24.4.228: ICMP echo request, id 17153, seq 0, length 64
19:18:06.056681 IP 172.24.4.227 > 10.0.0.4: ICMP echo request, id 17153, seq 0, length 64
19:18:06.067306 IP 10.0.0.4 > 172.24.4.227: ICMP echo reply, id 17153, seq 0, length 64
19:18:06.068098 IP 172.24.4.228 > 10.0.0.3: ICMP echo reply, id 17153, seq 0, length 64

When it is working, my router's snat chain looks like this:

Chain neutron-l3-agent-snat (1 references)
target prot opt source destination
neutron-l3-agent-float-snat all -- 0.0.0.0/0 0.0.0.0/0
SNAT all -- 10.0.0.0/24 0.0.0.0/0 to:172.24.4.226

When it is broken, it looks like this:

Chain neutron-l3-agent-snat (1 references)
target prot opt source destination
SNAT all -- 10.0.0.0/24 0.0.0.0/0 to:172.24.4.226
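
The two listings above differ only in the jump rule, so the broken state can be detected mechanically. The following is a minimal sketch, not part of neutron: the helper name chain_has_jump is hypothetical, and it assumes the listing text has the same layout as the iptables output quoted in this report.

```python
def chain_has_jump(listing: str, chain: str, target: str) -> bool:
    """Return True if `chain` in the listing contains a rule whose target is `target`.

    Hypothetical helper; assumes each chain starts with a "Chain <name> ..." line
    and each rule line begins with its target, as in the listings in this report.
    """
    in_chain = False
    for line in listing.splitlines():
        fields = line.split()
        if not fields:
            in_chain = False  # a blank line ends the current chain
            continue
        if fields[0] == "Chain":
            in_chain = len(fields) > 1 and fields[1] == chain
            continue
        if in_chain and fields[0] == target:
            return True
    return False


WORKING = """\
Chain neutron-l3-agent-snat (1 references)
target prot opt source destination
neutron-l3-agent-float-snat all -- 0.0.0.0/0 0.0.0.0/0
SNAT all -- 10.0.0.0/24 0.0.0.0/0 to:172.24.4.226
"""

BROKEN = """\
Chain neutron-l3-agent-snat (1 references)
target prot opt source destination
SNAT all -- 10.0.0.0/24 0.0.0.0/0 to:172.24.4.226
"""

print(chain_has_jump(WORKING, "neutron-l3-agent-snat", "neutron-l3-agent-float-snat"))
print(chain_has_jump(BROKEN, "neutron-l3-agent-snat", "neutron-l3-agent-float-snat"))
```

The first call reports that the jump is present, and the second reports that it is missing.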

Tags: l3-ipam-dhcp
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/44133

Changed in neutron:
assignee: nobody → Carl Baldwin (carl-baldwin)
status: New → In Progress
description: updated
tags: added: l3-ipam-dhcp
Changed in neutron:
importance: Undecided → High
milestone: none → havana-3
Jian Wen (wenjianhn) wrote :

The following bug description is shorter and clearer.

When an instance that has a floating IP address pings the floating IP address
of another instance, the packet received by the destination instance should
carry the floating IP address of the source instance as its source IP address,
not the external gateway IP address of the router.

Salvatore Orlando (salvatore-orlando) wrote :

Carl, can you confirm whether this bug affects grizzly/stable?

I looked at the l3 agent there and could not find a jump rule from 'snat' to 'float-snat' either.
If grizzly is affected, we would like to backport this patch.

Carl Baldwin (carl-baldwin) wrote :

Salvatore, good question. We have moved beyond stable in our development, so I do not have a stable development environment handy to test in. For us, this bug started with the following patch set.

https://review.openstack.org/#/c/34345/

This was introduced in the Havana time frame.

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/44133
Committed: http://github.com/openstack/neutron/commit/169729cd114603b355faf5effd08ea660b81551f
Submitter: Jenkins
Branch: master

commit 169729cd114603b355faf5effd08ea660b81551f
Author: Carl Baldwin <email address hidden>
Date: Wed Aug 28 19:32:34 2013 +0000

    Add jump to float-snat chain after clearing snat chain

    Clearing the chain in this code eliminates the rule to jump to the
    floating-snat chain. This is the simplest way to get it working
    again.

    Change-Id: Ic1818e10bd64170b6f0a2f52af8dc0814d7e04e0
    Fixes: Bug #1218040
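
The commit message describes a chain being cleared and the jump rule being added back. A minimal sketch of that pattern follows. It is a toy model, not the neutron code: the NatTable class, the rebuild_snat_chain function, and the restore_jump flag are hypothetical names introduced only for illustration, and the rule strings are simplified.

```python
from collections import defaultdict

SNAT_CHAIN = "neutron-l3-agent-snat"
FLOAT_SNAT_CHAIN = "neutron-l3-agent-float-snat"


class NatTable:
    """Toy stand-in for a NAT table: chain name -> ordered list of rule strings."""

    def __init__(self):
        self.chains = defaultdict(list)

    def add_rule(self, chain, rule):
        self.chains[chain].append(rule)

    def empty_chain(self, chain):
        self.chains[chain] = []


def rebuild_snat_chain(table, internal_cidr, gateway_ip, restore_jump=True):
    """Clear the snat chain and repopulate it.

    With restore_jump=False this reproduces the bug: clearing the chain also
    drops the jump to the float-snat chain, leaving only the default SNAT rule.
    """
    table.empty_chain(SNAT_CHAIN)
    if restore_jump:
        # The jump must come first so floating-IP rules are tried before
        # the router's default SNAT rule.
        table.add_rule(SNAT_CHAIN, f"-j {FLOAT_SNAT_CHAIN}")
    table.add_rule(SNAT_CHAIN, f"-s {internal_cidr} -j SNAT --to-source {gateway_ip}")


table = NatTable()
rebuild_snat_chain(table, "10.0.0.0/24", "172.24.4.226")
for rule in table.chains[SNAT_CHAIN]:
    print(rule)
```

Rule order matters in this sketch: because the jump precedes the default SNAT rule, traffic from an instance with a floating IP can be matched in the float-snat chain before it reaches the router-wide rule.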

Changed in neutron:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in neutron:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in neutron:
milestone: havana-3 → 2013.2