Investigate tempest failures when using security groups
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
networking-sfc | New | Undecided | Unassigned |
Bug Description
To fix tempest gates for Rocky, tests were updated to have port security disabled in https:/
This is only a workaround; we should investigate why tests with a wildcard security group stopped working in that cycle.
To reproduce, revert the networking_
Possible root causes: changes in security group defaults, or the switch to ovsfw?
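To check the ovsfw hypothesis, one option is to look at which firewall driver the OVS agent on the gate node is configured with. The following is a minimal sketch, assuming the agent configuration is an INI file with a `[securitygroup]` section and a `firewall_driver` option, as in Neutron's Open vSwitch agent configuration. The helper name `firewall_driver` and the sample text are made up for illustration and are not taken from a real gate run.

```python
import configparser


def firewall_driver(ini_text):
    """Return the [securitygroup] firewall_driver value, or None if unset.

    ini_text is the content of the agent configuration file; how to obtain
    it (file path, log archive) is left to the caller.
    """
    # strict=False tolerates repeated options; interpolation=None keeps
    # '%' characters in values from being treated as format markers.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(ini_text)
    return parser.get("securitygroup", "firewall_driver", fallback=None)


# Hypothetical sample content, for illustration only.
SAMPLE = """
[securitygroup]
firewall_driver = openvswitch
"""

if __name__ == "__main__":
    print(firewall_driver(SAMPLE))
```

A value of `openvswitch` would indicate the native OVS firewall (ovsfw), while `iptables_hybrid` would indicate the older iptables-based driver. Comparing this value between a passing and a failing run could help confirm or rule out the driver switch.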
Pasting information from previous bug:
Since around Rocky mid-cycle, the tempest gates always fail on all tests. Sample failure:
http://
VM creation looks OK, but the test fails when it connects to one VM and runs traceroute to the other:
2018-07-23 17:56:56.323 6755 INFO tempest.
2018-07-23 17:56:56.333 6755 INFO paramiko.transport [-] Connected (version 2.0, client dropbear_2012.55)
2018-07-23 17:56:56.607 6755 INFO paramiko.transport [-] Authentication (publickey) successful!
2018-07-23 17:56:56.608 6755 INFO tempest.
2018-07-23 18:00:13.667 6755 ERROR tempest.
Details: Command: 'set -eu -o pipefail; PATH=$PATH:/sbin; traceroute -n -I 10.1.0.13' executed on host '172.24.5.20'.: TimeoutException: Request timed out
After some digging, I suspect a security group issue: I deployed a master devstack and manually tested SFC, and it still works fine, but I disable port security in my manual tests.
While the tempest test was running, I made a quick test and ran "openstack port set --disable-
This allowed traceroute to finally report in:
traceroute to 10.0.0.5 (10.0.0.5), 30 hops max, 46 byte packets
1 * * *
2 * * *
3 * * *
4 * * *
5 * * *
6 * 10.0.0.5 2.316 ms 1.935 ms
2018-07-27 15:07:36,557 16774 ERROR [networking_
[u' 1 * * *', u' 2 * * *', u' 3 * * *', u' 4 * * *', u' 5 * * *']
vs
[[u'10.0.0.8']]
The first '* * *' lines are timeouts from before I disabled port security.
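The mismatch in the error above comes from comparing raw traceroute lines with a list of expected hop addresses. As an illustration only (this is not the networking-sfc test code), here is a minimal sketch of how traceroute output could be reduced to the hops that actually answered. The function name `responding_hops` is hypothetical, and the sample lines are copied from the output above.

```python
import re

# Matches dotted-quad IPv4 addresses; RTT values such as "2.316" have only
# one dot and are therefore not matched.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def responding_hops(lines):
    """Return one list of addresses per hop that answered at least once.

    Hops where every probe timed out ("* * *") are skipped, as is the
    "traceroute to ..." header line.
    """
    hops = []
    for line in lines:
        fields = line.split()
        # Hop lines start with the hop number; anything else is ignored.
        if not fields or not fields[0].isdigit():
            continue
        addrs = []
        for addr in IPV4.findall(line):
            if addr not in addrs:
                addrs.append(addr)
        if addrs:
            hops.append(addrs)
    return hops


OUTPUT = [
    "traceroute to 10.0.0.5 (10.0.0.5), 30 hops max, 46 byte packets",
    " 1 * * *",
    " 2 * * *",
    " 3 * * *",
    " 4 * * *",
    " 5 * * *",
    " 6 * 10.0.0.5 2.316 ms 1.935 ms",
]

if __name__ == "__main__":
    print(responding_hops(OUTPUT))
```

With the output captured above, only hop 6 answers, so the result contains a single entry for 10.0.0.5. While port security was still enabled, every probe timed out and the result would have been empty.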
Also, after tweaking the code to run with port security disabled, all tests pass.