Comment 2 for bug 1955008

Slawek Kaplonski (slaweq) wrote : Re: [functional] Failure of "test_floatingip_mac_bindings"

Today I compared the logs from the failed (https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_f76/825521/3/gate/neutron-functional-with-uwsgi/f76cef6/controller/logs/dsvm-functional-logs/neutron.tests.functional.plugins.ml2.drivers.ovn.mech_driver.ovsdb.test_ovsdb_monitor.TestNBDbMonitorOverTcp.test_floatingip_mac_bindings/) and the good (https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_f76/825521/3/gate/neutron-functional-with-uwsgi/f76cef6/controller/logs/dsvm-functional-logs/neutron.tests.functional.plugins.ml2.drivers.ovn.mech_driver.ovsdb.test_ovsdb_monitor.TestNBDbMonitor.test_floatingip_mac_bindings/) runs. Here's what I found:

- In the SB logs, there is a difference like this:

* in the good run we have:
2022-01-23T19:28:26.309Z|00140|jsonrpc|DBG|unix#3: send notification, method="update2", params=["a220fbde-7c82-11ec-9a03-4fce359daaf8",{"Datapath_Binding":{"2c98db7e-84ad-4782-9d87-24eafae1700e":{"insert":{"tunnel_key":1,"external_ids":["map",[["logical-switch","5b801987-900b-4b79-aa3b-05c5fdc422eb"],["name","neutron-2d726047-4990-446b-9f1b-9dc6403adb3e"],["name2","network1"]]]}}}}]
2022-01-23T19:28:26.330Z|00141|poll_loop|DBG|wakeup due to [POLLIN] on fd 17 (/tmp/tmp8aro35_w/ovnsb_db.sock<->) at lib/stream-fd.c:157
2022-01-23T19:28:26.330Z|00142|jsonrpc|DBG|unix#3: received request, method="transact", params=["OVN_Southbound",{"uuid-name":"rowdebd1810_12e4_4ae8_9bcd_a18781f1b986","table":"MAC_Binding","op":"insert","row":{"ip":"100.0.0.21","datapath":["uuid","2c98db7e-84ad-4782-9d87-24eafae1700e"]}}], id=1260
2022-01-23T19:28:26.330Z|00143|jsonrpc|DBG|unix#3: send reply, result=[{"uuid":["uuid","e514d19c-7371-4d1b-9958-25ffd612b68b"]}], id=1260
2022-01-23T19:28:26.330Z|00144|jsonrpc|DBG|unix#4: send notification, method="update3", params=[["monid","OVN_Southbound"],"00000000-0000-0000-0000-000000000000",{"MAC_Binding":{"e514d19c-7371-4d1b-9958-25ffd612b68b":{"insert":{"ip":"100.0.0.21","datapath":["uuid","2c98db7e-84ad-4782-9d87-24eafae1700e"]}}}}]
2022-01-23T19:28:26.460Z|00145|poll_loop|DBG|wakeup due to [POLLIN] on fd 18 (/tmp/tmp8aro35_w/ovnsb_db.sock<->) at lib/stream-fd.c:157

but in the failed run there is only:

2022-01-23T19:28:46.868Z|00156|jsonrpc|DBG|tcp:127.0.0.1:33296: send notification, method="update2", params=["ae7bd9c6-7c82-11ec-9a03-4fce359daaf8",{"Connection":{"83c29962-a6e2-4f3c-9fc7-888cf46540ed":{"modify":{"is_connected":true,"status":["map",[["bound_port","42597"],["n_connections","2"],["sec_since_connect","0"],["sec_since_disconnect","0"]]]}}},"Datapath_Binding":{"15003f80-2b91-44d6-8b9d-f8d217285fb3":{"insert":{"tunnel_key":1,"external_ids":["map",[["logical-switch","70caa880-3fed-4f9b-9558-f624c0d2c469"],["name","neutron-51ac15e4-24cc-4348-85c3-fe3a51cfcaec"],["name2","network1"]]]}}}}]
2022-01-23T19:28:49.369Z|00157|poll_loop|DBG|wakeup due to 2500-ms timeout at ovsdb/ovsdb-server.c:296 (0% CPU usage)
2022-01-23T19:28:51.872Z|00158|poll_loop|DBG|wakeup due to 2503-ms timeout at ovsdb/ovsdb-server.c:296 (0% CPU usage)
2022-01-23T19:28:54.373Z|00159|poll_loop|DBG|wakeup due to 2501-ms timeout at ovsdb/ovsdb-server.c:296 (0% CPU usage)

and nothing else.
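As a side note, the difference above can be checked mechanically: the jsonrpc DBG lines carry the full JSON payload, so a small script can tell whether any update notification ever touched the MAC_Binding table in the failed run. A minimal sketch (the sample line below is copied from the good run's SB log above; the regex and helper name are my own, not part of any test):

```python
import json
import re

# Matches the params=[...] payload of a jsonrpc DBG "send notification" line.
PARAMS_RE = re.compile(r'method="(?:update2|update3)", params=(\[.*\])$')

def touches_table(log_line, table):
    """Return True if a jsonrpc update notification touches `table`."""
    match = PARAMS_RE.search(log_line)
    if not match:
        return False
    params = json.loads(match.group(1))
    # update2 payload: [monitor-id, {table: ...}]
    # update3 payload: [[monid, db], last-txn-id, {table: ...}]
    # In both cases the table updates are the last element.
    return table in params[-1]

good_line = (
    '2022-01-23T19:28:26.330Z|00144|jsonrpc|DBG|unix#4: send notification, '
    'method="update3", params=[["monid","OVN_Southbound"],'
    '"00000000-0000-0000-0000-000000000000",{"MAC_Binding":'
    '{"e514d19c-7371-4d1b-9958-25ffd612b68b":{"insert":{"ip":"100.0.0.21",'
    '"datapath":["uuid","2c98db7e-84ad-4782-9d87-24eafae1700e"]}}}}]')
print(touches_table(good_line, "MAC_Binding"))  # True in the good run
```

Running this over the failed run's SB log finds no MAC_Binding update at all, which matches the excerpts above.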

- In the NB log, there is a similar difference:

* good run:

2022-01-23T19:28:26.457Z|00144|jsonrpc|DBG|unix#5: send notification, method="update2", params=["a220fbdc-7c82-11ec-9a03-4fce359daaf8",{"Logical_Switch":{"16b0beb1-7c31-40dd-8c76-1ece7d81d20b":{"insert":{"name":"neutron-c314b414-279d-4457-a8d2-eeab1d2d78ad","external_ids":["map",[["neutron:availability_zone_hints",""],["neutron:mtu","1442"],["neutron:network_name","net1"],["neutron:revision_number","1"]]],"other_config":["map",[["mcast_flood_unregistered","false"],["mcast_snoop","false"],["vlan-passthru","false"]]]}}}}]
2022-01-23T19:28:26.457Z|00145|jsonrpc|DBG|unix#5: send reply, result=[{"uuid":["uuid","16b0beb1-7c31-40dd-8c76-1ece7d81d20b"]}], id=1261
2022-01-23T19:28:26.457Z|00146|jsonrpc|DBG|unix#6: send notification, method="update3", params=[["monid","OVN_Northbound"],"00000000-0000-0000-0000-000000000000",{"Logical_Switch":{"16b0beb1-7c31-40dd-8c76-1ece7d81d20b":{"insert":{"name":"neutron-c314b414-279d-4457-a8d2-eeab1d2d78ad","external_ids":["map",[["neutron:availability_zone_hints",""],["neutron:mtu","1442"],["neutron:network_name","net1"],["neutron:revision_number","1"]]],"other_config":["map",[["mcast_flood_unregistered","false"],["mcast_snoop","false"],["vlan-passthru","false"]]]}}}}]
2022-01-23T19:28:26.769Z|00147|poll_loop|DBG|wakeup due to [POLLIN] on fd 17 (/tmp/tmp8aro35_w/ovnnb_db.sock<->) at lib/stream-fd.c:157 (0% CPU usage)
2022-01-23T19:28:26.769Z|00148|jsonrpc|DBG|unix#5: received request, method="transact", params=["OVN_Northbound",{"uuid-name":"row157d7b26_ebea_47d7_8917_161d2f11a246","table":"DHCP_Options","op":"insert","row":{"options":["map",[["dns_server","{10.10.10.10}"],["lease_time","43200"],["mtu","1442"],["router","20.0.0.1"],["server_id","20.0.0.1"],["server_mac","12:34:56:78:31:3f"]]],"cidr":"20.0.0.0/24","external_ids":["map",[["neutron:revision_number","0"],["subnet_id","2cf02242-c45f-4ccd-8219-7268a94379e7"]]]}}], id=1262
2022-01-23T19:28:26.769Z|00149|jsonrpc|DBG|unix#5: send notification, method="update2", params=["a220fbdc-7c82-11ec-9a03-4fce359daaf8",{"DHCP_Options":{"76a8f045-8669-46d5-a50b-cd391544efbe":{"insert":{"options":["map",[["dns_server","{10.10.10.10}"],["lease_time","43200"],["mtu","1442"],["router","20.0.0.1"],["server_id","20.0.0.1"],["server_mac","12:34:56:78:31:3f"]]],"cidr":"20.0.0.0/24","external_ids":["map",[["neutron:revision_number","0"],["subnet_id","2cf02242-c45f-4ccd-8219-7268a94379e7"]]]}}}}]
2022-01-23T19:28:26.769Z|00150|jsonrpc|DBG|unix#5: send reply, result=[{"uuid":["uuid","76a8f045-8669-46d5-a50b-cd391544efbe"]}], id=1262
2022-01-23T19:28:26.770Z|00151|jsonrpc|DBG|unix#6: send notification, method="update3", params=[["monid","OVN_Northbound"],"00000000-0000-0000-0000-000000000000",{"DHCP_Options":{"76a8f045-8669-46d5-a50b-cd391544efbe":{"insert":{"options":["map",[["dns_server","{10.10.10.10}"],["lease_time","43200"],["mtu","1442"],["router","20.0.0.1"],["server_id","20.0.0.1"],["server_mac","12:34:56:78:31:3f"]]],"cidr":"20.0.0.0/24","external_ids":["map",[["neutron:revision_number","0"],["subnet_id","2cf02242-c45f-4ccd-8219-7268a94379e7"]]]}}}}]
2022-01-23T19:28:27.314Z|00152|poll_loop|DBG|wakeup due to [POLLIN] on fd 17 (/tmp/tmp8aro35_w/ovnnb_db.sock<->) at lib/stream-fd.c:157 (0% CPU usage)

* but in the failed run there is only:

2022-01-23T19:28:46.862Z|00169|jsonrpc|DBG|tcp:127.0.0.1:60220: send notification, method="update3", params=[["monid","OVN_Northbound"],"00000000-0000-0000-0000-000000000000",{"Logical_Switch":{"70caa880-3fed-4f9b-9558-f624c0d2c469":{"insert":{"name":"neutron-51ac15e4-24cc-4348-85c3-fe3a51cfcaec","external_ids":["map",[["neutron:availability_zone_hints",""],["neutron:mtu","1442"],["neutron:network_name","network1"],["neutron:revision_number","1"]]],"other_config":["map",[["mcast_flood_unregistered","false"],["mcast_snoop","false"],["vlan-passthru","false"]]]}}}}]
2022-01-23T19:28:49.128Z|00170|poll_loop|DBG|wakeup due to 2266-ms timeout at ovsdb/ovsdb-server.c:296 (0% CPU usage)
2022-01-23T19:28:51.629Z|00171|poll_loop|DBG|wakeup due to 2501-ms timeout at ovsdb/ovsdb-server.c:296 (0% CPU usage)
2022-01-23T19:28:54.130Z|00172|poll_loop|DBG|wakeup due to 2501-ms timeout at ovsdb/ovsdb-server.c:296

Another thing I noticed: in the failed run's testrun log, there is an error like this:

2022-01-23 19:28:46.875 54853 ERROR ovsdbapp.backend.ovs_idl.vlog [-] tcp:127.0.0.1:42597: error parsing update: ovsdb error: Modify non-existing row: ovs.db.error.Error: ovsdb error: Modify non-existing row

The error appeared just after the network was created. In the good runs a similar error is also sometimes visible, but it happens later during the test.
If that is the root cause, it looks like some kind of race condition, but I'm not really sure about that.
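To illustrate why "Modify non-existing row" points at a race: an OVSDB client keeps a local row cache keyed by uuid, and an update notification carrying a "modify" for a uuid the client never saw inserted is exactly this error. The sketch below is a loose illustration of that behavior, not the actual ovs.db IDL code (the function and exception names are mine):

```python
class ReplicaError(Exception):
    """Stand-in for the ovsdb 'Modify non-existing row' error."""

def apply_update(cache, table_update):
    """Apply one table's update2/update3 payload to a local row cache.

    Loosely mimics how an OVSDB IDL handles row changes: "insert"
    creates the row, "modify" requires the row to already exist,
    "delete" drops it.
    """
    for uuid, change in table_update.items():
        if "insert" in change:
            cache[uuid] = dict(change["insert"])
        elif "modify" in change:
            if uuid not in cache:
                raise ReplicaError("Modify non-existing row")
            cache[uuid].update(change["modify"])
        elif "delete" in change:
            cache.pop(uuid, None)

cache = {}
# Normal sequence: insert seen first, then modify -> fine.
apply_update(cache, {"row-1": {"insert": {"ip": "100.0.0.21"}}})
apply_update(cache, {"row-1": {"modify": {"ip": "100.0.0.22"}}})
# If a race makes the monitor miss the insert, the first "modify"
# for that row reproduces the reported error.
try:
    apply_update(cache, {"row-2": {"modify": {"is_connected": True}}})
except ReplicaError as exc:
    print(exc)  # Modify non-existing row
```

Note that the failed run's Connection update on the SB side (the "modify" with is_connected=true above) arrives right when the error is logged, which fits this picture.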