Today I compared logs from the failed (https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_f76/825521/3/gate/neutron-functional-with-uwsgi/f76cef6/controller/logs/dsvm-functional-logs/neutron.tests.functional.plugins.ml2.drivers.ovn.mech_driver.ovsdb.test_ovsdb_monitor.TestNBDbMonitorOverTcp.test_floatingip_mac_bindings/) and good (https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_f76/825521/3/gate/neutron-functional-with-uwsgi/f76cef6/controller/logs/dsvm-functional-logs/neutron.tests.functional.plugins.ml2.drivers.ovn.mech_driver.ovsdb.test_ovsdb_monitor.TestNBDbMonitor.test_floatingip_mac_bindings/) runs. Here's what I found:

- In the SB logs, the difference is:

  * in the good run we have:

    2022-01-23T19:28:26.309Z|00140|jsonrpc|DBG|unix#3: send notification, method="update2", params=["a220fbde-7c82-11ec-9a03-4fce359daaf8",{"Datapath_Binding":{"2c98db7e-84ad-4782-9d87-24eafae1700e":{"insert":{"tunnel_key":1,"external_ids":["map",[["logical-switch","5b801987-900b-4b79-aa3b-05c5fdc422eb"],["name","neutron-2d726047-4990-446b-9f1b-9dc6403adb3e"],["name2","network1"]]]}}}}]
    2022-01-23T19:28:26.330Z|00141|poll_loop|DBG|wakeup due to [POLLIN] on fd 17 (/tmp/tmp8aro35_w/ovnsb_db.sock<->) at lib/stream-fd.c:157
    2022-01-23T19:28:26.330Z|00142|jsonrpc|DBG|unix#3: received request, method="transact", params=["OVN_Southbound",{"uuid-name":"rowdebd1810_12e4_4ae8_9bcd_a18781f1b986","table":"MAC_Binding","op":"insert","row":{"ip":"100.0.0.21","datapath":["uuid","2c98db7e-84ad-4782-9d87-24eafae1700e"]}}], id=1260
    2022-01-23T19:28:26.330Z|00143|jsonrpc|DBG|unix#3: send reply, result=[{"uuid":["uuid","e514d19c-7371-4d1b-9958-25ffd612b68b"]}], id=1260
    2022-01-23T19:28:26.330Z|00144|jsonrpc|DBG|unix#4: send notification, method="update3", params=[["monid","OVN_Southbound"],"00000000-0000-0000-0000-000000000000",{"MAC_Binding":{"e514d19c-7371-4d1b-9958-25ffd612b68b":{"insert":{"ip":"100.0.0.21","datapath":["uuid","2c98db7e-84ad-4782-9d87-24eafae1700e"]}}}}]
    2022-01-23T19:28:26.460Z|00145|poll_loop|DBG|wakeup due to [POLLIN] on fd 18 (/tmp/tmp8aro35_w/ovnsb_db.sock<->) at lib/stream-fd.c:157

  * but in the failed run there is only:

    2022-01-23T19:28:46.868Z|00156|jsonrpc|DBG|tcp:127.0.0.1:33296: send notification, method="update2", params=["ae7bd9c6-7c82-11ec-9a03-4fce359daaf8",{"Connection":{"83c29962-a6e2-4f3c-9fc7-888cf46540ed":{"modify":{"is_connected":true,"status":["map",[["bound_port","42597"],["n_connections","2"],["sec_since_connect","0"],["sec_since_disconnect","0"]]]}}},"Datapath_Binding":{"15003f80-2b91-44d6-8b9d-f8d217285fb3":{"insert":{"tunnel_key":1,"external_ids":["map",[["logical-switch","70caa880-3fed-4f9b-9558-f624c0d2c469"],["name","neutron-51ac15e4-24cc-4348-85c3-fe3a51cfcaec"],["name2","network1"]]]}}}}]
    2022-01-23T19:28:49.369Z|00157|poll_loop|DBG|wakeup due to 2500-ms timeout at ovsdb/ovsdb-server.c:296 (0% CPU usage)
    2022-01-23T19:28:51.872Z|00158|poll_loop|DBG|wakeup due to 2503-ms timeout at ovsdb/ovsdb-server.c:296 (0% CPU usage)
    2022-01-23T19:28:54.373Z|00159|poll_loop|DBG|wakeup due to 2501-ms timeout at ovsdb/ovsdb-server.c:296 (0% CPU usage)

  and nothing else.
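  For reference, the "transact" request inserting the MAC_Binding row, which shows up only in the good run, is issued by the test case itself through ovsdbapp, roughly like the sketch below. The column values are taken from the log, "sb_api" stands for the test's ovsdbapp connection to the SB DB, and the real test code may differ:

    # Rough sketch (not the exact test code): find the Datapath_Binding row
    # created for "network1" and insert a MAC_Binding pointing at it; this is
    # what produces the MAC_Binding "transact" seen in the good-run SB log.
    dps = sb_api.db_find(
        'Datapath_Binding',
        ('external_ids', '=', {'name2': 'network1'})).execute(check_error=True)
    sb_api.db_create(
        'MAC_Binding',
        ip='100.0.0.21',
        datapath=dps[0]['_uuid']).execute(check_error=True)

  In the failed run that transact never shows up, so the test apparently never issued it.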
- In the NB log:

  * in the good run:

    2022-01-23T19:28:26.457Z|00144|jsonrpc|DBG|unix#5: send notification, method="update2", params=["a220fbdc-7c82-11ec-9a03-4fce359daaf8",{"Logical_Switch":{"16b0beb1-7c31-40dd-8c76-1ece7d81d20b":{"insert":{"name":"neutron-c314b414-279d-4457-a8d2-eeab1d2d78ad","external_ids":["map",[["neutron:availability_zone_hints",""],["neutron:mtu","1442"],["neutron:network_name","net1"],["neutron:revision_number","1"]]],"other_config":["map",[["mcast_flood_unregistered","false"],["mcast_snoop","false"],["vlan-passthru","false"]]]}}}}]
    2022-01-23T19:28:26.457Z|00145|jsonrpc|DBG|unix#5: send reply, result=[{"uuid":["uuid","16b0beb1-7c31-40dd-8c76-1ece7d81d20b"]}], id=1261
    2022-01-23T19:28:26.457Z|00146|jsonrpc|DBG|unix#6: send notification, method="update3", params=[["monid","OVN_Northbound"],"00000000-0000-0000-0000-000000000000",{"Logical_Switch":{"16b0beb1-7c31-40dd-8c76-1ece7d81d20b":{"insert":{"name":"neutron-c314b414-279d-4457-a8d2-eeab1d2d78ad","external_ids":["map",[["neutron:availability_zone_hints",""],["neutron:mtu","1442"],["neutron:network_name","net1"],["neutron:revision_number","1"]]],"other_config":["map",[["mcast_flood_unregistered","false"],["mcast_snoop","false"],["vlan-passthru","false"]]]}}}}]
    2022-01-23T19:28:26.769Z|00147|poll_loop|DBG|wakeup due to [POLLIN] on fd 17 (/tmp/tmp8aro35_w/ovnnb_db.sock<->) at lib/stream-fd.c:157 (0% CPU usage)
    2022-01-23T19:28:26.769Z|00148|jsonrpc|DBG|unix#5: received request, method="transact", params=["OVN_Northbound",{"uuid-name":"row157d7b26_ebea_47d7_8917_161d2f11a246","table":"DHCP_Options","op":"insert","row":{"options":["map",[["dns_server","{10.10.10.10}"],["lease_time","43200"],["mtu","1442"],["router","20.0.0.1"],["server_id","20.0.0.1"],["server_mac","12:34:56:78:31:3f"]]],"cidr":"20.0.0.0/24","external_ids":["map",[["neutron:revision_number","0"],["subnet_id","2cf02242-c45f-4ccd-8219-7268a94379e7"]]]}}], id=1262
    2022-01-23T19:28:26.769Z|00149|jsonrpc|DBG|unix#5: send notification, method="update2", params=["a220fbdc-7c82-11ec-9a03-4fce359daaf8",{"DHCP_Options":{"76a8f045-8669-46d5-a50b-cd391544efbe":{"insert":{"options":["map",[["dns_server","{10.10.10.10}"],["lease_time","43200"],["mtu","1442"],["router","20.0.0.1"],["server_id","20.0.0.1"],["server_mac","12:34:56:78:31:3f"]]],"cidr":"20.0.0.0/24","external_ids":["map",[["neutron:revision_number","0"],["subnet_id","2cf02242-c45f-4ccd-8219-7268a94379e7"]]]}}}}]
    2022-01-23T19:28:26.769Z|00150|jsonrpc|DBG|unix#5: send reply, result=[{"uuid":["uuid","76a8f045-8669-46d5-a50b-cd391544efbe"]}], id=1262
    2022-01-23T19:28:26.770Z|00151|jsonrpc|DBG|unix#6: send notification, method="update3", params=[["monid","OVN_Northbound"],"00000000-0000-0000-0000-000000000000",{"DHCP_Options":{"76a8f045-8669-46d5-a50b-cd391544efbe":{"insert":{"options":["map",[["dns_server","{10.10.10.10}"],["lease_time","43200"],["mtu","1442"],["router","20.0.0.1"],["server_id","20.0.0.1"],["server_mac","12:34:56:78:31:3f"]]],"cidr":"20.0.0.0/24","external_ids":["map",[["neutron:revision_number","0"],["subnet_id","2cf02242-c45f-4ccd-8219-7268a94379e7"]]]}}}}]
    2022-01-23T19:28:27.314Z|00152|poll_loop|DBG|wakeup due to [POLLIN] on fd 17 (/tmp/tmp8aro35_w/ovnnb_db.sock<->) at lib/stream-fd.c:157 (0% CPU usage)

  * but in the failed run there is only:

    2022-01-23T19:28:46.862Z|00169|jsonrpc|DBG|tcp:127.0.0.1:60220: send notification, method="update3", params=[["monid","OVN_Northbound"],"00000000-0000-0000-0000-000000000000",{"Logical_Switch":{"70caa880-3fed-4f9b-9558-f624c0d2c469":{"insert":{"name":"neutron-51ac15e4-24cc-4348-85c3-fe3a51cfcaec","external_ids":["map",[["neutron:availability_zone_hints",""],["neutron:mtu","1442"],["neutron:network_name","network1"],["neutron:revision_number","1"]]],"other_config":["map",[["mcast_flood_unregistered","false"],["mcast_snoop","false"],["vlan-passthru","false"]]]}}}}]
    2022-01-23T19:28:49.128Z|00170|poll_loop|DBG|wakeup due to 2266-ms timeout at ovsdb/ovsdb-server.c:296 (0% CPU usage)
    2022-01-23T19:28:51.629Z|00171|poll_loop|DBG|wakeup due to 2501-ms timeout at ovsdb/ovsdb-server.c:296 (0% CPU usage)
    2022-01-23T19:28:54.130Z|00172|poll_loop|DBG|wakeup due to 2501-ms timeout at ovsdb/ovsdb-server.c:296

Another thing I noticed: in the failed test run's log there is an error like:

    2022-01-23 19:28:46.875 54853 ERROR ovsdbapp.backend.ovs_idl.vlog [-] tcp:127.0.0.1:42597: error parsing update: ovsdb error: Modify non-existing row: ovs.db.error.Error: ovsdb error: Modify non-existing row

just after the network was created. In the good runs a similar error is sometimes visible too, but later during the test. If that is indeed the root cause, it looks like some kind of race condition, but I'm not really sure about that.
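For reference on that error: it comes from the python-ovs IDL's update handling. When a notification contains a "modify" change for a row that the local replica has never seen, the IDL raises ovs.db.error.Error, ovsdbapp logs it as "error parsing update", and (as far as I can tell from python-ovs) the rest of that update message is not applied. A much simplified paraphrase, not the actual library code:

    # Very simplified paraphrase of how the python-ovs Idl applies a single
    # row change from an "update2"/"update3" notification -- not the real code.
    from ovs.db import error

    def apply_row_change(table_rows, uuid, change):
        if 'insert' in change:
            table_rows[uuid] = dict(change['insert'])
        elif 'modify' in change:
            if uuid not in table_rows:
                # This is what ends up logged as "error parsing update:
                # ovsdb error: Modify non-existing row". In the real IDL the
                # exception aborts processing of the whole update message, so
                # other changes bundled in it are dropped as well.
                raise error.Error("Modify non-existing row")
            table_rows[uuid].update(change['modify'])

If that is what happened here, the SB update sent at 19:28:46.868 in the failed run (a Connection "modify" and the Datapath_Binding "insert" in one message) could have been discarded as a whole on the client side, which would fit the race-condition suspicion above.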