[tempest] "test_security_group_rules_create" unstable in "neutron-ovs-grenade-dvr-multinode"

Bug #2015065 reported by Rodolfo Alonso
Affects: neutron
Status: Confirmed
Importance: Critical
Assigned to: Lajos Katona
Milestone: (none)
Tags: gate-failure
Revision history for this message
Bence Romsics (bence-romsics) wrote :

Frequencies:

11 occurrences out of the last 100 job runs:
logsearch log --project openstack/neutron --job neutron-ovs-grenade-dvr-multinode --limit 100 --file controller/logs/grenade.sh_log.txt 'test_security_group_rules_create .* FAILED'

15 occurrences out of the last 20 failed job runs:
logsearch log --project openstack/neutron --job neutron-ovs-grenade-dvr-multinode --limit 20 --result FAILURE --file controller/logs/grenade.sh_log.txt 'test_security_group_rules_create .* FAILED'

tags: added: gate-failure
Changed in neutron:
status: New → Confirmed
importance: Undecided → Critical
Revision history for this message
Lajos Katona (lajos-katona) wrote :

I checked a few occurrences, and one interesting thing is that these tests (under tempest.api.compute.security_groups...) call the compute API os-security-group-rules; Nova actually proxies these calls to Neutron via neutronclient (see the sketch below).

In the logs there is no sign of an HTTP timeout (e.g.: https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_5b6/880006/2/check/neutron-ovs-grenade-multinode/5b615cc/controller/logs/grenade.sh_log.txt)
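
For reference, the proxied operation roughly corresponds to the direct python-neutronclient call sketched below (placeholder auth values; this is a hedged sketch, not Nova's actual proxy code):

```
# Sketch with placeholder credentials/endpoints -- not Nova's proxy code,
# just the equivalent neutronclient call that such a compute API request
# ends up making against Neutron.
from keystoneauth1 import identity, session
from neutronclient.v2_0 import client as neutron_client

auth = identity.Password(auth_url="http://controller/identity",
                         username="demo", password="secret",
                         project_name="demo",
                         user_domain_id="default",
                         project_domain_id="default")
neutron = neutron_client.Client(session=session.Session(auth=auth))

# Equivalent of POST /v2.0/security-group-rules on the Neutron side.
neutron.create_security_group_rule({
    "security_group_rule": {
        "security_group_id": "3e848242-0f2c-4c47-8a97-fc4492ca00de",
        "direction": "ingress",
        "protocol": "icmp",
    },
})
```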

Changed in neutron:
assignee: nobody → Lajos Katona (lajos-katona)
Revision history for this message
Lajos Katona (lajos-katona) wrote :

Similar failures also appear in tempest jobs (please check the opensearch link in comment #4).

Revision history for this message
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

The same error is happening with "tempest.api.compute.servers.test_attach_interfaces.AttachInterfacesUnderV243Test.test_add_remove_fixed_ip"

Logs: https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_50b/879827/24/gate/neutron-ovs-grenade-dvr-multinode/50b5e2d/controller/logs/grenade.sh_log.txt

Revision history for this message
yatin (yatinkarel) wrote :

Took a look at this and below are the findings:

- The issue is seen across multiple projects and jobs and is a random one; I found a related bug in nova created long back[1], but that too doesn't have an RCA.
- Both tls and non-tls jobs are impacted, across different jobs/projects/branches[2].
- Since the issue is seen mostly on stable/zed+, I suspected dbcounter was related; to rule it out I tested disabling it in https://review.opendev.org/c/openstack/neutron/+/883282, but the issue still reproduced[5].
- I checked a couple of job logs and saw 2 categories (stuck[3] vs. taking longer[4] than 60 seconds, or 180 seconds in the rally job). The stuck ones are seen only in nova while the non-stuck ones appear across projects, so these can be considered different issues and investigated separately; this bug can focus on the stuck case.
- In some cases oslo_messaging disconnections were seen, so I'm not sure if that's the issue and whether heartbeat_in_pthread=True would help here, or if it's an issue with eventlet/uwsgi thread monkey patching; I think someone from nova may have some idea about this.
- Next I would like to collect a GMR (Guru Meditation Report) if that can give some hint for the issue.

[1] https://bugs.launchpad.net/tempest/+bug/1999893
[2] HTTPSConnectionPool

By branch:
master 93.4%
stable/zed 2.5%
stable/2023.1 1.6%
stable/wallaby 0.8%
stable/victoria 0.8%

By job:
nova-ceph-multistore 49.6%
glance-multistore-cinder-import 13.1%
tempest-ipv6-only 4.9%
cinder-tempest-plugin-lvm-tgt-barbican 4.1%
nova-next 2.5%

By project:
openstack/nova 49.2%
openstack/glance 15.6%
openstack/cinder-tempest-plugin 15.2%
openstack/tempest 6.6%
openstack/neutron 5.7%

HTTPConnectionPool

By job:
nova-grenade-multinode 48.0%
grenade 12.0%
neutron-ovs-grenade-dvr-multinode 12.0%
grenade-skip-level-always 8.0%
neutron-ovs-grenade-multinode 8.0%

By branch:
master 72.0%
stable/2023.1 24.0%
stable/zed 4.0%

By project:
openstack/nova 56.0%
openstack/neutron 24.0%
openstack/cinder 8.0%
openstack/devstack 8.0%
openstack/tempest 4.0%

[3]
http://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_77c/877533/7/check/neutron-ovs-grenade-multinode/77c03cd/testr_results.html

May 09 11:41:36.941261 np0033988595 <email address hidden>[132073]: DEBUG nova.api.openstack.wsgi [None req-8656e772-7302-4b65-aa2e-515d84339e82 tempest-SecurityGroupRulesTestJSON-1968675211 tempest-SecurityGroupRulesTestJSON-1968675211-project-member] Action: 'create', calling method: <function Controller.__getattribute__.<locals>.version_select at 0x7f5b17edc940>, body: {"security_group_rule": {"parent_group_id": "3e848242-0f2c-4c47-8a97-fc4492ca00de", "ip_protocol": "icmp", "from_port": -1, "to_port": -1}} {{(pid=132073) _process_stack /opt/stack/new/nova/nova/api/openstack/wsgi.py:511}}

May 09 11:41:37.006089 np0033988595 neutron-server[126983]: INFO neutron.wsgi [req-8656e772-7302-4b65-aa2e-515d84339e82 req-ef556ac9-17c8-40b2-9569-56287f4e563e tempest-SecurityGroupRulesTestJSON-1968675211 tempest-SecurityGroupRulesTestJSON-1968675211-project-member] 10.209.38.241 "GET /v2.0/security-groups/3e848242-0f2c-4c47-8a97-fc4492ca00de HTTP/1.1" status: 200 len: 2415 time: 0.0520720

Only the security group GET request is seen on the Neutron side, no rule-create request; the nova worker is stuck.

https://657b14bfa6092f8bf722-e...

Revision history for this message
yatin (yatinkarel) wrote :

<<< - Next I would like to collect a GMR if that can give some hint for the issue.

OK, I was able to reproduce and collect it in [1][2].
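
For reference, a GMR is usually wired up via oslo.reports roughly as in the sketch below (version string and conf handling are placeholders; the exact trigger used in the gate run may differ):

```
# Hedged sketch: enabling oslo.reports' Guru Meditation Report in a service,
# so per-thread tracebacks like the one below can be dumped on demand
# (signal or trigger file, depending on configuration).
from oslo_config import cfg
from oslo_reports import guru_meditation_report as gmr
from oslo_reports import opts as gmr_opts

CONF = cfg.CONF
gmr_opts.set_defaults(CONF)

# Placeholder version string; real services pass their own version.
gmr.TextGuruMeditation.setup_autorun(version="example", conf=CONF)
```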

Also did multiple runs disabling dbcounter[3], but it didn't reproduce in the test patch. Since the issue is random, I'm not sure if it's just a coincidence or if dbcounter makes the issue appear more frequently. We can disable it in some jobs and see if it helps in reducing the occurrence, as disabling it won't harm.

Stuck thread traceback:
/opt/stack/new/nova/nova/api/openstack/urlmap.py:305 in __call__
    `return app(environ, start_response)`

/opt/stack/new/nova/nova/api/openstack/urlmap.py:202 in wrap
    `return app(environ, start_response)`

/usr/local/lib/python3.10/dist-packages/webob/dec.py:129 in __call__
    `resp = self.call_func(req, *args, **kw)`

/usr/local/lib/python3.10/dist-packages/webob/dec.py:193 in call_func
    `return self.func(req, *args, **kwargs)`

/usr/local/lib/python3.10/dist-packages/oslo_middleware/base.py:124 in __call__
    `response = req.get_response(self.application)`

/usr/local/lib/python3.10/dist-packages/webob/request.py:1313 in send
    `status, headers, app_iter = self.call_application(`

/usr/local/lib/python3.10/dist-packages/webob/request.py:1278 in call_application
    `app_iter = application(self.environ, start_response)`

/usr/local/lib/python3.10/dist-packages/webob/dec.py:129 in __call__
    `resp = self.call_func(req, *args, **kw)`

/usr/local/lib/python3.10/dist-packages/webob/dec.py:193 in call_func
    `return self.func(req, *args, **kwargs)`

/usr/local/lib/python3.10/dist-packages/oslo_middleware/base.py:124 in __call__
    `response = req.get_response(self.application)`

/usr/local/lib/python3.10/dist-packages/webob/request.py:1313 in send
    `status, headers, app_iter = self.call_application(`

/usr/local/lib/python3.10/dist-packages/webob/request.py:1278 in call_application
    `app_iter = application(self.environ, start_response)`

/usr/local/lib/python3.10/dist-packages/webob/dec.py:129 in __call__
    `resp = self.call_func(req, *args, **kw)`

/usr/local/lib/python3.10/dist-packages/webob/dec.py:193 in call_func
    `return self.func(req, *args, **kwargs)`

/usr/local/lib/python3.10/dist-packages/oslo_middleware/request_id.py:58 in __call__
    `response = req.get_response(self.application)`

/usr/local/lib/python3.10/dist-packages/webob/request.py:1313 in send
    `status, headers, app_iter = self.call_application(`

/usr/local/lib/python3.10/dist-packages/webob/request.py:1278 in call_application
    `app_iter = application(self.environ, start_response)`

/usr/local/lib/python3.10/dist-packages/webob/dec.py:129 in __call__
    `resp = self.call_func(req, *args, **kw)`

/usr/local/lib/python3.10/dist-packages/webob/dec.py:193 in call_func
    `return self.func(req, *args, **kwargs)`

/opt/stack/new/nova/nova/api/openstack/__init__.py:95 in __call__
    `return req.get_response(self.application)`

/usr/local/lib/python3.10/dist-packages/webob/request.py:1313 in send
    `status, headers, app_iter = self.call_application(`

/usr/local/lib/python3.10/dist-packages/webob/request.py:1278 in call_application
    `app_iter = application(self.environ, start_resp...

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/883648

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/883648
Committed: https://opendev.org/openstack/neutron/commit/1d0335810d89ede47cf3b54614382cb25d1986ae
Submitter: "Zuul (22348)"
Branch: master

commit 1d0335810d89ede47cf3b54614382cb25d1986ae
Author: yatinkarel <email address hidden>
Date: Fri May 19 14:59:25 2023 +0530

    Disable mysql gather performance in jobs

    We seeing random issue in CI as mentioned
    in the related bug. As per the tests done
    in [1] seems disabling it make the issue
    appear less frequent. Let's try it atleast
    until the root cause is fixed.

    [1] https://review.opendev.org/c/openstack/neutron/+/883282

    Related-Bug: #2015065
    Change-Id: I2738d161d828e8ab0524281d72ed1930e08e194b

Revision history for this message
Balazs Gibizer (balazs-gibizer) wrote :

I looked at the stack trace of the blocked thread from https://bugs.launchpad.net/neutron/+bug/2015065/comments/8 (thanks Yatin for collecting the trace!)

Based on https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_bdb/882413/2/check/grenade/bdb653c/job-output.txt the environment uses eventlet 0.33.1 and urllib3 1.26.12.

The first interesting step in the stack trace: /usr/local/lib/python3.10/dist-packages/urllib3/util/connection.py:28 in is_connection_dropped

Which is https://github.com/urllib3/urllib3/blob/a5b29ac1025f9bb30f2c9b756f3b171389c2c039/src/urllib3/connectionpool.py#L272

So urllib3 tries to check whether the existing client connection is still usable or has been disconnected:
https://github.com/urllib3/urllib3/blob/a5b29ac1025f9bb30f2c9b756f3b171389c2c039/src/urllib3/util/connection.py#L28

It calls wait_for_read(sock, timeout=0.0), i.e. it checks whether it can read from the socket with a 0.0 timeout.
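
In other words (paraphrasing the linked urllib3 1.26 code, slightly simplified):

```
# Paraphrase of urllib3 1.26's is_connection_dropped (slightly simplified).
from urllib3.util.wait import wait_for_read

def is_connection_dropped(conn):
    sock = getattr(conn, "sock", False)
    if sock is False:   # platform without a usable sock attribute
        return False
    if sock is None:    # connection already closed
        return True
    # Poll only: "is there anything to read right now?" -- must not block.
    return wait_for_read(sock, timeout=0.0)
```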

https://github.com/urllib3/urllib3/blob/a5b29ac1025f9bb30f2c9b756f3b171389c2c039/src/urllib3/util/wait.py#L84-L85

That 0.0 timeout is passed to Python's select.select:
https://docs.python.org/3.10/library/select.html#select.select
"The optional timeout argument specifies a time-out as a floating point number in seconds. When the timeout argument is omitted the function blocks until at least one file descriptor is ready. A time-out value of zero specifies a poll and never blocks."
So select.select called with a 0.0 timeout should never block.
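
As a quick illustration, with stock CPython this is indeed a pure poll that returns immediately:

```
# No monkey patching: select.select with a 0.0 timeout polls and returns
# at once even though nothing is readable on the socket pair.
import select
import socket

a, b = socket.socketpair()
print(select.select([a], [], [], 0.0))   # -> ([], [], [])
```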

BUT

In our environment the eventlet monkey patching replaces Python's select.select, hence the stack trace points to /usr/local/lib/python3.10/dist-packages/eventlet/green/select.py:80 in select

https://github.com/eventlet/eventlet/blob/88ec603404b2ed25c610dead75d4693c7b3e8072/eventlet/green/select.py#L30-L80C32

Looking at that code, it seems eventlet sets a timer with the timeout value via hub.schedule_call_global. Here I'm getting lost in the eventlet code, but I assume scheduling a timer with a 0.0 timeout in eventlet can be racy, based on the comment in https://github.com/eventlet/eventlet/blob/88ec603404b2ed25c610dead75d4693c7b3e8072/eventlet/green/select.py#L62-L69

So one could argue that what we see is an eventlet bug: select.select with timeout=0.0 should never block, yet it does block in our case.

Revision history for this message
Balazs Gibizer (balazs-gibizer) wrote (last edit ):

I tried to create a pure reproducer, but the code below does not hang with eventlet 0.33.1 on py3.10:
```
import eventlet

# Monkey patch first so that socket and select imported below are the
# green (eventlet) versions.
eventlet.monkey_patch()

import socket
import select

def main():
    s1 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s1.connect(("127.0.0.1", 8080))
    # timeout=0.0 should be a pure poll and return immediately.
    print(select.select([s1], [], [], 0.0))

if __name__ == "__main__":
    main()
```
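
To actually run that snippet something must be listening on 127.0.0.1:8080; a throwaway listener like the one below (my addition, not part of the original reproducer) is enough:

```
# Helper: minimal listener so the connect() in the reproducer succeeds.
# Run it in a separate terminal (or thread) before starting the reproducer.
import socket

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(("127.0.0.1", 8080))
srv.listen(1)
srv.accept()  # blocks, keeping the connection open while the client polls
```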

Revision history for this message
Balazs Gibizer (balazs-gibizer) wrote (last edit ):

Forcing a sleep just before https://github.com/eventlet/eventlet/blob/88ec603404b2ed25c610dead75d4693c7b3e8072/eventlet/green/select.py#L80 causes the above reproducer to hang forever. However, I'm not sure that forcing a sleep in that eventlet code validly reproduces the behaviour of a loaded system.

Revision history for this message
yatin (yatinkarel) wrote :

While checking another issue, https://bugs.launchpad.net/grenade/+bug/2020643, which is triggered by this stuck issue, I saw that one request in neutron also timed out after 900 seconds[1].

Tempest triggered a security group delete, but it timed out; the retried request succeeded on the second attempt:

2023-05-22 04:28:48.568 69207 WARNING urllib3.connectionpool [-] Retrying (Retry(total=9, connect=None, read=None, redirect=5, status=None)) after connection broken by 'ReadTimeoutError("HTTPConnectionPool(host='173.231.255.73', port=80): Read timed out. (read timeout=60)")': /compute/v2.1/os-security-groups/76a08f39-c1e7-47e5-b52f-c4de53a46146
2023-05-22 04:28:48.707 69207 INFO tempest.lib.common.rest_client [req-1fd0ae50-5f36-40de-9d48-243f74c59335 req-1fd0ae50-5f36-40de-9d48-243f74c59335 ] Request (SecurityGroupRulesTestJSON:tearDownClass): 202 DELETE http://173.231.255.73/compute/v2.1/os-security-groups/76a08f39-c1e7-47e5-b52f-c4de53a46146 60.179s
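
For context, the client-side behaviour in those two lines matches a urllib3 configuration roughly like the sketch below (host, URL and retry counts are illustrative, not taken from the job config):

```
# Sketch: settings that produce a retry after
# "Read timed out. (read timeout=60)" as in the tempest log above.
import urllib3

http = urllib3.PoolManager(
    timeout=urllib3.Timeout(connect=10.0, read=60.0),  # 60s read timeout
    retries=urllib3.Retry(total=10, redirect=5),       # retry on ReadTimeoutError
)
# resp = http.request(
#     "DELETE",
#     "http://<controller>/compute/v2.1/os-security-groups/<sg-uuid>")
```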

The original request is stuck in nova/neutron and times out after 900s (client_socket_timeout).

nova:-
May 22 04:27:48.532171 np0034092361 <email address hidden>[51094]: DEBUG nova.api.openstack.wsgi [None req-fc94b238-88f2-4e4b-8a25-9ed456ac652a tempest-SecurityGroupRulesTestJSON-54591405 tempest-SecurityGroupRulesTestJSON-54591405-project-member] Calling method '<function Controller.__getattribute__.<locals>.version_select at 0x7f9129c5dea0>' {{(pid=51094) _process_stack /opt/stack/old/nova/nova/api/openstack/wsgi.py:513}}
May 22 04:42:48.780841 np0034092361 <email address hidden>[51094]: DEBUG neutronclient.v2_0.client [None req-fc94b238-88f2-4e4b-8a25-9ed456ac652a tempest-SecurityGroupRulesTestJSON-54591405 tempest-SecurityGroupRulesTestJSON-54591405-project-member] Error message: {"error": {"code": 401, "title": "Unauthorized", "message": "The request you have made requires authentication."}} {{(pid=51094) _handle_fault_response /usr/local/lib/python3.10/dist-packages/neutronclient/v2_0/client.py:262}}
May 22 04:42:48.781975 np0034092361 <email address hidden>[51094]: INFO nova.api.openstack.requestlog [None req-fc94b238-88f2-4e4b-8a25-9ed456ac652a tempest-SecurityGroupRulesTestJSON-54591405 tempest-SecurityGroupRulesTestJSON-54591405-project-member] 173.231.255.73 "DELETE /compute/v2.1/os-security-groups/76a08f39-c1e7-47e5-b52f-c4de53a46146" status: 500 len: 0 microversion: 2.1 time: 900.252509
May 22 04:42:48.792104 np0034092361 <email address hidden>[51094]: Mon May 22 04:42:48 2023 - SIGPIPE: writing to a closed pipe/socket/fd (probably the client disconnected) on request /compute/v2.1/os-security-groups/76a08f39-c1e7-47e5-b52f-c4de53a46146 (ip 173.231.255.73) !!!
May 22 04:42:48.792104 np0034092361 <email address hidden>[51094]: Mon May 22 04:42:48 2023 - uwsgi_response_writev_headers_and_body_do(): Broken pipe [core/writer.c line 306] during DELETE /compute/v2.1/os-security-groups/76a08f39-c1e7-47e5-b52f-c4de53a46146 (173.231.255.73)
May 22 04:42:48.792104 np0034092361 <email address hidden>[51094]: CRITICAL nova [None req-fc94b238-88f2-4e4b-8a25-9ed456ac652a tempest-SecurityGroupRulesTestJSON-54591405 tempest-SecurityGroupRulesTestJSON-54591405-project-member] Unhandled error: OSError: write error...

