There's deadlock in check_quota_met that can occur under the following conditions:
- the Quota entry of the project doesn't exist yet (first POST call to the Octavia API by a tenant)
- 2 resources are created at the same time
It may not impact production environments as the deadlock may occur only in the 2 first (simultaneous) API calls to Octavia in a project.
But it may happen in CI as the tests are executed in parallel in a newly created project.
After the first API call to Octavia, the Quota entry exists in the DB, and deadlock is not reproducible
To reproduce it, I run a loop with:
- delete the quota entry from the DB
- create 8 load balancers simultaneously
- if the deadlock occurs, an exception is raised
- delete the load balancers
It needs only a few iterations of the loop to reproduce it.
[..]
Oct 08 10:28:51 gthiemon-devstack <email address hidden>[2890367]: DEBUG octavia.db.repositories [None req-fdc9e4bb-63a5-4383-b64c-25d9ac950604 demo admin] Checking quota for project: 0758ab98a0d9459a958bb7acf51c4805 object: <class 'octavia.common.data_models.LoadBalancer'> {{(pid=2890367) check_quota_met /opt/stack/octavia/octavia/db/repositories.py:382}}
Oct 08 10:28:51 gthiemon-devstack <email address hidden>[2890366]: DEBUG octavia.db.repositories [None req-bbe9492f-aa48-43e2-8d1e-6531df04be5f demo admin] Checking quota for project: 0758ab98a0d9459a958bb7acf51c4805 object: <class 'octavia.common.data_models.LoadBalancer'> {{(pid=2890366) check_quota_met /opt/stack/octavia/octavia/db/repositories.py:382}}
Oct 08 10:28:51 gthiemon-devstack <email address hidden>[2890366]: DEBUG oslo_db.api [None req-bbe9492f-aa48-43e2-8d1e-6531df04be5f demo admin] DB error: (pymysql.err.OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction')
Oct 08 10:28:51 gthiemon-devstack <email address hidden>[2890366]: [SQL: INSERT INTO quotas (project_id, health_monitor, listener, load_balancer, member, pool, l7policy, l7rule, in_use_health_monitor, in_use_listener, in_use_load_balancer, in_use_member, in_use_pool, in_use_l7policy, in_use_l7rule) VALUES (%(project_id)s, %(health_monitor)s, %(listener)s, %(load_balancer)s, %(member)s, %(pool)s, %(l7policy)s, %(l7rule)s, %(in_use_health_monitor)s, %(in_use_listener)s, %(in_use_load_balancer)s, %(in_use_member)s, %(in_use_pool)s, %(in_use_l7policy)s, %(in_use_l7rule)s)]
Oct 08 10:28:51 gthiemon-devstack <email address hidden>[2890366]: [parameters: {'project_id': '0758ab98a0d9459a958bb7acf51c4805', 'health_monitor': None, 'listener': None, 'load_balancer': None, 'member': None, 'pool': None, 'l7policy': None, 'l7rule': None, 'in_use_health_monitor': None, 'in_use_listener': None, 'in_use_load_balancer': None, 'in_use_member': None, 'in_use_pool': None, 'in_use_l7policy': None, 'in_use_l7rule': None}]
Oct 08 10:28:51 gthiemon-devstack <email address hidden>[2890366]: (Background on this error at: https://sqlalche.me/e/14/e3q8) {{(pid=2890366) _is_exception_expected /usr/local/lib/python3.9/site-packages/oslo_db/api.py:184}}
Oct 08 10:28:51 gthiemon-devstack <email address hidden>[2890366]: DEBUG oslo_db.api [None req-bbe9492f-aa48-43e2-8d1e-6531df04be5f demo admin] Performing DB retry for function octavia.db.repositories.QuotasRepository.update {{(pid=2890366) wrapper /usr/local/lib/python3.9/site-packages/oslo_db/api.py:155}}
[..]
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890367]: INFO octavia.api.drivers.amphora_driver.v2.driver [None req-fdc9e4bb-63a5-4383-b64c-25d9ac950604 demo admin] Amphora provider created VIP port d4a6b6af-b82f-4e79-a805-3bcc2734c143 for load balancer 9b9adabb-405b-40fd-92ce-929ca74efbc8.
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890367]: INFO octavia.api.v2.controllers.load_balancer [None req-fdc9e4bb-63a5-4383-b64c-25d9ac950604 demo admin] Sending create Load Balancer 9b9adabb-405b-40fd-92ce-929ca74efbc8 to provider amphora
Oct 08 10:28:52 gthiemon-devstack octavia-worker[2890454]: INFO octavia.controller.queue.v2.endpoints [-] Creating load balancer '9b9adabb-405b-40fd-92ce-929ca74efbc8'...
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890367]: [pid: 2890367|app: 0|req: 16/35] 192.168.1.101 () {66 vars in 1335 bytes} [Sun Oct 8 10:28:51 2023] POST /load-balancer/v2.0/lbaas/loadbalancers => generated 717 bytes in 562 msecs (HTTP/1.1 201) 4 headers in 162 bytes (1 switches on core 0)
Oct 08 10:28:52 gthiemon-devstack octavia-worker[2890454]: DEBUG octavia.common.base_taskflow [-] Waiting for job get_create_load_balancer_flow-46bdb5e7-06c8-43a1-8f9e-240e44493431 to finish {{(pid=2890454) _wait_for_job /opt/stack/octavia/octavia/common/base_taskflow.py:203}}
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890367]: DEBUG octavia.db.repositories [None req-0e9a0198-0bd6-41ce-ad42-9e5ec14f093c demo admin] Checking quota for project: 0758ab98a0d9459a958bb7acf51c4805 object: <class 'octavia.common.data_models.LoadBalancer'> {{(pid=2890367) check_quota_met /opt/stack/octavia/octavia/db/repositories.py:382}}
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890367]: INFO octavia.api.drivers.amphora_driver.v2.driver [None req-0e9a0198-0bd6-41ce-ad42-9e5ec14f093c demo admin] Amphora provider created VIP port 15318a06-bea4-4955-ac65-f5fe0ccc252a for load balancer 23e87030-1708-422b-b7f4-06aa2911ee4c.
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890367]: INFO octavia.api.v2.controllers.load_balancer [None req-0e9a0198-0bd6-41ce-ad42-9e5ec14f093c demo admin] Sending create Load Balancer 23e87030-1708-422b-b7f4-06aa2911ee4c to provider amphora
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890367]: [pid: 2890367|app: 0|req: 17/36] 192.168.1.101 () {66 vars in 1335 bytes} [Sun Oct 8 10:28:52 2023] POST /load-balancer/v2.0/lbaas/loadbalancers => generated 717 bytes in 478 msecs (HTTP/1.1 201) 4 headers in 162 bytes (1 switches on core 0)
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: ERROR wsme.api [None req-bbe9492f-aa48-43e2-8d1e-6531df04be5f demo admin] Server-side error: "This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: (pymysql.err.OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction')
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: [SQL: INSERT INTO quotas (project_id, health_monitor, listener, load_balancer, member, pool, l7policy, l7rule, in_use_health_monitor, in_use_listener, in_use_load_balancer, in_use_member, in_use_pool, in_use_l7policy, in_use_l7rule) VALUES (%(project_id)s, %(health_monitor)s, %(listener)s, %(load_balancer)s, %(member)s, %(pool)s, %(l7policy)s, %(l7rule)s, %(in_use_health_monitor)s, %(in_use_listener)s, %(in_use_load_balancer)s, %(in_use_member)s, %(in_use_pool)s, %(in_use_l7policy)s, %(in_use_l7rule)s)]
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: [parameters: {'project_id': '0758ab98a0d9459a958bb7acf51c4805', 'health_monitor': None, 'listener': None, 'load_balancer': None, 'member': None, 'pool': None, 'l7policy': None, 'l7rule': None, 'in_use_health_monitor': None, 'in_use_listener': None, 'in_use_load_balancer': None, 'in_use_member': None, 'in_use_pool': None, 'in_use_l7policy': None, 'in_use_l7rule': None}]
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: (Background on this error at: https://sqlalche.me/e/14/e3q8) (Background on this error at: https://sqlalche.me/e/14/7s2a)". Detail:
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: Traceback (most recent call last):
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/usr/local/lib/python3.9/site-packages/wsmeext/pecan.py", line 82, in callfunction
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: result = f(self, *args, **kwargs)
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/opt/stack/octavia/octavia/api/v2/controllers/load_balancer.py", line 577, in post
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: lock_session.rollback()
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/usr/local/lib/python3.9/site-packages/oslo_utils/excutils.py", line 227, in __exit__
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: self.force_reraise()
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/usr/local/lib/python3.9/site-packages/oslo_utils/excutils.py", line 200, in force_reraise
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: raise self.value
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/opt/stack/octavia/octavia/api/v2/controllers/load_balancer.py", line 467, in post
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: if self.repositories.check_quota_met(
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/opt/stack/octavia/octavia/db/repositories.py", line 396, in check_quota_met
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: self.quotas.update(
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/usr/local/lib/python3.9/site-packages/oslo_db/api.py", line 154, in wrapper
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: ectxt.value = e.inner_exc
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/usr/local/lib/python3.9/site-packages/oslo_utils/excutils.py", line 227, in __exit__
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: self.force_reraise()
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/usr/local/lib/python3.9/site-packages/oslo_utils/excutils.py", line 200, in force_reraise
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: raise self.value
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/usr/local/lib/python3.9/site-packages/oslo_db/api.py", line 142, in wrapper
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: return f(*args, **kwargs)
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/opt/stack/octavia/octavia/db/repositories.py", line 1895, in update
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: session.query(self.model_class)
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/usr/local/lib64/python3.9/site-packages/sqlalchemy/orm/query.py", line 2823, in first
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: return self.limit(1)._iter().first()
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/usr/local/lib64/python3.9/site-packages/sqlalchemy/orm/query.py", line 2907, in _iter
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: result = self.session.execute(
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/usr/local/lib64/python3.9/site-packages/sqlalchemy/orm/session.py", line 1711, in execute
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: conn = self._connection_for_bind(bind)
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/usr/local/lib64/python3.9/site-packages/sqlalchemy/orm/session.py", line 1552, in _connection_for_bind
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: return self._transaction._connection_for_bind(
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/usr/local/lib64/python3.9/site-packages/sqlalchemy/orm/session.py", line 721, in _connection_for_bind
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: self._assert_active()
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/usr/local/lib64/python3.9/site-packages/sqlalchemy/orm/session.py", line 601, in _assert_active
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: raise sa_exc.PendingRollbackError(
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: sqlalchemy.exc.PendingRollbackError: This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: (pymysql.err.OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction')
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: [SQL: INSERT INTO quotas (project_id, health_monitor, listener, load_balancer, member, pool, l7policy, l7rule, in_use_health_monitor, in_use_listener, in_use_load_balancer, in_use_member, in_use_pool, in_use_l7policy, in_use_l7rule) VALUES (%(project_id)s, %(health_monitor)s, %(listener)s, %(load_balancer)s, %(member)s, %(pool)s, %(l7policy)s, %(l7rule)s, %(in_use_health_monitor)s, %(in_use_listener)s, %(in_use_load_balancer)s, %(in_use_member)s, %(in_use_pool)s, %(in_use_l7policy)s, %(in_use_l7rule)s)]
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: [parameters: {'project_id': '0758ab98a0d9459a958bb7acf51c4805', 'health_monitor': None, 'listener': None, 'load_balancer': None, 'member': None, 'pool': None, 'l7policy': None, 'l7rule': None, 'in_use_health_monitor': None, 'in_use_listener': None, 'in_use_load_balancer': None, 'in_use_member': None, 'in_use_pool': None, 'in_use_l7policy': None, 'in_use_l7rule': None}]
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: (Background on this error at: https://sqlalche.me/e/14/e3q8) (Background on this error at: https://sqlalche.me/e/14/7s2a)
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: : sqlalchemy.exc.PendingRollbackError: This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: (pymysql.err.OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction')
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: [pid: 2890366|app: 0|req: 20/37] 192.168.1.101 () {66 vars in 1335 bytes} [Sun Oct 8 10:28:51 2023] POST /load-balancer/v2.0/lbaas/loadbalancers => generated 1366 bytes in 1147 msecs (HTTP/1.1 500) 4 headers in 177 bytes (1 switches on core 0)
Note: it happened in the CI in octavia- v2-dsvm- scenario- non-traffic- ops
https:/ /review. opendev. org/c/openstack /octavia/ +/897519/ 4
https:/ /zuul.opendev. org/t/openstack /build/ 5753eeeb674f435 98e9b0e5cf87145 20