Controller redeployment can lead to nailgun DB deadlock
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Fuel for OpenStack | Fix Released | High | Georgy Kibardin |
Future | Fix Committed | High | Georgy Kibardin |
Newton | Fix Released | High | Georgy Kibardin |
Bug Description
Steps to reproduce:
1) Deploy a cluster:
4 nodes with the controller role
1 Cinder node
2) Delete the primary controller.
3) Add another controller.
4) Re-deploy the cluster.
Expected Result:
The cluster is ready after redeployment.
Actual:
Redeployment fails with an error:
2016-09-16 16:27:17 ERROR [7f1886631880] (helpers) Extracting of actor_id failed
Traceback (most recent call last):
File "/usr/lib/
actor_id = action_log.actor_id
AttributeError: 'NoneType' object has no attribute 'actor_id'
2016-09-16 15:42:06 ERROR [7f1886631880] (base) Unexpected exception occured
Traceback (most recent call last):
File "/usr/lib/
return func(cls, *args, **kwargs)
File "<string>", line 2, in PUT
File "/usr/lib/
return func(cls, *args, **kwargs)
File "<string>", line 2, in PUT
File "/usr/lib/
resp = func(cls, *args, **kwargs)
File "/usr/lib/
self.
File "/usr/lib/
return cls.update(
File "/usr/lib/
super(Node, cls).update(
File "/usr/lib/
db().flush()
File "/usr/lib64/
self.
File "/usr/lib64/
transaction
File "/usr/lib64/
compat.
File "/usr/lib64/
flush_
File "/usr/lib64/
rec.
File "/usr/lib64/
uow
File "/usr/lib64/
mapper, table, update)
File "/usr/lib64/
execute(
File "/usr/lib64/
return meth(self, multiparams, params)
File "/usr/lib64/
return connection.
File "/usr/lib64/
compiled_sql, distilled_params
File "/usr/lib64/
context)
File "/usr/lib64/
exc_info
File "/usr/lib64/
reraise(
File "/usr/lib64/
context)
File "/usr/lib64/
cursor.
OperationalError: (psycopg2.
DETAIL: Process 8983 waits for ShareLock on transaction 2768; blocked by process 26677.
Process 26677 waits for ShareLock on transaction 2769; blocked by process 8983.
HINT: See server log for query details.
CONTEXT: SQL statement "SELECT 1 FROM ONLY "public"."clusters" x WHERE "id" OPERATOR(
[SQL: 'UPDATE nodes SET mac=%(mac)s, agent_checksum=
[pid: 7727|app: 0|req: 440/790] 10.109.10.11 () {40 vars in 559 bytes} [Fri Sep 16 15:42:05 2016] PUT /api/nodes/agent/ => generated 39 bytes in 1294 msecs (HTTP/1.1 500) 5 headers in 223 bytes (2 switches on core 0)
[pid: 7727|app: 0|req: 441/791] 10.109.10.1 () {38 vars in 554 bytes} [Fri Sep 16 15:42:07 2016] GET /api/tasks/1 => generated 276 bytes in 8 msecs (HTTP/1.1 200) 4 headers in 185 bytes (2 switches on core 0)
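The DETAIL lines in the log above show the classic deadlock shape: process 8983 waits on a transaction held by process 26677 while 26677 waits on one held by 8983. One common way to avoid this class of problem is to have every writer take its locks in the same global order. The sketch below only illustrates that idea with in-process locks; it is not the fix that was committed for this bug, and `Row`, `lock_order`, and `update_rows` are hypothetical names that do not exist in nailgun.

```python
import threading


class Row:
    """Hypothetical stand-in for a database row guarded by a row-level lock."""

    def __init__(self, table, row_id):
        self.table = table
        self.row_id = row_id
        self.lock = threading.Lock()


def lock_order(row):
    # Assumed global ordering: table name first, then primary key.
    return (row.table, row.row_id)


def update_rows(rows, action):
    """Lock all rows in one fixed order, run action, then release the locks.

    Because every caller sorts with the same key, two callers can never
    each hold a lock that the other one is waiting for.
    """
    ordered = sorted(rows, key=lock_order)
    for row in ordered:
        row.lock.acquire()
    try:
        return action(ordered)
    finally:
        for row in reversed(ordered):
            row.lock.release()


if __name__ == "__main__":
    cluster = Row("clusters", 1)
    node = Row("nodes", 7)
    # The caller passes the rows in "node first" order, but the locks are
    # still taken in the global order (clusters before nodes).
    update_rows([node, cluster], lambda rows: print([r.table for r in rows]))
```

With a real database the same idea would mean locking the affected rows in a consistent order inside each transaction; whether that applies to the code paths involved here cannot be determined from the truncated traceback.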
Changed in fuel: | |
assignee: | Fuel QA Team (fuel-qa) → Fuel Sustaining (fuel-sustaining-team) |
Changed in fuel: | |
milestone: | none → 9.1 |
status: | New → Confirmed |
description: | updated |
Changed in fuel: | |
assignee: | Fuel Sustaining (fuel-sustaining-team) → Dmitry Guryanov (dguryanov) |
Changed in fuel: | |
assignee: | Dmitry Guryanov (dguryanov) → Fuel Sustaining (fuel-sustaining-team) |
tags: |
added: area-python removed: area-library |
summary: |
- fail re-deploys after replacing the main controller
+ Controller redeployment can lead to nailgun DB deadlock |
Changed in fuel: | |
assignee: | Fuel Sustaining (fuel-sustaining-team) → Georgy Kibardin (gkibardin) |
Changed in fuel: | |
milestone: | 9.2 → 9.3 |
Changed in fuel: | |
status: | New → Confirmed |
Changed in fuel: | |
milestone: | 9.x-updates → 9.2-mu-1 |
Changed in fuel: | |
status: | In Progress → Fix Committed |
@Valentyn Yakovlev, please specify the Fuel version and return the issue to the New state.