SHOW STATUS blocked when wsrep_local_recv_queue>0 and wsrep_sync_wait=1
Affects | Status | Importance | Assigned to | Milestone | ||
---|---|---|---|---|---|---|
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC | Status tracked in 5.6 | |||||
5.5 |
Confirmed
|
Wishlist
|
Unassigned | |||
5.6 |
Confirmed
|
Wishlist
|
Unassigned |
Bug Description
A node who reached max recv_queue and sending Flow Control pause messages, is blocked from executing SHOW STATUS commands until Flow Control is released. This makes monitoring of this host impossible.
Also SHOW STATUS commands are increasing the 'wsrep_
Happens on both PXC 5.6.21 and 5.5.39.
percona10 mysql> select @@version,
+------
| @@version | @@version_comment |
+------
| 5.6.21-70.1-56-log | Percona XtraDB Cluster (GPL), Release rel70.1, Revision 939, WSREP version 25.8, wsrep_25.8.r4150 |
+------
1 row in set (0.00 sec)
percona10 mysql> select @@wsrep_
+------
| @@wsrep_
+------
| 1 | 1 |
+------
1 row in set (0.01 sec)
percona10 mysql> show status like 'wsrep_
+------
| Variable_name | Value |
+------
| wsrep_causal_reads | 31 |
+------
1 row in set (0.00 sec)
percona10 mysql> show status like 'wsrep_
+------
| Variable_name | Value |
+------
| wsrep_causal_reads | 32 |
+------
1 row in set (0.00 sec)
percona30 mysql> select @@wsrep_
+------
| @@wsrep_
+------
| 1 | 1 |
+------
1 row in set (0.00 sec)
-- block one node for writes:
percona30 mysql> flush tables with read lock;
Query OK, 0 rows affected (0.00 sec)
-- run some write workload on other nodes to trigger Flow Control:
percona20 mysql> show processlist;
+----+-
| Id | User | Host | db | Command | Time | State | Info | Rows_sent | Rows_examined |
+----+-
| 1 | system user | | NULL | Sleep | 162 | NULL | NULL | 0 | 0 |
----
| 12 | root | localhost | NULL | Query | 0 | init | show processlist | 0 | 0 |
| 13 | app | 192.168.3.1:59781 | sbtest | Query | 24 | wsrep in pre-commit stage | COMMIT | 0 | 0 |
| 14 | app | 192.168.3.1:59780 | sbtest | Query | 24 | wsrep in pre-commit stage | COMMIT | 0 | 0 |
| 15 | app | 192.168.3.1:59782 | sbtest | Query | 24 | wsrep in pre-commit stage | COMMIT | 0 | 0 |
| 16 | app | 192.168.3.1:59777 | sbtest | Query | 24 | wsrep in pre-commit stage | COMMIT | 0 | 0 |
| 17 | app | 192.168.3.1:59776 | sbtest | Query | 24 | wsrep in pre-commit stage | COMMIT | 0 | 0 |
| 18 | app | 192.168.3.1:59774 | sbtest | Query | 24 | wsrep in pre-commit stage | COMMIT | 0 | 0 |
| 19 | app | 192.168.3.1:59775 | sbtest | Query | 24 | wsrep in pre-commit stage | COMMIT | 0 | 0 |
| 20 | app | 192.168.3.1:59778 | sbtest | Query | 24 | wsrep in pre-commit stage | COMMIT | 0 | 0 |
+----+-
20 rows in set (0.00 sec)
-- show commands work on nodes where wsrep_local_
percona10 mysql> show status like 'ws%flo%';
+------
| Variable_name | Value |
+------
| wsrep_flow_
| wsrep_flow_
| wsrep_flow_
| wsrep_flow_
+------
4 rows in set (0.00 sec)
percona10 mysql> show status like 'ws%recv_queue%';
+------
| Variable_name | Value |
+------
| wsrep_local_
| wsrep_local_
| wsrep_local_
| wsrep_local_
+------
4 rows in set (0.00 sec)
percona20 mysql> show status like 'ws%flo%';
+------
| Variable_name | Value |
+------
| wsrep_flow_
| wsrep_flow_
| wsrep_flow_
| wsrep_flow_
+------
4 rows in set (0.00 sec)
-- but show status commands are blocked on the node where wsrep_local_
percona30 mysql> show processlist;
+----+-
| Id | User | Host | db | Command | Time | State | Info | Rows_sent | Rows_examined |
+----+-
| 1 | system user | | NULL | Sleep | 52 | NULL | NULL | 0 | 0 |
| 2 | system user | | NULL | Sleep | 52 | wsrep aborter idle | NULL | 0 | 0 |
| 3 | system user | | NULL | Sleep | 50 | NULL | NULL | 0 | 0 |
| 4 | system user | | NULL | Sleep | 50 | NULL | NULL | 0 | 0 |
| 5 | system user | | NULL | Sleep | 50 | NULL | NULL | 0 | 0 |
| 6 | root | localhost | NULL | Query | 0 | init | show processlist | 0 | 0 |
+----+-
6 rows in set (0.00 sec)
percona30 mysql> show status like 'ws%flo%';
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
percona30 mysql> show status like 'ha%';
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
percona30 mysql> show global status like 'ws%flo%';
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
I apologize, it seems I misunderstood the status of related former reports:
lp:1126316 and lp:1271177
as if this was fixed already and blocking should no longer occur.
However it seems that this behavior for SHOW STATUS is intended when causal reads are set, and one should disable it per session to make SHOW STATUS to work under Flow Control condition.
I don't understand why blocking of reading the status variables is needed for causal reads, but if this is indeed necessary and intended, let's close this one with duplicate status.
Anyways, the workaround and the only method for any monitoring tools is to do:
percona30 mysql> set session wsrep_sync_wait=0;
Query OK, 0 rows affected (0.00 sec)
percona30 mysql> show status like 'ws%recv_q%'; ------- ------- ------- -+----- ------+ ------- ------- ------- -+----- ------+ recv_queue | 29 | recv_queue_ max | 30 | recv_queue_ min | 0 | recv_queue_ avg | 12.584416 | ------- ------- ------- -+----- ------+
+------
| Variable_name | Value |
+------
| wsrep_local_
| wsrep_local_
| wsrep_local_
| wsrep_local_
+------
4 rows in set (0.00 sec)