[ERROR] WSREP: FSM: no such a transition JOINED -> JOINED
Affects | Status | Importance | Assigned to | Milestone | ||
---|---|---|---|---|---|---|
MySQL patches by Codership |
New
|
Undecided
|
Unassigned | |||
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC | Status tracked in 5.6 | |||||
5.6 |
Fix Released
|
Undecided
|
Unassigned |
Bug Description
Stumbled upon this case when testing a similar customer test scenario but got a different result.
Crashed server after executing "SET GLOBAL WSREP_DESYNC=OFF;" when node was set to WSREP_DESYNC=ON state while SST was in progress.
2016-10-01 10:29:07 6621 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 104041)
2016-10-01 10:29:07 6621 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-10-01 10:29:07 6621 [Note] WSREP: Running: 'wsrep_
2016-10-01 10:29:07 6621 [Note] WSREP: sst_donor_thread signaled with 0
WSREP_SST: [INFO] Logging all stderr of SST/Innobackupex to syslog (20161001 10:29:07.221)
2016-10-01 10:29:09 6621 [Note] WSREP: (33d39fca, 'tcp://
2016-10-01 10:29:18 6621 [Note] WSREP: 1.0 (pxc2): State transfer to 2.0 (pxc3) complete.
2016-10-01 10:29:18 6621 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 104041)
2016-10-01 10:29:18 6621 [Note] WSREP: SYNC message ignored as node 1.0 (pxc2) was re-transitioned to DONOR mode before it synced.
2016-10-01 10:29:25 6621 [Note] WSREP: 2.0 (pxc3): State transfer from 1.0 (pxc2) complete.
2016-10-01 10:29:25 6621 [Note] WSREP: Member 2.0 (pxc3) synced with group.
2016-10-01 10:29:39 6621 [Note] WSREP: 1.0 (pxc2): State transfer to 2.0 (pxc3) complete.
2016-10-01 10:29:39 6621 [ERROR] WSREP: FSM: no such a transition JOINED -> JOINED
14:29:39 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona XtraDB Cluster better by reporting any
bugs at https:/
key_buffer_
read_buffer_
max_used_
max_threads=153
thread_count=3
connection_count=1
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x7fb514000990
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fb54c6f1998 thread_stack 0x40000
/usr/sbin/
/usr/sbin/
/lib64/
/lib64/
/lib64/
/usr/lib64/
/usr/lib64/
/usr/lib64/
/usr/lib64/
/usr/lib64/
/usr/lib64/
/usr/sbin/
/usr/sbin/
/lib64/
/lib64/
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0): is an invalid pointer
Connection ID (thread ID): 2
Status: NOT_KILLED
You may download the Percona XtraDB Cluster operations manual by visiting
http://
in the manual which will help you identify the cause of the crash.
161001 10:29:39 mysqld_safe Number of processes running now: 0
161001 10:29:39 mysqld_safe WSREP: not restarting wsrep node automatically
161001 10:29:39 mysqld_safe mysqld from pid file /var/lib/
How to repeat:
- Start joiner node to SST from donor node.
- As soon as donor node goes to DONOR\DESYNCED (Shifting SYNCED -> DONOR/DESYNCED) state execute:
select @@global.
+------
| Variable_name | Value |
+------
| wsrep_desync | OFF |
+------
1 row in set (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
+------
| Variable_name | Value |
+------
| wsrep_desync | ON |
+------
1 row in set (0.00 sec)
- Wait until node goes to JOINED (Shifting DONOR/DESYNCED -> JOINED) state and state transfer completes, execute:
set @@global.
PXC:root@
Query OK, 0 rows affected (0.00 sec)
[root@pxc2 ~]# rpm -qa|grep Percona-XtraDB
Percona-
Percona-
Percona-
Percona-
Percona-
I can confirm the donor crashes when wsrep_desync is set during SST session, and later is tried to be turned off.
[7cf5e9886947] {root} ((none)) > select @@version, @@version_ comment\ G ******* ******* ****** 1. row ******* ******* ******* ******
*******
@@version: 5.6.30-76.3-56-log
@@version_comment: Percona XtraDB Cluster (GPL), Release rel76.3, Revision 64742ac, WSREP version 25.16, wsrep_25.16
1 row in set (0.00 sec)
First, donor is chosen to serve full SST, and during the SST was in progress, on donor the explicit desync was executed:
[7cf5e9886947] {root} ((none)) > set global wsrep_desync = ON;
Query OK, 0 rows affected (0.00 sec)
Log from donor:
2016-10-02 13:10:47 10465 [Note] WSREP: Member 1.0 (node3) requested state transfer from 'node1'. Selected 0.0 (node1)(SYNCED) as donor. sst_xtrabackup- v2 --role 'donor' --address '172.17. 0.3:4444/ xtrabackup_ sst//1' --socket '/var/run/ mysqld/ mysqld. sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --defaults- group-suffix '' --binlog 'mysql-binlog' --gtid '3e6fc59a- 75a5-11e6- 8d27-c6583ca1a8 44:26066' ' 17.0.3: 4444; RC=( ${PIPESTATUS[@]} ) (20161002 13:10:47.448) 0.0.0.0: 4567') turning message relay requesting off file=/etc/ my.cnf --defaults- group=mysqld --no-version-check $tmpopts $INNOEXTRA --galera-info --stream=$sfmt $itmpdir 2>${DATA} /innobackup. backup. log | socat -u stdio TCP:172. 17.0.3: 4444; RC=( ${PIPESTATUS[@]} ) (20161002 13:10:57.460)
2016-10-02 13:10:47 10465 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 26066)
2016-10-02 13:10:47 10465 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-10-02 13:10:47 10465 [Note] WSREP: Running: 'wsrep_
2016-10-02 13:10:47 10465 [Note] WSREP: sst_donor_thread signaled with 0
WSREP_SST: [INFO] Streaming with xbstream (20161002 13:10:47.426)
WSREP_SST: [INFO] Using socat as streamer (20161002 13:10:47.428)
WSREP_SST: [INFO] Using /tmp/tmp.85qn89WXgV as xtrabackup temporary directory (20161002 13:10:47.441)
WSREP_SST: [INFO] Using /tmp/tmp.vOdDwTkzdD as innobackupex temporary directory (20161002 13:10:47.443)
WSREP_SST: [INFO] Streaming GTID file before SST (20161002 13:10:47.447)
WSREP_SST: [INFO] Evaluating xbstream -c ${INFO_FILE} | socat -u stdio TCP:172.
WSREP_SST: [INFO] Sleeping before data transfer for SST (20161002 13:10:47.455)
2016-10-02 13:10:49 10465 [Note] WSREP: (864f40f1, 'tcp://
WSREP_SST: [INFO] Streaming the backup to joiner at 172.17.0.3 4444 (20161002 13:10:57.457)
WSREP_SST: [INFO] Evaluating innobackupex --defaults-
2016-10-02 13:11:08 10465 [Note] WSREP: 0.0 (node1): State transfer to 1.0 (node3) complete.
2016-10-02 13:11:08 10465 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 26066)
2016-10-02 13:11:08 10465 [Note] WSREP: SYNC message ignored as node 0.0 (node1) was re-transitioned to DONOR mode before it synced.
WSREP_SST: [INFO] Total time on donor: 0 seconds (20161002 13:11:08.717)
WSREP_SST: [INFO] Cleaning up temporary directories (20161002 13:11:08.719)
2016-10-02 13:11:43 10465 [Note] WSREP: 1.0 (node3): State transfer from 0.0 (node1) complete.
2016-10-02 13:11:43 10465 [Note] WSREP: Member 1.0 (node3) synced with group.
After that, donor state:
[7cf5e9886947] {ro...