First, donor is chosen to serve full SST, and during the SST was in progress, on donor the explicit desync was executed:
[7cf5e9886947] {root} ((none)) > set global wsrep_desync = ON;
Query OK, 0 rows affected (0.00 sec)
Log from donor:
2016-10-02 13:10:47 10465 [Note] WSREP: Member 1.0 (node3) requested state transfer from 'node1'. Selected 0.0 (node1)(SYNCED) as donor.
2016-10-02 13:10:47 10465 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 26066)
2016-10-02 13:10:47 10465 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-10-02 13:10:47 10465 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'donor' --address '172.17.0.3:4444/xtrabackup_sst//1' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --defaults-group-suffix '' --binlog 'mysql-binlog' --gtid '3e6fc59a-75a5-11e6-8d27-c6583ca1a844:26066''
2016-10-02 13:10:47 10465 [Note] WSREP: sst_donor_thread signaled with 0
WSREP_SST: [INFO] Streaming with xbstream (20161002 13:10:47.426)
WSREP_SST: [INFO] Using socat as streamer (20161002 13:10:47.428)
WSREP_SST: [INFO] Using /tmp/tmp.85qn89WXgV as xtrabackup temporary directory (20161002 13:10:47.441)
WSREP_SST: [INFO] Using /tmp/tmp.vOdDwTkzdD as innobackupex temporary directory (20161002 13:10:47.443)
WSREP_SST: [INFO] Streaming GTID file before SST (20161002 13:10:47.447)
WSREP_SST: [INFO] Evaluating xbstream -c ${INFO_FILE} | socat -u stdio TCP:172.17.0.3:4444; RC=( ${PIPESTATUS[@]} ) (20161002 13:10:47.448)
WSREP_SST: [INFO] Sleeping before data transfer for SST (20161002 13:10:47.455)
2016-10-02 13:10:49 10465 [Note] WSREP: (864f40f1, 'tcp://0.0.0.0:4567') turning message relay requesting off
WSREP_SST: [INFO] Streaming the backup to joiner at 172.17.0.3 4444 (20161002 13:10:57.457)
WSREP_SST: [INFO] Evaluating innobackupex --defaults-file=/etc/my.cnf --defaults-group=mysqld --no-version-check $tmpopts $INNOEXTRA --galera-info --stream=$sfmt $itmpdir 2>${DATA}/innobackup.backup.log | socat -u stdio TCP:172.17.0.3:4444; RC=( ${PIPESTATUS[@]} ) (20161002 13:10:57.460)
2016-10-02 13:11:08 10465 [Note] WSREP: 0.0 (node1): State transfer to 1.0 (node3) complete.
2016-10-02 13:11:08 10465 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 26066)
2016-10-02 13:11:08 10465 [Note] WSREP: SYNC message ignored as node 0.0 (node1) was re-transitioned to DONOR mode before it synced.
WSREP_SST: [INFO] Total time on donor: 0 seconds (20161002 13:11:08.717)
WSREP_SST: [INFO] Cleaning up temporary directories (20161002 13:11:08.719)
2016-10-02 13:11:43 10465 [Note] WSREP: 1.0 (node3): State transfer from 0.0 (node1) complete.
2016-10-02 13:11:43 10465 [Note] WSREP: Member 1.0 (node3) synced with group.
After that, donor state:
[7cf5e9886947] {root} ((none)) > show status like 'wsrep_local_state_comment';
+---------------------------+--------+
| Variable_name | Value |
+---------------------------+--------+
| wsrep_local_state_comment | Joined |
+---------------------------+--------+
1 row in set (0.00 sec)
So when tried to again explicitly disable desync:
[7cf5e9886947] {root} ((none)) > set global wsrep_desync = 0;
Query OK, 0 rows affected (0.00 sec)
[7cf5e9886947] {root} ((none)) > show status like 'wsrep_local_state_comment';
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (111)
ERROR:
Can't connect to the server
Further donor's log:
2016-10-02 13:13:28 10465 [Note] WSREP: 0.0 (node1): State transfer to 1.0 (node3) complete.
2016-10-02 13:13:28 10465 [ERROR] WSREP: FSM: no such a transition JOINED -> JOINED
13:13:28 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona XtraDB Cluster better by reporting any
bugs at https://bugs.launchpad.net/percona-xtradb-cluster
key_buffer_size=25165824
read_buffer_size=131072
max_used_connections=2
max_threads=202
thread_count=3
connection_count=1
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 105041 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x7f219c000990
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f21c16bb998 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x35)[0x907295]
/usr/sbin/mysqld(handle_fatal_signal+0x4b4)[0x66b074]
/lib64/libpthread.so.0(+0xf7e0)[0x7f21c14057e0]
/lib64/libc.so.6(gsignal+0x35)[0x7f21bf63e5e5]
/lib64/libc.so.6(abort+0x175)[0x7f21bf63fdc5]
/usr/lib64/libgalera_smm.so(_ZN6galera3FSMINS_10Replicator5StateENS_13ReplicatorSMM10TransitionENS_10EmptyGuardENS3_11StateActionEE8shift_toES2_+0x212)[0x7f21bd5e4302]
/usr/lib64/libgalera_smm.so(_ZN6galera13ReplicatorSMM12process_joinEll+0x178)[0x7f21bd5df478]
/usr/lib64/libgalera_smm.so(_ZN6galera15GcsActionSource8dispatchEPvRK10gcs_actionRb+0x21d)[0x7f21bd5ba87d]
/usr/lib64/libgalera_smm.so(_ZN6galera15GcsActionSource7processEPvRb+0x63)[0x7f21bd5baed3]
/usr/lib64/libgalera_smm.so(_ZN6galera13ReplicatorSMM10async_recvEPv+0x73)[0x7f21bd5df8a3]
/usr/lib64/libgalera_smm.so(galera_recv+0x24)[0x7f21bd5efe14]
/usr/sbin/mysqld[0x5a21f1]
/usr/sbin/mysqld(start_wsrep_THD+0x36e)[0x58b83e]
/lib64/libpthread.so.0(+0x7aa1)[0x7f21c13fdaa1]
/lib64/libc.so.6(clone+0x6d)[0x7f21bf6f4aad]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0): is an invalid pointer
Connection ID (thread ID): 2
Status: NOT_KILLED
You may download the Percona XtraDB Cluster operations manual by visiting http://www.percona.com/software/percona-xtradb-cluster/. You may find information
in the manual which will help you identify the cause of the crash.
161002 13:13:28 mysqld_safe Number of processes running now: 0
161002 13:13:28 mysqld_safe WSREP: not restarting wsrep node automatically
161002 13:13:28 mysqld_safe mysqld from pid file /tmp/mysql.pid ended
I can confirm the donor crashes when wsrep_desync is set during SST session, and later is tried to be turned off.
[7cf5e9886947] {root} ((none)) > select @@version, @@version_ comment\ G ******* ******* ****** 1. row ******* ******* ******* ******
*******
@@version: 5.6.30-76.3-56-log
@@version_comment: Percona XtraDB Cluster (GPL), Release rel76.3, Revision 64742ac, WSREP version 25.16, wsrep_25.16
1 row in set (0.00 sec)
First, donor is chosen to serve full SST, and during the SST was in progress, on donor the explicit desync was executed:
[7cf5e9886947] {root} ((none)) > set global wsrep_desync = ON;
Query OK, 0 rows affected (0.00 sec)
Log from donor:
2016-10-02 13:10:47 10465 [Note] WSREP: Member 1.0 (node3) requested state transfer from 'node1'. Selected 0.0 (node1)(SYNCED) as donor. sst_xtrabackup- v2 --role 'donor' --address '172.17. 0.3:4444/ xtrabackup_ sst//1' --socket '/var/run/ mysqld/ mysqld. sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --defaults- group-suffix '' --binlog 'mysql-binlog' --gtid '3e6fc59a- 75a5-11e6- 8d27-c6583ca1a8 44:26066' ' 17.0.3: 4444; RC=( ${PIPESTATUS[@]} ) (20161002 13:10:47.448) 0.0.0.0: 4567') turning message relay requesting off file=/etc/ my.cnf --defaults- group=mysqld --no-version-check $tmpopts $INNOEXTRA --galera-info --stream=$sfmt $itmpdir 2>${DATA} /innobackup. backup. log | socat -u stdio TCP:172. 17.0.3: 4444; RC=( ${PIPESTATUS[@]} ) (20161002 13:10:57.460)
2016-10-02 13:10:47 10465 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 26066)
2016-10-02 13:10:47 10465 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-10-02 13:10:47 10465 [Note] WSREP: Running: 'wsrep_
2016-10-02 13:10:47 10465 [Note] WSREP: sst_donor_thread signaled with 0
WSREP_SST: [INFO] Streaming with xbstream (20161002 13:10:47.426)
WSREP_SST: [INFO] Using socat as streamer (20161002 13:10:47.428)
WSREP_SST: [INFO] Using /tmp/tmp.85qn89WXgV as xtrabackup temporary directory (20161002 13:10:47.441)
WSREP_SST: [INFO] Using /tmp/tmp.vOdDwTkzdD as innobackupex temporary directory (20161002 13:10:47.443)
WSREP_SST: [INFO] Streaming GTID file before SST (20161002 13:10:47.447)
WSREP_SST: [INFO] Evaluating xbstream -c ${INFO_FILE} | socat -u stdio TCP:172.
WSREP_SST: [INFO] Sleeping before data transfer for SST (20161002 13:10:47.455)
2016-10-02 13:10:49 10465 [Note] WSREP: (864f40f1, 'tcp://
WSREP_SST: [INFO] Streaming the backup to joiner at 172.17.0.3 4444 (20161002 13:10:57.457)
WSREP_SST: [INFO] Evaluating innobackupex --defaults-
2016-10-02 13:11:08 10465 [Note] WSREP: 0.0 (node1): State transfer to 1.0 (node3) complete.
2016-10-02 13:11:08 10465 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 26066)
2016-10-02 13:11:08 10465 [Note] WSREP: SYNC message ignored as node 0.0 (node1) was re-transitioned to DONOR mode before it synced.
WSREP_SST: [INFO] Total time on donor: 0 seconds (20161002 13:11:08.717)
WSREP_SST: [INFO] Cleaning up temporary directories (20161002 13:11:08.719)
2016-10-02 13:11:43 10465 [Note] WSREP: 1.0 (node3): State transfer from 0.0 (node1) complete.
2016-10-02 13:11:43 10465 [Note] WSREP: Member 1.0 (node3) synced with group.
After that, donor state:
[7cf5e9886947] {root} ((none)) > show status like 'wsrep_ local_state_ comment' ; ------- ------- ------- +------ --+ ------- ------- ------- +------ --+ state_comment | Joined | ------- ------- ------- +------ --+
+------
| Variable_name | Value |
+------
| wsrep_local_
+------
1 row in set (0.00 sec)
So when tried to again explicitly disable desync:
[7cf5e9886947] {root} ((none)) > set global wsrep_desync = 0;
Query OK, 0 rows affected (0.00 sec)
[7cf5e9886947] {root} ((none)) > show status like 'wsrep_ local_state_ comment' ; mysqld/ mysqld. sock' (111)
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/
ERROR:
Can't connect to the server
Further donor's log:
2016-10-02 13:13:28 10465 [Note] WSREP: 0.0 (node1): State transfer to 1.0 (node3) complete. /bugs.launchpad .net/percona- xtradb- cluster
2016-10-02 13:13:28 10465 [ERROR] WSREP: FSM: no such a transition JOINED -> JOINED
13:13:28 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona XtraDB Cluster better by reporting any
bugs at https:/
key_buffer_ size=25165824 size=131072 connections= 2 size)*max_ threads = 105041 K bytes of memory
read_buffer_
max_used_
max_threads=202
thread_count=3
connection_count=1
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x7f219c000990 mysqld( my_print_ stacktrace+ 0x35)[0x907295] mysqld( handle_ fatal_signal+ 0x4b4)[ 0x66b074] libpthread. so.0(+0xf7e0) [0x7f21c14057e0 ] libc.so. 6(gsignal+ 0x35)[0x7f21bf6 3e5e5] libc.so. 6(abort+ 0x175)[ 0x7f21bf63fdc5] libgalera_ smm.so( _ZN6galera3FSMI NS_10Replicator 5StateENS_ 13ReplicatorSMM 10TransitionENS _10EmptyGuardEN S3_11StateActio nEE8shift_ toES2_+ 0x212)[ 0x7f21bd5e4302] libgalera_ smm.so( _ZN6galera13Rep licatorSMM12pro cess_joinEll+ 0x178)[ 0x7f21bd5df478] libgalera_ smm.so( _ZN6galera15Gcs ActionSource8di spatchEPvRK10gc s_actionRb+ 0x21d)[ 0x7f21bd5ba87d] libgalera_ smm.so( _ZN6galera15Gcs ActionSource7pr ocessEPvRb+ 0x63)[0x7f21bd5 baed3] libgalera_ smm.so( _ZN6galera13Rep licatorSMM10asy nc_recvEPv+ 0x73)[0x7f21bd5 df8a3] libgalera_ smm.so( galera_ recv+0x24) [0x7f21bd5efe14 ] mysqld[ 0x5a21f1] mysqld( start_wsrep_ THD+0x36e) [0x58b83e] libpthread. so.0(+0x7aa1) [0x7f21c13fdaa1 ] libc.so. 6(clone+ 0x6d)[0x7f21bf6 f4aad]
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f21c16bb998 thread_stack 0x40000
/usr/sbin/
/usr/sbin/
/lib64/
/lib64/
/lib64/
/usr/lib64/
/usr/lib64/
/usr/lib64/
/usr/lib64/
/usr/lib64/
/usr/lib64/
/usr/sbin/
/usr/sbin/
/lib64/
/lib64/
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0): is an invalid pointer
Connection ID (thread ID): 2
Status: NOT_KILLED
You may download the Percona XtraDB Cluster operations manual by visiting www.percona. com/software/ percona- xtradb- cluster/. You may find information
http://
in the manual which will help you identify the cause of the crash.
161002 13:13:28 mysqld_safe Number of processes running now: 0
161002 13:13:28 mysqld_safe WSREP: not restarting wsrep node automatically
161002 13:13:28 mysqld_safe mysqld from pid file /tmp/mysql.pid ended