Crash after SST donation
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC | Fix Released | Undecided | Unassigned |
Bug Description
Hi,
I've encountered a strange issue today which I believe is a bug.
We are running version 5.6.30-76.3-56-log Release rel76.3, Revision aa929cb, WSREP version 25.16.
A node crashed today right after it finished serving as an SST donor.
That is, node 03 was donating to node 01; the transfer finished successfully and then node 03 crashed with a strange error. Here are the logs:
2016-08-10 05:18:16 15533 [Note] WSREP: Running: 'wsrep_
2016-08-10 05:18:16 15533 [Note] WSREP: sst_donor_thread signaled with 0
WSREP_SST: [INFO] Streaming with xbstream (20160810 05:18:18.470)
WSREP_SST: [INFO] Using socat as streamer (20160810 05:18:18.474)
WSREP_SST: [INFO] Using /tmp/tmp.qJd6p81xmJ as innobackupex temporary directory (20160810 05:18:18.568)
WSREP_SST: [INFO] Streaming GTID file before SST (20160810 05:18:18.578)
WSREP_SST: [INFO] Evaluating xbstream -c ${INFO_FILE} | socat -u stdio TCP:192.
2016-08-10 05:18:18 15533 [Note] WSREP: (35e1aeed, 'ssl://
WSREP_SST: [INFO] Sleeping before data transfer for SST (20160810 05:18:18.682)
WSREP_SST: [INFO] Streaming the backup to joiner at 192.168.1.24 4444 (20160810 05:18:28.690)
WSREP_SST: [INFO] Evaluating innobackupex --defaults-
2016-08-10 06:20:25 15533 [Note] WSREP: Provider paused at 83e07a6f-
2016-08-10 06:25:33 15533 [Note] WSREP: resuming provider at 1081311812
2016-08-10 06:25:33 15533 [Note] WSREP: Provider resumed.
2016-08-10 06:25:33 15533 [Note] WSREP: 0.0 (dbm03): State transfer to 1.0 (dbm01) complete.
2016-08-10 06:25:33 15533 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 36537262238)
WSREP_SST: [INFO] Total time on donor: 0 seconds (20160810 06:25:35.361)
2016-08-10 06:25:35 15533 [Note] WSREP: 0.0 (dbm03): State transfer to 1.0 (dbm01) complete.
WSREP_SST: [INFO] Cleaning up temporary directories (20160810 06:25:35.411)
2016-08-10 06:25:59 15533 [ERROR] WSREP: FSM: no such a transition JOINED -> JOINED
10:25:59 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona XtraDB Cluster better by reporting any
bugs at https:/
key_buffer_
read_buffer_
max_used_
max_threads=1002
thread_count=11
connection_count=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x7fe070000990
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7ffb3c2f59d8 thread_stack 0x40000
/usr/sbin/
/usr/sbin/
/lib64/
/lib64/
/lib64/
/usr/lib64/
/usr/lib64/
/usr/lib64/
/usr/lib64/
/usr/lib64/
/usr/lib64/
/usr/sbin/
/usr/sbin/
/lib64/
/lib64/
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
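The fatal line in the log above is "FSM: no such a transition JOINED -> JOINED", which came after the "State transfer ... complete" message had been logged twice. As a rough illustration of how a state machine with a fixed transition table rejects a repeated shift into the same state, here is a toy Python model. The state names come from the log; the `ALLOWED` table, the `NodeFsm` class and the `NoSuchTransition` exception are hypothetical and are not Galera's actual implementation or its full transition set.

```python
class NoSuchTransition(Exception):
    """Raised when a shift is requested that the table does not list."""


# Hypothetical, simplified transition table for a donor node.
# Only the shifts visible in the log are modelled here.
ALLOWED = {
    ("SYNCED", "DONOR/DESYNCED"),
    ("DONOR/DESYNCED", "JOINED"),
    ("JOINED", "SYNCED"),
}


class NodeFsm:
    def __init__(self, state="SYNCED"):
        self.state = state

    def shift_to(self, new_state):
        # A shift that is not in the table is treated as fatal.
        if (self.state, new_state) not in ALLOWED:
            raise NoSuchTransition(
                f"FSM: no such a transition {self.state} -> {new_state}"
            )
        self.state = new_state


if __name__ == "__main__":
    fsm = NodeFsm()
    fsm.shift_to("DONOR/DESYNCED")  # donation starts
    fsm.shift_to("JOINED")          # first "state transfer complete"
    try:
        fsm.shift_to("JOINED")      # a second completion event
    except NoSuchTransition as err:
        print(err)
```

In this sketch the second shift to JOINED raises an exception; in the report the server aborted with signal 6 instead. Whether a duplicated completion event is what actually happened on node 03 is only one possible reading of the log.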
Changed in percona-xtradb-cluster:
status: New → Fix Committed
Changed in percona-xtradb-cluster:
milestone: none → 5.6.32-25.17
status: Fix Committed → Fix Released
This has just occurred again. It seems the donating node stays stuck in the 'Joined' state after the donation has already completed.
I.e.: wsrep_local_state_comment | Joined
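For anyone who wants to watch for this condition, a small sketch of how the status row above could be checked from a script. It only parses text in the `name | value` shape shown in this comment (or the bordered table the mysql client prints); the helper names `parse_status_rows` and `is_synced` are made up for illustration, and fetching the status from the server is left out.

```python
def parse_status_rows(text):
    """Parse 'name | value' rows into a dict, skipping borders and headers."""
    status = {}
    for line in text.splitlines():
        # Drop outer table bars, then split the remaining cells.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) == 2 and cells[0] and cells[0] != "Variable_name":
            status[cells[0]] = cells[1]
    return status


def is_synced(status):
    """True only when the node reports the Synced state."""
    return status.get("wsrep_local_state_comment", "").lower() == "synced"


if __name__ == "__main__":
    sample = "wsrep_local_state_comment | Joined"
    rows = parse_status_rows(sample)
    print(rows, is_synced(rows))
```

A donor that still reports Joined well after the "State transfer ... complete" message would match the situation described here.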