segfault during shutdown
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
MySQL patches by Codership | New | Undecided | Unassigned |
5.5 | New | Undecided | Unassigned |
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC | Status tracked in 5.6 | | |
5.5 | Incomplete | Undecided | Unassigned |
5.6 | Incomplete | Undecided | Unassigned |
Bug Description
Server crashed during graceful shutdown under load:
120405 12:23:06 [Note] WSREP: New cluster view: global state: 49ee59de-
120405 12:23:06 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
120405 12:23:06 [Note] WSREP: applier thread exiting (code:0)
120405 12:23:06 [Note] WSREP: closing applier 3
120405 12:23:06 [Note] WSREP: recv_thread() joined.
120405 12:23:06 [Note] WSREP: Closing slave action queue.
120405 12:23:06 [Note] WSREP: closing connection 788
120405 12:23:06 [Note] WSREP: closing connection 781
120405 12:23:06 [Note] WSREP: Deadlock error for: (null)
120405 12:23:06 [Note] WSREP: BF aborted, thd: 781 is_AC: 0, retry: 0 - 1 SQL: (null)
09:23:06 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
key_buffer_
read_buffer_
max_used_
max_threads=1024
thread_count=9
connection_count=9
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
120405 12:23:06 [Note] WSREP: Before Lock_thread_count
120405 12:23:06 [Note] WSREP: applier thread exiting (code:5)
120405 12:23:06 [Note] WSREP: applier thread exiting (code:5)
120405 12:23:06 [Note] WSREP: applier thread exiting (code:5)
stack_bottom = 0 thread_stack 0x40000
/run/shm/
/run/shm/
/lib/x86_
/lib/x86_
/run/shm/
/run/shm/
/run/shm/
/run/shm/
/run/shm/
/lib/x86_
/lib/x86_
The manual page at http://
information that should help you find out what is causing the crash.
Writing a core file
Backtrace:
(gdb) bt
#0 __pthread_kill (threadid=
#1 0x0000000000698dc9 in handle_fatal_signal (sig=11)
at /home/teemu/
#2 <signal handler called>
#3 __pthread_
#4 __pthread_
#5 0x0000000000515541 in inline_
at /home/teemu/
#6 wsrep_close_thread (thd=0x7f7a7c78
at /home/teemu/
#7 0x0000000000516dfa in wsrep_close_
at /home/teemu/
#8 0x0000000000653803 in wsrep_stop_
at /home/teemu/
#9 0x0000000000518309 in kill_server (sig_ptr=0x0)
at /home/teemu/
#10 0x00000000005184ce in kill_server_thread (arg=<optimized out>)
at /home/teemu/
#11 0x00007f7a8f7ebefc in start_thread (arg=0x7f7a7169
#12 0x00007f7a8f52659d in clone () at ../sysdeps/
#13 0x0000000000000000 in ?? ()
(gdb) f 6
#6 wsrep_close_thread (thd=0x7f7a7c78
at /home/teemu/
4443 mysql_mutex_
(gdb) p thd->mysys_
$4 = (mysql_mutex_t * volatile) 0x0
no longer affects: codership-mysql/5.5
Changed in codership-mysql:
milestone: 5.5.33-23.7.6 → none
Seems to be still there as of r3840. Reproduced with heavy CPU-contending load (total 127 threads at the moment of crash).
Offending code:
if (thd->mysys_var->current_cond)
{
  mysql_mutex_lock(thd->mysys_var->current_mutex);
  mysql_cond_broadcast(thd->mysys_var->current_cond);
  mysql_mutex_unlock(thd->mysys_var->current_mutex);
}
By the time mysql_mutex_unlock() is called, both current_cond and current_mutex are 0x0.
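The pattern described here is a check-then-use race: the pointers are tested and then dereferenced with nothing preventing the waiting thread from clearing them in between. The following is a minimal, self-contained sketch of one way to close that window. It is not the actual patch and it uses no MySQL or wsrep code. `WaitState`, `guard`, `enter_wait`, `exit_wait` and `wake_if_waiting` are hypothetical names invented for illustration; the assumption is that a dedicated lock can protect the two pointers, so the waker reads them and broadcasts while holding that lock.

```cpp
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for the per-thread wait state (not the real MySQL struct).
struct WaitState {
    std::mutex guard;                                 // protects the two pointers below
    std::mutex* current_mutex = nullptr;              // mutex the thread waits on, if any
    std::condition_variable* current_cond = nullptr;  // condition it waits on, if any
};

// Waiter side: publish the wait objects under the guard.
// Call without holding m, so the lock order is always guard -> m.
void enter_wait(WaitState& ws, std::mutex& m, std::condition_variable& c) {
    std::lock_guard<std::mutex> hold(ws.guard);
    ws.current_mutex = &m;
    ws.current_cond = &c;
}

// Waiter side: clear the wait objects under the same guard.
// Call without holding the waited-on mutex, for the same lock-order reason.
void exit_wait(WaitState& ws) {
    std::lock_guard<std::mutex> hold(ws.guard);
    ws.current_mutex = nullptr;
    ws.current_cond = nullptr;
}

// Waker side: returns true if a broadcast was sent, false if nobody was waiting.
// The guard is held for the whole operation, so the pointers cannot be
// cleared between the null check and the final unlock.
bool wake_if_waiting(WaitState& ws) {
    std::lock_guard<std::mutex> hold(ws.guard);
    std::mutex* m = ws.current_mutex;
    std::condition_variable* c = ws.current_cond;
    if (m == nullptr || c == nullptr) {
        return false;
    }
    std::lock_guard<std::mutex> lk(*m);
    c->notify_all();
    return true;
}
```

The sketch checks both pointers rather than only the condition, and it never re-reads them after the check. Whether an equivalent guard is available at the crashing call site in wsrep_close_thread is not something this report establishes.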
Out of 8 wsrep slave threads, 1 is applying a writeset and the other 7 are waiting for commit.
Thread 2 (Thread 0x7fde783df700 (LWP 17210)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1 0x00007fde87b82fd5 in wait (cond=..., this=<optimized out>) at galerautils/src/gu_lock.hpp:56
#2 galera::Monitor<galera::ReplicatorSMM::CommitOrder>::enter (this=this@entry=0x295ed58, obj=...) at galera/src/monitor.hpp:126
#3 0x00007fde87b7c9d2 in galera::ReplicatorSMM::apply_trx (this=this@entry=0x295e2c0, recv_ctx=recv_ctx@entry=0x7fde40000990, trx=trx@entry=0x7fde40059990) at galera/src/replicator_smm.cpp:471
Thread 85 (Thread 0x7fde78524700 (LWP 17205)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1 0x00007fde87b82fd5 in wait (cond=..., this=<optimized out>) at galerautils/src/gu_lock.hpp:56
#2 galera::Monitor<galera::ReplicatorSMM::CommitOrder>::enter (this=this@entry=0x295ed58, obj=...) at galera/src/monitor.hpp:126
#3 0x00007fde87b7c9d2 in galera::ReplicatorSMM::apply_trx (this=this@entry=0x295e2c0, recv_ctx=recv_ctx@entry=0x7fde30000990, trx=trx@entry=0x7fde3002c790) at galera/src/replicator_smm.cpp:471
Thread 95 (Thread 0x7fde78461700 (LWP 17208)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1 0x00007fde87b82fd5 in wait (cond=..., this=<optimized out>) at galerautils/src/gu_lock.hpp:56
#2 galera::Monitor<galera::ReplicatorSMM::CommitOrder>::enter (this=this@entry=0x295ed58, obj=...) at galera/src/monitor.hpp:126
#3 0x00007fde87b7c9d2 in galera::ReplicatorSMM::apply_trx (this=this@entry=0x295e2c0, recv_ctx=recv_ctx@entry=0x7fde48000990, trx=trx@entry=0x7fde48224520) at galera/src/replicator_smm.cpp:471
Thread 97 (Thread 0x7fde784a2700 (LWP 17207)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1 0x00007fde87b82fd5 in wait (cond=..., this=<optimized out>) at galerautils/src/gu_lock.hpp:56
#2 galera::Monitor<galera::ReplicatorSMM::CommitOrder>::enter (this=this@entry=0x295ed58, obj=...) at galera/src/monitor.hpp:126
#3 0x00007fde87b7c9d2 in galera::ReplicatorSMM::apply_trx (this=this@entry=0x295e2c0, recv_ctx=recv_ctx@entry=0x7fde2c000990, trx=trx@entry=0x7fde2c3d85a0) at galera/src/replicator_smm.cpp:471
Thread 99 (Thread 0x7fde78565700 (LWP 17204)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1 0x00007fde87b82fd5 in wait (cond=..., this=<optimized out>) at galerautils/src/gu_lock.hpp:56
#2 galera::Monitor<galera::ReplicatorSMM::CommitOrder>::enter (this=this@entry=0x295ed58, obj=...) at galera/src/mon...