Secondary node crashing under heavily concurrent transactional stress
Bug #861901 reported by Patrick Crews
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
MySQL patches by Codership | Fix Released | Critical | Seppo Jaakola |
5.1 | Fix Committed | Critical | Seppo Jaakola |
5.5 | Fix Released | Critical | Seppo Jaakola |
Bug Description
When issuing the following randgen workload against a 2-node setup:
./gentest.pl --gendata=
The second node crashes under this workload. The test is designed so that each thread produces 700 transactions of varying length and validity.
There is a modified randgen branch that was used for this test:
bzr branch lp:~patrick-crews/randgen/randgen_codership
Changed in codership-mysql:
assignee: nobody → Seppo Jaakola (seppo-jaakola)
importance: Undecided → Critical
milestone: none → 21.1
Changed in codership-mysql:
status: New → Confirmed
Here is the error log from the second node:
110928 18:39:06 [ERROR] Slave SQL: Could not execute Update_rows event on table test.DD; Can't find record in 'DD', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log FIRST, end_log_pos 981, Error_code: 1032
110928 18:39:06 [Warning] WSREP: RBR event 2 apply warning: 120, 126062
110928 18:39:06 [ERROR] WSREP: Failed to apply trx: source: 608bc31a-ea21-11e0-0800-35fa1bfd4d40 version: 1 local: 0 state: CERTIFYING flags: 1 conn_id: 105 trx_id: 1136961 seqnos (l: 9097, g: 126062, s: 126061, d: 126055, ts: 1317249230968256296)
110928 18:39:06 [ERROR] WSREP: Failed to apply app buffer: Π�N
, seqno: 126062, status: WSREP_FATAL
at galera/src/replicator_smm.cpp:apply_data():77
at galera/src/replicator_smm.cpp:apply_trx_ws():184
110928 18:39:06 [ERROR] WSREP: Node consistency compromized, aborting...
110928 18:39:06 [Note] WSREP: Closing send monitor...
110928 18:39:06 [Note] WSREP: Closed send monitor.
110928 18:39:06 [Note] WSREP: gcomm: terminating thread
110928 18:39:06 [Note] WSREP: gcomm: joining thread
110928 18:39:06 [Note] WSREP: gcomm: closing backend
110928 18:39:06 [Note] WSREP: evs::proto(31bd488e-ea22-11e0-0800-a819c51eec43, LEAVING, view_id(REG,31bd488e-ea22-11e0-0800-a819c51eec43,2)) uuid 608bc31a-ea21-11e0-0800-35fa1bfd4d40 missing from install message, assuming partitioned
110928 18:39:06 [Note] WSREP: GMCast::handle_stable_view: view(view_id(NON_PRIM,31bd488e-ea22-11e0-0800-a819c51eec43,2) memb {
31bd488e-ea22-11e0-0800-a819c51eec43,
} joined {
} left {
} partitioned {
608bc31a-ea21-11e0-0800-35fa1bfd4d40,
})
110928 18:39:06 [Note] WSREP: GMCast::handle_stable_view: view((empty))
110928 18:39:06 [Note] WSREP: New COMPONENT: primary = no, my_idx = 0, memb_num = 1
110928 18:39:06 [Note] WSREP: gcomm: closed
110928 18:39:06 [Note] WSREP: Flow-control interval: [8, 16]
110928 18:39:06 [Note] WSREP: Received NON-PRIMARY.
110928 18:39:06 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 126062)
110928 18:39:06 [Note] WSREP: Received self-leave message.
110928 18:39:06 [Note] WSREP: Flow-control interval: [0, 0]
110928 18:39:06 [Note] WSREP: Received SELF-LEAVE. Closing connection.
110928 18:39:06 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 126062)
110928 18:39:06 [Note] WSREP: RECV thread exiting 0: Success
110928 18:39:06 [Note] WSREP: recv_thread() joined.
110928 18:39:06 [Note] WSREP: Closing slave action queue.
110928 18:39:06 [Note] WSREP: bin/mysqld: Terminated.
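When triaging a log like the one above, it can help to pull out the first error and the node state transitions mechanically. The following is a minimal Python sketch, not part of the original report. It assumes each entry starts with a `YYMMDD HH:MM:SS [Level]` prefix as seen in this log, and the function names (`parse_log`, `first_error`, `state_transitions`) are hypothetical helpers introduced only for illustration. The sample text is a shortened excerpt of the log lines from this report.

```python
import re

# Assumed line shape, taken from the log in this report:
#   "110928 18:39:06 [ERROR] WSREP: Node consistency compromized, aborting..."
LOG_LINE = re.compile(
    r"^(?P<date>\d{6}) (?P<time>\d{2}:\d{2}:\d{2}) \[(?P<level>\w+)\] (?P<msg>.*)$"
)

# State changes are logged as "Shifting SYNCED -> OPEN (TO: 126062)".
SHIFT = re.compile(r"Shifting (\w+) -> (\w+) \(TO: (\d+)\)")


def parse_log(text):
    """Return (time, level, message) tuples for every timestamped line.

    Continuation lines without a timestamp prefix are skipped.
    """
    entries = []
    for line in text.splitlines():
        match = LOG_LINE.match(line)
        if match:
            entries.append(
                (match.group("time"), match.group("level"), match.group("msg"))
            )
    return entries


def first_error(entries):
    """Return the first entry logged at ERROR level, or None."""
    for entry in entries:
        if entry[1] == "ERROR":
            return entry
    return None


def state_transitions(entries):
    """Return (from_state, to_state, seqno) for each 'Shifting' message."""
    result = []
    for _, _, msg in entries:
        match = SHIFT.search(msg)
        if match:
            result.append((match.group(1), match.group(2), int(match.group(3))))
    return result


# Shortened excerpt of the second node's log from this report.
SAMPLE = """\
110928 18:39:06 [ERROR] Slave SQL: Could not execute Update_rows event on table test.DD; Can't find record in 'DD', Error_code: 1032
110928 18:39:06 [Warning] WSREP: RBR event 2 apply warning: 120, 126062
110928 18:39:06 [ERROR] WSREP: Node consistency compromized, aborting...
110928 18:39:06 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 126062)
110928 18:39:06 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 126062)
"""

if __name__ == "__main__":
    entries = parse_log(SAMPLE)
    print(first_error(entries))
    print(state_transitions(entries))
```

In this excerpt the first error is the failed `Update_rows` apply (error 1032), which precedes the consistency abort and the shift from SYNCED to CLOSED at seqno 126062.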