No action is taken when wsrep recv thread returns with a fatal error
Bug #428663 reported by Alex Yurchenko
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
MySQL patches by Codership | Fix Released | High | Seppo Jaakola |
Trunk | Fix Released | High | Seppo Jaakola |
Bug Description
This is generally a fatal condition: it indicates loss of connectivity to the cluster, and even with good connectivity it means the node can no longer certify and apply slave writesets.
However, mysqld currently stays fully operational: it accepts connections and transactions, which then fail with errors at commit time. This is confusing both for the user and for any automated connection balancer, since everything appears to work until COMMIT is issued.
Proposal: mysqld should shut down right away.
This was fixed by calling kill_mysql() in wsrep_replication_process() if wsrep->recv() returns with an error.
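For readers unfamiliar with the code path, the control flow of the fix can be sketched roughly as below. This is not the actual patch. The status enum, the `kill_mysql_stub()` stand-in, and the `replication_process_sketch()` signature are all hypothetical names chosen for illustration. Only the idea comes from the fix: the receiver thread requests a server shutdown when recv() returns with an error.

```cpp
#include <atomic>
#include <cassert>
#include <functional>

// Hypothetical status codes; the real wsrep API defines its own set.
enum class RecvStatus { Ok, ConnFail, Fatal };

// Stand-in for the server-wide shutdown request. In mysqld this role is
// played by kill_mysql(); here it only records that shutdown was requested.
static std::atomic<bool> shutdown_requested{false};

void kill_mysql_stub() { shutdown_requested.store(true); }

// Sketch of the receiver thread body: block in recv() until it returns,
// then request shutdown if it returned with an error.
RecvStatus replication_process_sketch(const std::function<RecvStatus()>& recv) {
    assert(recv && "a recv callback is required");
    const RecvStatus rc = recv();  // blocks for the lifetime of the connection
    if (rc != RecvStatus::Ok) {
        // Before the fix nothing happened at this point, so the node kept
        // accepting clients whose transactions then failed at COMMIT.
        kill_mysql_stub();
    }
    return rc;
}
```

Calling `replication_process_sketch()` with a callback that returns `RecvStatus::ConnFail` sets `shutdown_requested`, while a callback that returns `RecvStatus::Ok` leaves it untouched.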
The fix was tested on a 3-node cluster in which one node was disconnected from vsbes by taking its NIC down (ifdown eth1). A constant sqlgen load was running against the cluster during the test.
The node ran in debug mode, and the following log messages show how the disconnect was detected and led to shutdown:
091004 12:31:42 [Note] DEBUG: mm_galera.c:mm_galera_recv():1265: worker: 0 with seqno: (-1 - 32514) type: GCS_ACT_COMMIT_CUT recvd
:recv_nointr(): Return 113 (No route to host) in header recv
091004 12:47:17 [ERROR] vs_remote_backend.cpp:handle_up():27: VSRBackend::handle_up(): Transport failed
091004 12:47:17 [ERROR] gcs_vs.cpp:conn_run():359: poll error: 'broken backend connection', thread exiting
091004 12:47:17 [Note] DEBUG: gcs_core.c:core_msg_recv():407: returning -107: Transport endpoint is not connected
091004 12:47:17 [Note] DEBUG: gcs.c:gcs_recv_thread():471: gcs_core_recv returned -107: Transport endpoint is not connected
091004 12:47:17 [Note] gcs.c:gcs_recv_thread():573: RECV thread exiting -107: Transport endpoint is not connected
091004 12:47:17 [ERROR] mm_galera.c:mm_galera_recv():1257: gcs_recv() returned 0 (Success)
091004 12:47:17 [ERROR] wsrep recv thread exiting with status: 5
091004 12:47:17 [ERROR] starting shutdown
091004 12:47:17 [Note] Got signal 15 to shutdown mysqld
091004 12:47:17 [Note] /home/galera/mysql-5.1.38-2894/mysql/libexec/mysqld: Normal shutdown
091004 12:47:17 [Note] Before Lock_thread_count
091004 12:47:17 [Note] After lock_thread_count
091004 12:47:17 [Warning] WSREP rollback thread wakes for signal
091004 12:47:17 [Note] Event Scheduler: Purging the queue. 0 events
091004 12:47:17 [Warning] WSREP rollback thread has empty abort queue
091004 12:47:17 [Note] WSREP: rollbacker thread exiting
091004 12:47:17 [Note] wsrep closing connection to cluster
091004 12:47:17 [Note] DEBUG: vs_remote_backend.cpp:leave():150: VSRBackend::leave(): (3,0,1)
091004 12:47:17 [Note] DEBUG: gcs_vs.cpp:gcs_vs_destroy():412: received: 32782, copied: 2020
091004 12:47:17 [Note] DEBUG: gcs_vs.cpp:gcs_vs_destroy():417: gcs_vs_close(): return 0
091004 12:47:17 [Note] DEBUG: gcs.c:gcs_close():655: recv_thread() joined.
091004 12:47:17 [ERROR] mm_galera.c:mm_galera_pre_commit():1680: gcs failed for: 176029, len: 1592, rcode: -4
091004 12:47:17 [ERROR] mm_galera.c:mm_galera_pre_commit():1680: gcs failed for: 176060, len: 1212, rcode: -4
091004 12:47:17 [ERROR] mm_galera.c:mm_galera_pre_commit():1680: gcs failed for: 176065, len: 448, rcode: -4
091004 12:47:17 [ERROR] WSREP connection failure
091004 12:47:17 [Note] gcs.c:gcs_close():682: Closing slave action queue.
091004 12:47:17 [ERROR] WSREP connection failure
091004 12:47:17 [ERROR] WSREP connection failure
091004 12:47:17 [Warning] MySQL is closing a connection that has an active InnoDB transaction. 1 row modifications will roll back.
091004 12:47:17 [ERROR] WSREP connection failure
091004 12:47:17 [Note] mm_galera.c:mm_galera_disconnect():405: Closed GCS connection
091004 12:47:17 [Warning] MySQL is closing a connection that has an active InnoDB transaction. 4 ro...