Lowest group communication layer (evs) fails to handle the situation properly when a large number of nodes suddenly start to see each other
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Galera | Status tracked in 3.x | | |
2.x | Fix Committed | Undecided | Unassigned |
3.x | Fix Committed | Undecided | Unassigned |
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC | Status tracked in 5.6 | | |
5.5 | Fix Released | Medium | Unassigned |
5.6 | Fix Released | Medium | Unassigned |
Bug Description
We have a 9-node cluster. Suddenly the nodes stopped seeing each other:
140122 9:57:38 [Note] WSREP: view(view_
} joined {
} left {
} partitioned {
})
Later the problem was resolved, but the nodes could not reconnect:
140122 9:58:38 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
140122 9:58:38 [Note] WSREP: Flow-control interval: [16, 16]
140122 9:58:38 [Note] WSREP: Received NON-PRIMARY.
140122 9:58:38 [Note] WSREP: New cluster view: global state: 840ae537-
140122 9:58:38 [Warning] WSREP: evs::proto(
140122 9:58:39 [Warning] WSREP: evs::proto(
140122 9:58:40 [Warning] WSREP: evs::proto(
140122 9:58:41 [Warning] WSREP: evs::proto(
140122 9:58:42 [Warning] WSREP: evs::proto(
Similar messages on all nodes.
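For anyone comparing logs across nodes, a minimal Python sketch for pulling the fields out of the `New COMPONENT` line shown above may help. It assumes the line format is exactly as in this excerpt and that `primary` and `bootstrap` are printed as `yes` or `no`; the names `parse_component` and `is_isolated` are hypothetical helpers for illustration, not part of Galera.

```python
import re

# Pattern for the "New COMPONENT" line as it appears in the excerpt above.
# Assumption: primary/bootstrap are printed as "yes" or "no".
COMPONENT_RE = re.compile(
    r"WSREP: New COMPONENT: primary = (?P<primary>yes|no), "
    r"bootstrap = (?P<bootstrap>yes|no), "
    r"my_idx = (?P<my_idx>-?\d+), memb_num = (?P<memb_num>\d+)"
)


def parse_component(line):
    """Return the component fields as a dict, or None if the line does not match."""
    match = COMPONENT_RE.search(line)
    if match is None:
        return None
    return {
        "primary": match.group("primary") == "yes",
        "bootstrap": match.group("bootstrap") == "yes",
        "my_idx": int(match.group("my_idx")),
        "memb_num": int(match.group("memb_num")),
    }


def is_isolated(component):
    """Hypothetical check: a non-primary component that contains only this node."""
    return not component["primary"] and component["memb_num"] == 1


# The line reported in this bug.
line = (
    "140122 9:58:38 [Note] WSREP: New COMPONENT: primary = no, "
    "bootstrap = no, my_idx = 0, memb_num = 1"
)
info = parse_component(line)
print(info, is_isolated(info))
```

Running this over each node's error log would show whether every node ended up alone in a non-primary component, as the excerpt suggests for this one.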
Miguel, what is the Galera version? Looks similar, at least in behavior, to https://bugs.launchpad.net/percona-xtradb-cluster/+bug/1269236