Inconsistency and connection deadlocks with cross-node record updates
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Percona XtraDB Cluster (moved to https://jira.percona.com/projects/PXC) | Status tracked in 5.6 | | |
5.6 | Confirmed | Undecided | Kenn Takara |
5.7 | Fix Released | Undecided | Kenn Takara |
Bug Description
My sequence of events is essentially identical to the one in this blog post (a transaction with SELECT FOR UPDATE, math performed on a record, then the record updated and committed). Since the post comes directly from Codership, I am assuming this pattern is intended to be supported:
http://
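For clarity, a minimal sketch of that pattern (the table and column names here are hypothetical, not taken from the attached test case):

```sql
-- Hypothetical schema: accounts(id INT PRIMARY KEY, balance INT)
START TRANSACTION;

-- Lock the row on this node; concurrent writers are expected
-- to block or fail with a deadlock error rather than proceed
SELECT balance FROM accounts WHERE id = 1 FOR UPDATE;

-- The application computes the new value from the row just read,
-- then writes it back and commits
UPDATE accounts SET balance = balance - 10 WHERE id = 1;
COMMIT;
```

When two such transactions run against different cluster nodes concurrently, the expectation is that one of them is rolled back, not that both commit with one update silently lost.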
However, what is actually being experienced is:
1) Data inconsistency: dirty reads occur, so the calculation for the update is wrong, and the conflicting updates from different nodes do not trigger deadlock errors, so we end up with inconsistent data due to these lost updates.
2) Connection lockup: the only way to unblock the client is to restart the DB node(s) holding the locked connections. "SHOW PROCESSLIST;" shows all connections from the application in the Sleep state, yet they did NOT receive responses.
The server version being used is Percona-
I have attached a test case that reproduces this issue consistently. This same test case works fine if pointing to only a single DB node in the cluster.
Config settings:
/etc/my.cnf:
[mysqld]
datadir = /var/lib/mysql
# move tmpdir due to /tmp being a memory backed tmpfs filesystem, mysql uses this for on disk sorting
tmpdir = /var/lib/mysql/tmp
[mysqld_safe]
pid-file = /run/mysqld/
syslog
!includedir /etc/my.cnf.d
/etc/my.
[mysqld]
bind-address = 0.0.0.0
key_buffer = 256M
max_allowed_packet = 16M
max_connections = 256
# Some optimizations
thread_concurrency = 10
sort_buffer_size = 2M
query_cache_limit = 100M
query_cache_size = 256M
log_bin
binlog_format = ROW
gtid_mode = ON
log_slave_updates
enforce_
group_concat_
innodb_
innodb_
innodb_
innodb_file_format = barracuda
default_
# SSD Tuning
innodb_
innodb_io_capacity = 6000
/etc/my.
# Galera cluster
[mysqld]
wsrep_provider = /usr/lib64/
wsrep_sst_method = xtrabackup-v2
wsrep_sst_auth = "sstuser:
wsrep_cluster_name = cluster
wsrep_slave_threads = 32
wsrep_max_ws_size = 2G
wsrep_provider_
wsrep_cluster_
wsrep_sync_wait = 0
innodb_
innodb_
innodb_
sync_binlog = 0
innodb_support_xa = 0
innodb_flush_method = ALL_O_DIRECT
[sst]
progress = 1
time = 1
streamfmt = xbstream
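One setting above worth noting: with wsrep_sync_wait = 0, Galera performs no causality checks on reads, so a SELECT on one node may not yet see a write that was committed on another node but is still waiting to be applied locally. A sketch of enabling the check for reads (my assumption as a thing to try, not a confirmed fix for this bug):

```ini
[mysqld]
# wsrep_sync_wait is a bitmask; bit 1 enables cluster-wide causality
# checks for READ statements, making each read wait until earlier
# cluster writes have been applied on the local node.
wsrep_sync_wait = 1
```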
description: updated
Changed in percona-xtradb-cluster:
assignee: nobody → Kenn Takara (kenn-takara)
bump.
This seems like a fairly serious issue to me, and I provided a test case that reproduces it. I'm surprised this hasn't had any movement.