RBR Replication with concurrent XA in READ-COMMITTED takes supremum pseudo-records and breaks replication
Affects | Status | Importance | Assigned to | Milestone | ||
---|---|---|---|---|---|---|
Percona Server moved to https://jira.percona.com/projects/PS | Status tracked in 5.7 | |||||
5.5 |
New
|
High
|
Vlad Lesin | |||
5.6 |
New
|
High
|
Vlad Lesin | |||
5.7 |
In Progress
|
High
|
Vlad Lesin |
Bug Description
The problem that we are is facing is similar to what is described in https:/
This is what happens:
There are 2 XA Transactions that deadlock which causes replication to fail with the error:
```
Last_Error: Slave SQL thread retried transaction 10 time(s) in vain, giving up. Consider raising the value of the slave_transacti
```
There are several ongoing XA Transactions at the same time.
When we look at the locks that are being held, we see:
```
mysql> select * from INNODB_LOCKS;
+------
| lock_id | lock_trx_id | lock_mode | lock_type | lock_table | lock_index | lock_space | lock_page | lock_rec | lock_data |
+------
| 28024945201:
| 28024945046:
+------
2 rows in set, 1 warning (0.00 sec)
mysql> select * from INNODB_LOCK_WAITS;
+------
| requesting_trx_id | requested_lock_id | blocking_trx_id | blocking_lock_id |
+------
| 28024945201 | 28024945201:
+------
1 row in set, 1 warning (0.00 sec)
```
There is a S lock on a supremum pseudo-record in one XA transaction which 'conflicts' with the X lock another XA transaction wants to take.
How does this happen?
What happens on the original master is as follows...
Requirements:
- We have the following rows: ...,80000,
- row 80000 is on for example innodb page 99 and row 100000 is on page 100. There is still some space available to add a row to page 99.
- make sure the slave is using log_slave_updates & binlog_format=mixed
You can simulate it like this:
```
CREATE TABLE t1 (t1_pk DECIMAL(20,0) PRIMARY KEY , t1_blob BLOB) ENGINE=InnoDB;
INSERT INTO t1 VALUES (10000, REPEAT("a", 2165));
INSERT INTO t1 VALUES (20000, REPEAT("a", 2165));
INSERT INTO t1 VALUES (30000, REPEAT("a", 2165));
INSERT INTO t1 VALUES (40000, REPEAT("a", 2165));
INSERT INTO t1 VALUES (50000, REPEAT("a", 2165));
INSERT INTO t1 VALUES (60000, REPEAT("a", 2165));
INSERT INTO t1 VALUES (70000, REPEAT("a", 2165));
INSERT INTO t1 VALUES (80000, REPEAT("a", 2165));
INSERT INTO t1 VALUES (90000, REPEAT("a", 2165));
INSERT INTO t1 VALUES (100000, REPEAT("a", 2165));
INSERT INTO t1 VALUES (110000, REPEAT("a", 2165));
INSERT INTO t1 VALUES (120000, REPEAT("a", 2165));
INSERT INTO t1 VALUES (130000, REPEAT("a", 2165));
INSERT INTO t1 VALUES (140000, REPEAT("a", 2165));
INSERT INTO t1 VALUES (150000, REPEAT("a", 2165));
INSERT INTO t1 VALUES (160000, REPEAT("a", 2165));
INSERT INTO t1 VALUES (170000, REPEAT("a", 2165));
INSERT INTO t1 VALUES (180000, REPEAT("a", 2165));
INSERT INTO t1 VALUES (190000, REPEAT("a", 2165));
INSERT INTO t1 VALUES (200000, REPEAT("a", 2165));
INSERT INTO t1 VALUES (210000, REPEAT("a", 2165));
DELETE FROM t1 WHERE t1_pk IN (90000, 80000);
CREATE TABLE t2 (t2_pk INT PRIMARY KEY, t1_pk DECIMAL(20,0),
FOREIGN KEY (t1_pk) REFERENCES t1 (t1_pk)) ENGINE=InnoDB;
--connect(
XA START '2';
INSERT INTO t1 VALUES (85000, NULL);
#-- We are inserting a value between 80000,100000
--connection master
XA START '1';
INSERT INTO t2 VALUES (1, 100000);
#-- This causes an S lock on 100000
XA END '1';
XA PREPARE '1';
--connection con2
XA END '2';
XA PREPARE '2';
--connection con2
XA COMMIT '2';
--connection master
XA COMMIT '1';
```
Now depending on binlog_format and tx_isolation, you will get different results:
binlog_
binlog_
binlog_format=ROW , tx_isolation=
binlog_format=ROW , tx_isolation=READ COMMITTED : The events will be logged in RBR and replication will not break. You will have supremum pseudo-records S locks on the master, but not on the slave
All this is to be expected. (This basically means always use RBR with XA)
I have tried everything to be able to reproduce this issue the customer is facing and am not able to reproduce this yet
The customer is using binlog_format=MIXED with tx_isolation=READ COMMITTED, the events are RBR, so that test case above does not fail, but the workload of the customer which looks similar does fail.
You can see with verbose locks in `SHOW ENGINE INNODB STATUS` that there is an S lock on a row in page 938216, and then an supremum S lock on page 938215 :
RECORD LOCKS space id 8439 page no 938216 n bits 80 index PRIMARY of table `databse`.`tablezz` trx id 28024945046 lock mode S locks rec but not gap
Record lock, heap no 2 PHYSICAL RECORD: n_fields 45; compact format; info bits 0
0: len 13; hex 800000000000000
... REDACTED ...
RECORD LOCKS space id 8439 page no 938215 n bits 112 index PRIMARY of table `databse`.`tablezz` trx id 28024945046 lock mode S
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
0: len 8; hex 73757072656d756d; asc supremum;;
```
It must be that somehow replication is putting an S lock on a supremum record, even though the TRX is in READ-COMMITTED. I do not yet know when this happens.
summary: |
RBR Replication with concurrent XA in READ-COMMITTED takes supremum - pseudo-records + pseudo-records and breaks replication |
Changed in percona-server: | |
assignee: | nobody → Vlad Lesin (vlad-lesin) |
Changed in percona-server: | |
status: | New → In Progress |
importance: | Undecided → High |
This is not the complete answer, just a note MySQL documentation says:
"If you use READ COMMITTED, you must use row-based binary logging. "