I was able to reproduce this problem while troubleshooting https://mariadb.atlassian.net/browse/MDEV-4624
The full scenario is as follows:
1. use two nodes connected with MySQL native replication,
* configure InnoDB to use row format COMPACT
* configure both nodes to use binlog_format=ROW
2. use at least 3 tables having cyclic ON DELETE CASCADE foreign key constraints
parent -> child1
parent -> child2
child1 -> child2
3. run a test load on the MySQL master to populate the tables with sufficient rows, and
then start deleting rows from all tables through several conflicting sessions
4. in the slave node issue sequences of:
STOP SLAVE
ALTER TABLE child2 ROW_FORMAT=COMPRESSED
START SLAVE
STOP SLAVE
ALTER TABLE child2 ROW_FORMAT=COMPACT
START SLAVE
...
=> the slave node should crash occasionally. I was able to reproduce this with a 5.5.29-based build, but no longer with the 5.5.31 RC build.
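For reference, the table layout from step 2 could look roughly like the following. This is only a sketch: the table names and the cascade relationships come from the scenario, while every column, index and constraint name is made up for illustration, and the real test schema may differ.

```sql
-- Hypothetical schema: only the table names and the FK topology
-- (parent -> child1, parent -> child2, child1 -> child2) are from the report.
CREATE TABLE parent (
  id INT NOT NULL PRIMARY KEY
) ENGINE=InnoDB ROW_FORMAT=COMPACT;

CREATE TABLE child1 (
  id INT NOT NULL PRIMARY KEY,
  parent_id INT NOT NULL,
  KEY idx_child1_parent (parent_id),
  CONSTRAINT fk_child1_parent FOREIGN KEY (parent_id)
    REFERENCES parent (id) ON DELETE CASCADE
) ENGINE=InnoDB ROW_FORMAT=COMPACT;

CREATE TABLE child2 (
  id INT NOT NULL PRIMARY KEY,
  parent_id INT NOT NULL,
  child1_id INT NOT NULL,
  KEY idx_child2_parent (parent_id),
  KEY idx_child2_child1 (child1_id),
  CONSTRAINT fk_child2_parent FOREIGN KEY (parent_id)
    REFERENCES parent (id) ON DELETE CASCADE,
  CONSTRAINT fk_child2_child1 FOREIGN KEY (child1_id)
    REFERENCES child1 (id) ON DELETE CASCADE
) ENGINE=InnoDB ROW_FORMAT=COMPACT;

-- Deleting a parent row reaches child2 along two cascade paths:
-- directly, and through child1.
DELETE FROM parent WHERE id = 1;
```

Note that on 5.5 the ROW_FORMAT=COMPRESSED step on the slave only takes effect with innodb_file_per_table enabled and innodb_file_format=Barracuda. Otherwise InnoDB issues a warning and falls back to a non-compressed row format.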
The problem seems to relate to the lifetime of the clustered record. As a cascaded delete progresses, the state of the target record gets altered, and wsrep key population from the record can happen too late. I will merge the fix from the 5.5.29 tree because 5.5.31 may also be vulnerable, although my testing did not surface this.