main_connection keep an useless transaction opened and permit infinite metadata table lock
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
MySQL Data Dumper |
Fix Released
|
High
|
Max Bubenick |
Bug Description
This bug is opened after a long issue solving process with SkySQL (https:/
The patch is simple (2 lines added, 1 removed).
Context :
========
mydumper.c create a main mysql connection used to scan databases, and tables in order to fillup threads job queue. For that, this main connection does a START TRANSACTION WITH CONSISTENT SNAPSHOST.
When the main process has finished, he wait for the UNLOCK TABLES to be needed (after backing up the non innodb tables).
After that, he wait for the thread to finished. This step can take some hours.
And finally, the main connection is closed.
Bug :
====
if the replication slave process on the sql server, try a truncate/alter table after that the main connection is created, but before each threads has begun to backup this table, so the replication process enter a "Waiting for metadata table lock", and, when the backups thread begin to backup this table, they also enter in "Waiting for metadata table lock".
We can manually solve that by killing the mysql connection now useless (the one created with create_
Solution :
========
As minimal solution, COMMIT just after waiting (if needed) the UNLOCK TABLES;
But, as the main mysql connection is useless after the UNLOCK TABLES and the COMMIT, i decide to close it here, and not after the potential hours of backups.
Thanks for your time, and ready to explain anything needed. (Sorry for my bad english).
Related branches
Changed in mydumper: | |
status: | Triaged → Fix Committed |
Changed in mydumper: | |
milestone: | none → 0.6.1 |
Changed in mydumper: | |
status: | Fix Committed → Fix Released |
You are correct about the main connection keeping a not needed transaction open after releasing FTWRL. We can commit or close it at that point. But this is not the only issue about metadata locking.
The main thread is holding locks only when using row chunks because it needs to access each table to calculate them. In any other case will not keep any.
I will apply your patch as it helps to reduce it, but dumping threads is another thing to look at.
The dumping threads will hold metadata locks progressively while are dumping data.
The workaround for that is using SAVEPOINTs within the dumping transactions, but with multiple threads and multiples chunks being dumped in parallel for the same table is not easy to coordinate, we can end up with an inconsistent backup when a TRUNCATE or DROP is executed.