Performance tuning: a switch to disable InnoDB deadlock detection
Note: Percona Server moved to https://jira.percona.com/projects/PS (status tracked in 5.7).

| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| 5.1 | Won't Fix | Wishlist | Unassigned | |
| 5.5 | Triaged | Wishlist | Unassigned | |
| 5.6 | Triaged | Wishlist | Unassigned | |
| 5.7 | Fix Released | Wishlist | Unassigned | |
Bug Description
Regarding InnoDB's deadlock detection mechanism, it has long been debated whether recursive deadlock checking is needed for certain special scenarios, such as many concurrent updates to the same record.
On Planet MySQL it is recommended:
“InnoDB is much faster when deadlock detection is disabled for workloads with
a lot of concurrency and contention.”
We hit exactly this scenario in one of Taobao's core applications, Item Center (IC).
Most of the time it is fine, but during special sales promotions (about once per month)
performance degrades very badly, because a large number of Taobao users participate.
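To see why detection cost explodes under contention, consider a deliberately simplified model (not InnoDB's actual implementation): every transaction waits on the same hot record lock, and each new arrival scans the existing wait queue, as a naive deadlock check would, before enqueueing itself. The work per arrival grows with the number of existing waiters, so the total work grows roughly quadratically with concurrency:

```python
# Simplified wait-for-graph model: n transactions all contend for one hot
# record lock; each arriving waiter scans every transaction already waiting
# (a naive deadlock check) before enqueueing itself.
def detection_work(n_waiters: int) -> int:
    """Total edges examined across all arrivals (sum of 0..n-1)."""
    work = 0
    queue = []
    for txn in range(n_waiters):
        work += len(queue)   # scan the whole existing wait queue
        queue.append(txn)
    return work

# Work grows ~quadratically with concurrency on the hot row.
print(detection_work(16))   # 120 edges examined
print(detection_work(700))  # 244650 edges examined
```

This rough model is consistent with the profile below, where detection work dwarfs everything else once hundreds of sessions pile up on one record.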
Here is the oprofile result (simulating the online scenario):

```
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000
samples  %        symbol name
2008672  84.8036  lock_deadlock_
  91364   3.8573  lock_has_to_wait
  11216   0.4735  safe_mutex_lock
   9719   0.4103  ut_delay
   8047   0.3397  MYSQLparse(void*)
   7938   0.3351  lock_rec_
   7788   0.3288  code_state
   7601   0.3209  my_strnncoll_binary
   6703   0.2830  dict_col_
   6598   0.2786  _db_enter_
   6451   0.2724  _db_return_
   5733   0.2420  _db_doprnt_
   5503   0.2323  rec_get_
   5325   0.2248  ha_innobase:
   5241   0.2213  mutex_spin_wait
   4931   0.2082  build_template(
   4655   0.1965  lock_rec_
```
As you can see, lock_deadlock_ dominates the profile, so we propose a patch that allows
disabling deadlock detection dynamically. For the IC application there are almost no
deadlocks, since the business SQL logic has been tuned, so there appears to be little risk;
any residual deadlock would eventually be broken by innodb_lock_wait_timeout.
To make the scenario reproducible, a test case built from the data we hit (tweaked to mask sensitive columns) is provided below, together with the related patch. Please help review it.
Changed in percona-server: | |
assignee: | nobody → yinfeng (yinfeng-zwx) |
assignee: | yinfeng (yinfeng-zwx) → nobody |
Changed in percona-server: | |
importance: | Undecided → Medium |
importance: | Medium → Wishlist |
tags: | added: contribution |
Changed in percona-server: | |
status: | New → Incomplete |
Changed in percona-server: | |
status: | Expired → New |
tags: | added: xtradb |
tags: | added: performance |
Steps to run:
1. Unzip the deadlock.7zip archive.
2. Start the test with "sh run.sh" and collect the perf data.
3. Apply the patch, turn the switch off, and collect the result again:
```
root@(none) 03:40:21> set global innodb_deadlock_detect = off;
Query OK, 0 rows affected (0.00 sec)

root@(none) 03:40:33> show variables like '%detect%';
+------------------------+-------+
| Variable_name          | Value |
+------------------------+-------+
| innodb_deadlock_detect | OFF   |
+------------------------+-------+
1 row in set (0.00 sec)
```
The test results are summarized as (before vs. after disabling deadlock detection):

- 1,000,000 queries, concurrency = 16: 2124 s vs. 1971 s
- 1,000,000 queries, concurrency = 700: 33569 s vs. 2612 s

As we can see, the patched switch greatly improves performance in the high-concurrency scenario.
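For reference, the reported timings work out to only a modest gain at low concurrency but roughly a 12.9x speedup at concurrency 700 (a quick check using the numbers above):

```python
# Speedup implied by the reported timings (detection on vs. off).
low = 2124 / 1971      # concurrency 16
high = 33569 / 2612    # concurrency 700
print(f"concurrency 16:  {low:.2f}x")   # ~1.08x
print(f"concurrency 700: {high:.2f}x")  # ~12.85x
```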