More migrations with constant load
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
The Ubuntu-power-systems project | Fix Released | High | Canonical Kernel Team |
linux (Ubuntu) | Fix Released | High | Joseph Salisbury |
Zesty | Fix Released | High | Joseph Salisbury |
Bug Description
== SRU Justification ==
There is a significantly higher number of task migrations when the load is
fixed but not balanced across cores.
Benchmark results are posted in the bug description and in the commit's git log.
This bug is resolved by mainline commit 05b40e057734811,
which is in mainline as of 4.12-rc1.
== Fix ==
commit 05b40e057734811
Author: Srikar Dronamraju <email address hidden>
Date: Wed Mar 22 23:27:50 2017 +0530
sched/fair: Prefer sibiling only if local group is under-utilized
== Regression Potential ==
Medium, since this commit touches the scheduler. However, the change is narrow: it only
allows a local group to pull a task if the source group has more
tasks than the local group.
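The condition described above can be pictured with a toy model. This is only an illustrative sketch, not the kernel code: the function name and parameters are hypothetical, and the "local group has spare capacity" check is inferred from the commit title ("only if local group is under-utilized").

```python
def local_group_may_pull(local_tasks: int, source_tasks: int,
                         local_has_capacity: bool) -> bool:
    """Toy model of the pull condition described in the SRU text.

    Hypothetical helper, not a kernel function. The local group may pull
    a task only when it is under-utilized AND the source group is running
    more tasks than the local group.
    """
    return local_has_capacity and source_tasks > local_tasks


if __name__ == "__main__":
    # Source group is busier and the local group has room: a pull is allowed.
    print(local_group_may_pull(local_tasks=2, source_tasks=3, local_has_capacity=True))
    # Both groups run the same number of tasks: no pull, so no migration.
    print(local_group_may_pull(local_tasks=3, source_tasks=3, local_has_capacity=True))
```

Under this model, two groups with equal task counts no longer trade tasks back and forth, which is the kind of repeated migration the bug describes.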
== Test Case ==
A test kernel was built with this patch and tested by the original bug reporter.
The bug reporter states the test kernel resolved the bug.
== Comment: #0 - PUVICHAKRAVARTHY RAMACHANDRAN - 2017-08-06 13:44:45 ==
---Problem Description---
Significantly higher number of task migrations when the load is fixed but not balanced across cores.
---uname output---
Linux isvbos3 4.10.0-29-generic #33~16.04.1-Ubuntu SMP Tue Jul 25 18:17:06 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux
---Additional Hardware Info---
Power9 dd2.0
Machine Type = Power9
---Steps to Reproduce---
Benchmark: multithreaded, CPU-intensive. The system had 2 sockets / 32 cores in SMT4 mode.
When 64 threads were run, there were few migrations over a 10 s interval.
When 80 threads were run, the number of migrations was very high.
Ideally, migrations should have been minimal in both cases, as the overall load was constant.
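A quick calculation shows why the 80-thread case is "fixed but not balanced". The sketch below assumes all 32 cores are online and that threads are spread as evenly as possible; the helper name is hypothetical.

```python
def spread_threads(threads: int, cores: int) -> dict[int, int]:
    """Return {threads_per_core: number_of_cores} for the most even spread.

    Hypothetical helper for illustration; it does not model the scheduler.
    """
    base, extra = divmod(threads, cores)
    if extra == 0:
        return {base: cores}
    # 'extra' cores carry one more thread than the remaining cores.
    return {base + 1: extra, base: cores - extra}


if __name__ == "__main__":
    print(spread_threads(64, 32))  # {2: 32}
    print(spread_threads(80, 32))  # {3: 16, 2: 16}
```

With 64 threads every core can run exactly 2 threads. With 80 threads, 16 cores must run 3 threads while the other 16 run 2, so some imbalance between cores always remains even though the total load never changes.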
== Comment: #3 - SRIKAR DRONAMRAJU - 2017-08-11 06:56:47 ==
As suspected (commit : 05b40e0577 : "sched/fair: Prefer sibiling only if local group is under-utilized")
https:/
should fix the problem
Ran ' perf stat -a -r 5 -e sched:sched_
to detect the problem and verify the fix
Here is perf stat without fix.
Performance counter stats for 'system wide' (5 runs):
7,758 sched:sched_
100.015658079 seconds time elapsed ( +- 0.00% )
perf stat with fix.
Performance counter stats for 'system wide' (5 runs):
415 sched:sched_
100.016021787 seconds time elapsed ( +- 0.00% )
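To put the two perf stat results side by side, the event counts reported above can be compared directly. This is a small arithmetic sketch using only the two counts from the output; the variable names are illustrative.

```python
# Event counts as reported by perf stat above (5 runs each, about 100 s per run).
without_fix = 7758
with_fix = 415

reduction = 1 - with_fix / without_fix   # fraction of events eliminated
factor = without_fix / with_fix          # how many times fewer events

print(f"reduction: {reduction:.1%}")     # reduction: 94.7%
print(f"factor: {factor:.1f}x")          # factor: 18.7x
```

By these numbers, the patched kernel recorded roughly 1/19th of the events seen on the unpatched kernel over the same interval.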
git describe on upstream kernel says v4.11-rc2
# git describe 05b40e0577
v4.11-rc2-
== Comment: #4 - SRIKAR DRONAMRAJU - 2017-08-11 07:05:37 ==
Attaching the patch that needs to be applied to fix this bug.
Verified that patch fixes the problem.
Changed in ubuntu-power-systems: | |
importance: | Undecided → High |
assignee: | nobody → Canonical Kernel Team (canonical-kernel-team) |
Changed in linux (Ubuntu): | |
importance: | Undecided → High |
status: | New → Triaged |
tags: | added: kernel-da-key |
Changed in ubuntu-power-systems: | |
status: | New → Triaged |
Changed in ubuntu-power-systems: | |
status: | Triaged → In Progress |
tags: | added: triage-g |
description: | updated |
Changed in linux (Ubuntu Zesty): | |
status: | In Progress → Fix Committed |
Changed in ubuntu-power-systems: | |
status: | In Progress → Fix Committed |
Changed in ubuntu-power-systems: | |
status: | Fix Committed → Fix Released |
Changed in linux (Ubuntu): | |
status: | In Progress → Fix Released |
tags: | added: cscc |