Comment 36 for bug 1097213

Revision history for this message
Basil Eljuse (basil-eljuse) wrote :

Hi Naresh,

Thanks for the clarification. I can now reproduce this bug on the non-PSCI firmware as well.

I ran the two tests from your cpuhotplug suite for 50 iterations each, as below:

./testrunner --run releasetest --n 50 --testcase cpu_hotplug_latency_with_load --suite cpuhotplug_suite --verbose
./testrunner --run releasetest --n 50 --testcase cpu_hotplug_latency_without_load --suite cpuhotplug_suite --verbose

The test crashed with the following backtrace. I wonder whether this has something to do with the GIC IRQ handler!

[26142.733262] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 3.10.19-55771-g20a9594 #1
 [26142.754821] [<c0011c8d>] (unwind_backtrace+0x1/0x9c) from [<c000fc01>] (show_stack+0x11/0x14)
 [26142.779962] [<c000fc01>] (show_stack+0x11/0x14) from [<c0011047>] (handle_IPI+0xe3/0x1b8)
 [26142.804078] [<c0011047>] (handle_IPI+0xe3/0x1b8) from [<c00084cf>] (gic_handle_irq+0x4b/0x50)
 [26142.829220] [<c00084cf>] (gic_handle_irq+0x4b/0x50) from [<c000c95b>] (__irq_svc+0x3b/0x5c)
 [26142.853841] Exception stack(0xef0cbf18 to 0xef0cbf60)
 [26142.868725] bf00: 00000000 00000018
 [26142.892837] bf20: 278edb1a 00000000 c16ba460 c16ba464 00000001 c0606f00 c06761e8 c0614ac8
 [26142.916949] bf40: 00000004 ef0ca000 5290e099 ef0cbf60 c02e7491 c02d88fa 60000173 ffffffff
 [26142.941071] [<c000c95b>] (__irq_svc+0x3b/0x5c) from [<c02d88fa>] (bl_enter_powerdown+0x4e/0x94)
 [26142.966731] [<c02d88fa>] (bl_enter_powerdown+0x4e/0x94) from [<c02d75c3>] (cpuidle_enter_state+0x2b/0xa8)
 [26142.994953] [<c02d75c3>] (cpuidle_enter_state+0x2b/0xa8) from [<c02d76b9>] (cpuidle_idle_call+0x79/0x140)
 [26143.023173] [<c02d76b9>] (cpuidle_idle_call+0x79/0x140) from [<c000d9d1>] (arch_cpu_idle+0xd/0x28)
 [26143.049605] [<c000d9d1>] (arch_cpu_idle+0xd/0x28) from [<c00502c5>] (cpu_startup_entry+0x5d/0x164)
 [26143.076031] [<c00502c5>] (cpu_startup_entry+0x5d/0x164) from [<800081b5>] (0x800081b5)
 [26143.099372] IPI backtrace for cpu 0
 [26143.109637]
 [26143.114008] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.19-55771-g20a9594 #1
 [26143.135547] task: c05e0388 ti: c05d4000 task.ti: c05d4000
 [26143.151455] PC is at bl_enter_powerdown+0x4e/0x94
 [26143.165308] LR is at arch_counter_read+0x15/0x18
 [26143.178904] pc : [<c02d88fa>] lr : [<c02e7491>] psr: 60000173
 [26143.178904] sp : c05d5f30 ip : 5290e099 fp : c05d4000
 [26143.212742] r10: 00000000 r9 : c0614ac8 r8 : c06761e8
 [26143.228129] r7 : c0606f00 r6 : 00000001 r5 : c1696464 r4 : c1696460
 [26143.247360] r3 : 00000000 r2 : 278eecae r1 : 00000018 r0 : 00000000
 [26143.266591] Flags: nZCv IRQs on FIQs off Mode SVC_32 ISA Thumb Segment kernel
 [26143.288897] Control: 50c5387d Table: ad80006a DAC: 00000015

I have now kicked off more runs with both PSCI and non-PSCI firmware to see how reproducible this is across the firmware variants.

Please note that I normally also run the pm-qa tests, whose cpu_hotplug_08 script does random hotplug of CPUs over 100 iterations. The non-functional tests also include cpuhotplug tests that unplug known CPUs from either of the domains. One difference I noticed is that your cpuhotplug suite also has tracing enabled.
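For illustration, here is a minimal sketch of what such a random-hotplug loop looks like via the sysfs `online` files. This is not the actual pm-qa cpu_hotplug_08 script; the function and variable names are mine, and it defaults to a dry run that only prints the intended writes (set DRY_RUN=0 and run as root on real hardware to actually hotplug):

```shell
#!/bin/bash
# Hypothetical stand-in for a random CPU hotplug stress loop.
# CPU 0 is skipped, since it often cannot be taken offline.
random_hotplug() {
    local iterations=${1:-100}
    local dry_run=${DRY_RUN:-1}                       # 1 = only print the writes
    local cpus=(/sys/devices/system/cpu/cpu[1-9]*)    # non-boot CPUs
    local i cpu state
    for ((i = 0; i < iterations; i++)); do
        cpu=${cpus[RANDOM % ${#cpus[@]}]}             # pick a random CPU
        state=$((RANDOM % 2))                         # 0 = offline, 1 = online
        if [ "$dry_run" = 1 ]; then
            echo "echo $state > $cpu/online"
        else
            echo "$state" > "$cpu/online"
        fi
    done
}
```

Usage: `DRY_RUN=0 ITERATIONS=100` with a call like `random_hotplug "$ITERATIONS"`; repeatedly toggling random cores this way is what tends to race the idle/power-down path seen in the backtrace above.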

Anyhow, the bottom line seems to be that your suite provides the right conditions for triggering this failure. I shall keep you posted on the results of further runs!