Comment 52 for bug 1097213

Chris Redpath (chris-redpath) wrote :

Hi Mathieu,

For me, the symptoms don't really support the scheduler being the cause of this issue, although I can completely accept that, since we do more work on wakeup/reschedule, we will change the relative probabilities of hitting any pre-existing race conditions. The sched clock accesses are restricted in the current patch set so that we don't use offline CPU clocks, so that old bug should not be present either.

Do you have a handle on how many of the crashes are in the timer softirq handler versus other bad-looking pointers? I see a few different ones over the course of the investigation here. Also, the previous successful reproduction on snowball without HMP present makes me think the issue might be a very tiny race window somewhere in generic hotplug code, possibly one that only shows up on ARMv7 platforms, since otherwise the x86 people would have seen it.

I know you have looked in detail at this issue when it's been present in previous releases - did you do a review of the timer migration code for hotplug? It almost feels like some timer has not completed migrating before the softirq is delivered for it.