Race condition using ATOMIC_FASTBINS in _int_free causes crash or heap corruption
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
eglibc | Fix Released | Medium | |
eglibc (Ubuntu) | Fix Released | Undecided | Adam Conrad |
Precise | Fix Released | Undecided | Adam Conrad |
Bug Description
[Impact]
* This bug is likely to cause a SEGV crash in multithreaded applications that perform many memory deallocations with the ATOMIC_FASTBINS feature enabled.
[Test Case]
* Since this is a race condition, there is no simple way to reproduce it; however, one could try following the instructions in the upstream bug (https:/
https:/
[Regression Potential]
* The fix for this issue has been merged upstream, with no further issues reported.
[Other Info]
* Original bug description:
We have an application which makes heavy allocation and de-allocation demands from multiple threads. We run this application continuously on many servers, and once every several CPU months or years, we were getting a crash in _int_free that did not look like vanilla heap corruption. I believe I have narrowed it down to a race condition in _int_free due to the ATOMIC_FASTBINS feature. Basically, in the lockless FASTBIN _int_free path, a chunk is pulled into a local variable with the intent to add it to the fastbins list. However, the heap consolidation/trim code can race with this, and can coalesce the entire block and/or give it back to the OS before _int_free has a chance to try and store it into the fastbins list.
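The window described above can be illustrated with a minimal sketch of a lockless LIFO push. This is not glibc's code: `struct chunk`, `fastbin`, `fastbin_init`, `fastbin_push`, and `fastbin_pop` are hypothetical names, and C11 atomics stand in for glibc's internal compare-and-swap macros. The sketch also ignores ABA hazards that a real allocator has to handle.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a heap chunk; not glibc's malloc_chunk. */
struct chunk {
    struct chunk *fd;   /* next free chunk in the bin */
    size_t size;
};

/* One singly linked LIFO bin whose head is updated without a lock. */
typedef struct {
    _Atomic(struct chunk *) head;
} fastbin;

static void fastbin_init(fastbin *bin)
{
    atomic_init(&bin->head, NULL);
}

/* Push p onto the bin using a compare-and-swap loop. */
static void fastbin_push(fastbin *bin, struct chunk *p)
{
    assert(p != NULL);
    struct chunk *old = atomic_load(&bin->head);
    do {
        /* Race window: the caller holds p only in a local variable and
           no lock protects the memory it points to. If another thread
           consolidates or trims the region containing p before the CAS
           below succeeds, this store writes into memory that is no
           longer a valid free chunk (or no longer mapped at all). */
        p->fd = old;
    } while (!atomic_compare_exchange_weak(&bin->head, &old, p));
}

/* Pop the most recently pushed chunk, or return NULL if the bin is empty. */
static struct chunk *fastbin_pop(fastbin *bin)
{
    struct chunk *old = atomic_load(&bin->head);
    /* On failure the CAS reloads old, so old->fd is re-read each time. */
    while (old != NULL &&
           !atomic_compare_exchange_weak(&bin->head, &old, old->fd))
        ;
    return old;
}
```

Run single-threaded, the push and pop behave as an ordinary stack. The hazard only appears when a second thread invalidates the chunk's memory inside the commented window.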
The problem is very challenging to reproduce in situ, but using gdb I have a recipe which demonstrates the crash 100% of the time on my 12.04 x64 system running eglibc 2.15. It relies on malloc_trim, although in our in situ data, the consolidation is triggered as a result of a normal free. malloc_trim is just easier to control.
While I am not a glibc developer, I could not see any easy way to fix the situation short of disabling ATOMIC_FASTBINS.
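ATOMIC_FASTBINS is a build-time option, so disabling it means rebuilding the library. As a possible runtime alternative (an assumption on my part, not something tested against this bug), glibc's `mallopt` documents that setting `M_MXFAST` to 0 disables the use of fastbins, which should keep frees off the lockless fastbin path at some cost in allocation speed. A minimal sketch, where `disable_fastbins` is a hypothetical helper name:

```c
#include <malloc.h>   /* glibc-specific: mallopt, M_MXFAST */

/* Hypothetical helper: ask glibc malloc not to use fastbins.
   Returns 1 on success and 0 on error, following mallopt's convention.
   Intended to be called early, before other threads start allocating. */
static int disable_fastbins(void)
{
    return mallopt(M_MXFAST, 0);
}
```

Calling this at the start of `main` would apply the setting process-wide. Whether it fully avoids the crash described here is unverified.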
I am attaching the reproduction source. Other pertinent information follows:
> jpieper@
> Description: Ubuntu 12.04 LTS
> Release: 12.04
> jpieper@
> libc6:
> Installed: 2.15-0ubuntu10
> Candidate: 2.15-0ubuntu10
> Version table:
> *** 2.15-0ubuntu10 0
> 500 http://
> 100 /var/lib/
What I expect: I expect the attached application, when run using the gdb script in the comments, to complete with no failures.
What happened: A SIGSEGV after the final continue.
Changed in eglibc:
importance: Unknown → Medium
status: Unknown → Confirmed
Changed in eglibc:
status: Confirmed → Incomplete
Changed in eglibc:
status: Incomplete → Fix Released
Changed in eglibc:
status: Fix Released → Confirmed
Changed in eglibc (Ubuntu):
assignee: nobody → Adam Conrad (adconrad)
Changed in eglibc:
status: Confirmed → Fix Released
Changed in eglibc (Ubuntu):
status: Confirmed → In Progress
Changed in eglibc (Ubuntu Precise):
status: New → In Progress
assignee: nobody → Adam Conrad (adconrad)
description: updated
Status changed to 'Confirmed' because the bug affects multiple users.