Max or Eric,
Are there any changes in the status of this bug?
I can corroborate what Max said and proposed, since I'm analysing a dump that was brought to me in exactly this situation (I also found another old bug with the same race condition):
https://bz.apache.org/bugzilla/show_bug.cgi?id=58483
These are my notes so far:
Problem summary:
apr_rmm_init initializes a relocatable memory management (rmm) segment.
It is used in: mod_auth_digest and util_ldap_cache
In the dump that was brought to me, the following call sequence:
- util_ldap_compare_node_copy()
- util_ald_strdup()
- apr_rmm_calloc()
- find_block_of_size()
Had a "cache->rmm_addr" with no lock at "find_block_of_size()"
cache->rmm_addr->lock { type = apr_anylock_none }
And an invalid "next" offset (out of rmm->base->firstfree).
This rmm_addr was initialized with NULL as a locking mechanism:
From apr-utils:
apr_rmm_init()
if (!lock) { <-- 2nd argument to apr_rmm_init()
    nulllock.type = apr_anylock_none; <--- found in the dump
    nulllock.lock.pm = NULL;
    lock = &nulllock;
}
From apache:
# mod_auth_digest
sts = apr_rmm_init(&client_rmm,
                   NULL, /* no lock, we'll do the locking ourselves */
                   apr_shm_baseaddr_get(client_shm),
                   shmem_size, ctx);
# util_ldap_cache
result = apr_rmm_init(&st->cache_rmm, NULL,
                      apr_shm_baseaddr_get(st->cache_shm), size,
                      st->pool);
It appears that the ldap module chose to use "rmm" for memory allocation, using
the shared memory approach, but without explicitly defining a lock for it.
Without it, it's up to the caller to guarantee that there are locks for rmm
synchronization (just like mod_auth_digest does, using global mutexes).
Because of that, there was a race condition in "find_block_of_size" and a call
touching "rmm->base->firstfree", possibly "move_block()", in a multi-threaded
apache environment, since there were no lock guarantees inside rmm logic (lock
was "apr_anylock_none" and the locking calls don't do anything).
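To make the effect of the "none" lock concrete, here is a minimal single-threaded model. It is only a sketch: model_anylock, model_enter, model_leave and concurrent_entrants are hypothetical names invented for illustration, not APR types or functions, and the "mutex" is just a flag. It shows that with the no-op lock two callers arriving back to back are both let into the critical section, while a real lock would admit only one.

```c
/* Hypothetical stand-in for apr_anylock_t; only two lock kinds are modelled. */
typedef enum { MODEL_LOCK_NONE, MODEL_LOCK_MUTEX } model_lock_kind;

typedef struct {
    model_lock_kind kind;
    int held;               /* models mutex state: 1 while a caller holds it */
} model_anylock;

/* Try to enter the critical section.
 * Returns 1 if the caller may proceed, 0 if a real mutex would make it wait. */
static int model_enter(model_anylock *l)
{
    if (l->kind == MODEL_LOCK_NONE)
        return 1;           /* no-op lock: every caller proceeds */
    if (l->held)
        return 0;           /* already held by someone else */
    l->held = 1;
    return 1;
}

/* Leave the critical section; a no-op for the "none" lock. */
static void model_leave(model_anylock *l)
{
    if (l->kind == MODEL_LOCK_MUTEX)
        l->held = 0;
}

/* Two callers arrive back to back, before either one leaves.
 * Returns how many of them were admitted at the same time. */
static int concurrent_entrants(model_lock_kind kind)
{
    model_anylock l = { kind, 0 };
    int inside = model_enter(&l) + model_enter(&l);
    model_leave(&l);
    return inside;
}
```

In this model concurrent_entrants(MODEL_LOCK_NONE) admits both callers, which is the situation two threads would be in while walking and modifying the free list at once.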
In find_block_of_size:
apr_rmm_off_t next = rmm->base->firstfree;
We have:
rmm->base->firstfree
Decimal:356400
Hex:0x57030
But "next" turned into:
Name : next
Decimal:8320808657351632189
Hex:0x737973636970653d
Causing:
struct rmm_block_t *blk = (rmm_block_t*)((char*)rmm->base + next);
if (blk->size == size)
To segfault.
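One extra check that may be useful on the corrupted value: every byte of 0x737973636970653d is printable ASCII. The helper below is a small sketch (offset_as_le_text is a name made up here, not part of APR) that lays the 8 bytes out in little-endian memory order, which is an assumption about the host the dump came from, and replaces non-printable bytes with '.'.

```c
#include <ctype.h>
#include <stdint.h>

/* Write the 8 bytes of v into out in little-endian memory order
 * (least significant byte first). Non-printable bytes become '.'.
 * out must have room for 9 chars including the terminating NUL. */
static void offset_as_le_text(uint64_t v, char out[9])
{
    for (int i = 0; i < 8; i++) {
        unsigned char b = (unsigned char)(v >> (8 * i));
        out[i] = isprint(b) ? (char)b : '.';
    }
    out[8] = '\0';
}
```

For the "next" value from the dump this gives "=epicsys", while the sane firstfree value 356400 (0x57030) gives "0p......". That looks like string data sitting where an offset should be, which would be consistent with (but does not prove) a string copy such as util_ald_strdup() writing into a block that another thread still saw on the free list.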