strange malloc interaction with sysctl vm.overcommit_memory=2
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
glibc (Ubuntu) | New | Undecided | Unassigned | | |
Bug Description
Binary package hint: libc6
Pertinent system information is:
smoot@smoot:~/tmp$ lsb_release -rd
Description: Ubuntu 8.10
Release: 8.10
smoot@smoot:~/tmp$ uname -a
Linux smoot 2.6.27-11-generic #1 SMP Thu Jan 29 19:28:32 UTC 2009 x86_64 GNU/Linux
It appears that malloc is inconsistent in its behavior when vm.overcommit_
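#include <stdio.h>
#include <stdlib.h>

int main (void) {
    int n = 0;
    size_t size = 0x100000; /* e.g. 1MiB */

    while (1) {
        /* The allocation is leaked on purpose: the test only counts
           how much memory can be obtained. */
        if (malloc(size) == NULL) {
            /* printf calls reconstructed from the output quoted in this report */
            printf("malloc failure after %d MiB\n", n);
            return 0;
        }
        printf("got %d MiB\n", ++n);
    }
}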
This will allocate additional memory in 1MiB chunks. Even with overcommit_memory=2 and overcommit_
got 1 GiB 17ea000
got 2 GiB 17ea000
got 3 GiB 17ea000
got 4 GiB 17ea000
malloc failure after 4 GiB
I also changed the label strings on the printf statement in the program for clarity.
It appears that for large chunks the kernel honors the overcommit_memory setting, but for smaller chunks it ignores the overcommit flags. As an additional sanity test, I performed the same test using sbrk() to allocate more space. This program is:
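#include <stdio.h>
#include <stdlib.h>
#include <unistd.h> /* sbrk() */

int main (void) {
    int n = 0;
    size_t size = 0x100000;

    while (1) {
        void *p = sbrk(size);
        /* sbrk() returns (void *) -1 on failure, not the integer -1 */
        if (p == (void *) -1) {
            /* printf calls reconstructed from the output below */
            printf("sbrk failure after %d MiB\n", n);
            return 0;
        }
        printf("got %d %lx MiB\n", ++n, (unsigned long) p);
    }
}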
This program properly failed to allocate more VM at the appropriate point. The tail of the output was:
got 4361 11285000 MiB
got 4362 11385000 MiB
got 4363 11485000 MiB
got 4364 11585000 MiB
got 4365 11685000 MiB
got 4366 11785000 MiB
sbrk failure after 4366 MiB
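To check what the "appropriate point" should be on a given machine, the kernel's commit accounting can be read from /proc/meminfo. The following is only a sketch: the helper names meminfo_kb and commit_headroom_kb are made up for illustration, and it assumes a Linux /proc with the CommitLimit and Committed_AS fields.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: look up a /proc/meminfo field such as "CommitLimit"
   and store its value (in kB) in *out. Returns 0 on success, -1 if the
   file or the field is not available. */
int meminfo_kb(const char *field, long *out) {
    FILE *f = fopen("/proc/meminfo", "r");
    char line[256];
    size_t len = strlen(field);
    int rc = -1;

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        /* Lines look like "CommitLimit:     8123456 kB". */
        if (strncmp(line, field, len) == 0 && line[len] == ':') {
            if (sscanf(line + len + 1, "%ld", out) == 1)
                rc = 0;
            break;
        }
    }
    fclose(f);
    return rc;
}

/* Hypothetical helper: kB left between what is already committed and the
   commit limit. The result may be negative when overcommit is allowed. */
int commit_headroom_kb(long *out) {
    long limit, used;

    if (meminfo_kb("CommitLimit", &limit) != 0 ||
        meminfo_kb("Committed_AS", &used) != 0)
        return -1;
    *out = limit - used;
    return 0;
}
```

Calling commit_headroom_kb() just before running the sbrk test and dividing the result by 1024 should give a rough MiB figure to compare against the count at which sbrk fails, assuming nothing else on the system commits much memory in between.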
I then looked at the current implementation of malloc() and confirmed that it uses both sbrk() and mmap() for VM allocation. mmap appears to be used for large chunks of VM. I also ran strace on the program that ignored the allocation limit. The relevant system call output looked like this:
mmap(NULL, 1052672, PROT_READ|
mmap(0x7f4f6400
mprotect(
write(1, "got 37889 MiB\n", 14) = 14
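Whether a given request is served from the sbrk() heap or from somewhere else can also be checked without strace, by looking at the program break before and after the call. This is a rough sketch only: break_moved_by_malloc is a made-up name, and the check says nothing about which other mechanism was used when the break stays put.

```c
#define _DEFAULT_SOURCE /* exposes sbrk() in <unistd.h> on glibc */
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if malloc(size) moved the program break,
   0 if the break stayed where it was, and -1 if malloc failed.
   The allocation is deliberately not freed, so that repeated calls are
   not influenced by chunks being returned to the allocator. */
int break_moved_by_malloc(size_t size) {
    void *before = sbrk(0);   /* current program break */
    void *p = malloc(size);
    void *after = sbrk(0);    /* break after the allocation */

    if (p == NULL)
        return -1;
    return before != after;
}
```

With the 1 MiB chunk size used above, I would expect this to return 0 on a default glibc configuration, since the request is above the documented default mmap threshold of 128 KiB, but I have not verified that on every glibc version.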
I do not understand all of the malloc() algorithm, but it appears that when a large-chunk allocation via mmap fails, malloc() falls back to reserving a large block of VM with an mmap call that sets the MAP_NORESERVE flag. This flag tells the kernel not to reserve swap space to back the block of VM. I think this is the core of the problem reported above. I recompiled malloc() with that flag removed, and the correct behavior appeared for the tested allocation chunk sizes.
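The fallback pattern described above can be sketched outside of malloc() as follows. This is not glibc's code: the function names reserve_then_commit and map_accounted are made up, and the flag combination is my assumption based on the description above (a PROT_NONE mapping with MAP_NORESERVE that is later opened up with mprotect).

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS and MAP_NORESERVE on glibc */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: map `reserve` bytes as inaccessible memory with
   MAP_NORESERVE (asking the kernel not to reserve swap space for it),
   then make the first `commit` bytes readable and writable.
   Returns the base address, or NULL on failure. */
void *reserve_then_commit(size_t reserve, size_t commit) {
    void *base = mmap(NULL, reserve, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);

    if (base == MAP_FAILED)
        return NULL;
    if (mprotect(base, commit, PROT_READ | PROT_WRITE) != 0) {
        munmap(base, reserve);
        return NULL;
    }
    return base;
}

/* Hypothetical helper, for comparison: a plain anonymous mapping that is
   writable from the start and does not set MAP_NORESERVE.
   Returns the base address, or NULL on failure. */
void *map_accounted(size_t size) {
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}
```

Running each helper in a loop, in the same way as the malloc and sbrk tests above, would show whether the two kinds of mapping stop at the same point when overcommit_memory is set to 2.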
I think MAP_NORESERVE needs to be removed from the mmap calls used in malloc(), so the overcommit_memory flag is respected when it is set to 2.