## Valgrind ==31831== Invalid read of size 1 ==31831== at 0x4C2E0F4: strlen (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==31831== by 0x56FC243: find_mp_by_alias (structs.c:295) ==31831== by 0x404B2F: ev_add_map (main.c:264) ==31831== by 0x404A8B: uev_add_map (main.c:244) ==31831== by 0x40623C: uev_trigger (main.c:756) ==31831== by 0x5713958: service_uevq (uevent.c:118) ==31831== by 0x5713B67: uevent_dispatch (uevent.c:167) ==31831== by 0x406485: uevqloop (main.c:815) ==31831== by 0x4E3F183: start_thread (in /lib/x86_64-linux-gnu/libpthread-2.19.so) ==31831== by 0x5A2EBEC: clone (clone.S:111) ==31831== Address 0x728d8d1 is 1 bytes inside a block of size 6 free'd ==31831== at 0x4C2BDEC: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==31831== by 0x404A9A: uev_add_map (main.c:245) ==31831== by 0x40623C: uev_trigger (main.c:756) ==31831== by 0x5713958: service_uevq (uevent.c:118) ==31831== by 0x5713B67: uevent_dispatch (uevent.c:167) ==31831== by 0x406485: uevqloop (main.c:815) ==31831== by 0x4E3F183: start_thread (in /lib/x86_64-linux-gnu/libpthread-2.19.so) ==31831== by 0x5A2EBEC: clone (clone.S:111) All the errors above exist because of a use-after-free caused by the code: static int uev_add_map (struct uevent * uev, struct vectors * vecs) { char *alias; int major = -1, minor = -1, rc; condlog(2, "%s: add map (uevent)", uev->kernel); alias = uevent_get_dm_name(uev); ... rc = ev_add_map(uev->kernel, alias, vecs); FREE(alias); <--- freed here return rc } Mainly because ev_add_map will call: /* * now we can register the map */ if (map_present && (mpp = add_map_without_path(vecs, alias))) { sync_map_state(mpp); condlog(2, "%s: devmap %s registered", alias, dev); return 0; } which will set: extern struct multipath * add_map_without_path (struct vectors * vecs, char * alias) { struct multipath * mpp = alloc_multipath(); if (!mpp) return NULL; mpp->alias = alias; So when this logic returns, mpp-> alias will be already freed by uev_add_map. Any other code traversing mpvecs using "mpp" as "struct multipath" will have pointer "alias" pointing into random memory causing all kinds of memory corruption. This logic is triggered every time the "ueventloop" gets a new uevent from kernel saying that there was a device map change. If device is "dm-" and its action is "change" (meaning a device mapper was changed), this logic will try to create a new "mpp" structure in global vecs->mpvec vectors with this discovered multipath. This is not used in daemon startup, since all paths are initially created by the configure() -> {path_discovery, map_discovery, coalesce_paths, coalesce_ maps}. Multipath devices are more susceptible to be created from the beginning, or by a discovered path, that would trigger another logic to for mpp structures (thus creating multipath devices): That might explain why this bug has not been fixed before. Example of crashes because of this: #0 malloc_consolidate (av=av@entry=0x7f5b58000020) at malloc.c:4149 #1 0x00007f5b62df3cf8 in _int_malloc (av=0x7f5b58000020, bytes=16384) at malloc.c:3423 #2 0x00007f5b62df66d0 in __GI___libc_malloc (bytes=16384) at malloc.c:2891 #3 0x00007f5b638134d7 in dm_task_run () from /lib/x86_64-linux-gnu/libdevmapper.so.1.02.1 #4 0x00007f5b6314be5c in dm_map_present (str=0x7f5b58000990 "lun02") at devmapper.c:304 #5 0x0000000000404ac7 in ev_add_map (dev=, alias=, vecs=) at main.c:257 #6 0x0000000000000000 in ?? () Another possible error is when trying to free mpp->alias again when garbage collector runs: #0 0x00007f13a5933c37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 #1 0x00007f13a5937028 in __GI_abort () at abort.c:89 #2 0x00007f13a59702a4 in __libc_message (do_abort=do_abort@entry=1, fmt=fmt@entry=0x7f13a5a81ef0 "") at ../sysdeps/posix/libc_fatal.c:175 #3 0x00007f13a597c56e in malloc_printerr (ptr=, str=0x7f13a5a82020 "double free or corruption (out)", action=1) at malloc.c:4996 #4 _int_free (av=, p=, have_lock=0) at malloc.c:3840 #5 0x00007f13a5cdbe86 in free_multipath (mpp=0x7f138c033d60, free_paths=0) at structs.c:174 #6 0x00007f13a5cfe117 in _remove_map (mpp=0x7f138c033d60, vecs=0x8adaa0, stop_waiter=1, purge_vec=1) at structs_vec.c:143 #7 0x00007f13a5cfe175 in remove_map_and_stop_waiter (mpp=0x7f138c033d60, vecs=0x8adaa0, purge_vec=1) at structs_vec.c:156 #8 0x0000000000406b4d in mpvec_garbage_collector (vecs=) at main.c:950 ... #14 0x00000000004076b7 in checkerloop (ap=) at main.c:1163 This can also be seen in valgrind: ==31831== ==31831== Invalid read of size 1 ==31831== at 0x4C2E902: strncmp (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==31831== by 0x56FC26E: find_mp_by_alias (structs.c:296) ==31831== by 0x404B2F: ev_add_map (main.c:264) ==31831== by 0x404A8B: uev_add_map (main.c:244) ==31831== by 0x40623C: uev_trigger (main.c:756) ==31831== by 0x5713958: service_uevq (uevent.c:118) ==31831== by 0x5713B67: uevent_dispatch (uevent.c:167) ==31831== by 0x406485: uevqloop (main.c:815) ==31831== by 0x4E3F183: start_thread (in /lib/x86_64-linux-gnu/libpthread-2.19.so) ==31831== by 0x5A2EBEC: clone (clone.S:111) ==31831== Address 0x728d8d1 is 1 bytes inside a block of size 6 free'd ==31831== at 0x4C2BDEC: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==31831== by 0x404A9A: uev_add_map (main.c:245) ==31831== by 0x40623C: uev_trigger (main.c:756) ==31831== by 0x5713958: service_uevq (uevent.c:118) ==31831== by 0x5713B67: uevent_dispatch (uevent.c:167) ==31831== by 0x406485: uevqloop (main.c:815) ==31831== by 0x4E3F183: start_thread (in /lib/x86_64-linux-gnu/libpthread-2.19.so) ==31831== by 0x5A2EBEC: clone (clone.S:111) mpvec_garbage_collector runs from the checkerloop, the thread responsible to check if all paths are up by using selected pathcheckers (readsector0, directio). Every 5 checker - all paths - loops, the garbage collector runs and it checks every multipath (from vecs->mpvec vectors) is still present in the OS by querying device mapper. If the "dm" (multipath) is gone, it removes the map from internal vector, stops waiter threads () and orphan multipath paths. When freeing multipath, the "mpp->alias" field is freed and, since it was already freed before, it gets freed again causing random errors. The following upstream patch, solves this by simply creating a new string that will me freed when appropriate. commit 735f2992398d4179601f3d7043fad606c90171d4 Author: Benjamin Marzinski