Not sure if it's the same issue, but we had an unexpected OOM on Ubuntu 16.04.3 LTS, kernel 4.4.0-91.
Oct 31 23:52:25 db3 kernel: [6569272.882023] psql invoked oom-killer: gfp_mask=0x26000c0, order=2, oom_score_adj=0
...
Oct 31 23:52:25 db3 kernel: [6569272.882154] Mem-Info:
Oct 31 23:52:25 db3 kernel: [6569272.882165] active_anon:38011018 inactive_anon:1422084 isolated_anon:0
Oct 31 23:52:25 db3 kernel: [6569272.882165] active_file:11699125 inactive_file:11727535 isolated_file:0
Oct 31 23:52:25 db3 kernel: [6569272.882165] unevictable:0 dirty:88019 writeback:2902991 unstable:23308
Oct 31 23:52:25 db3 kernel: [6569272.882165] slab_reclaimable:1455159 slab_unreclaimable:533985
Oct 31 23:52:25 db3 kernel: [6569272.882165] mapped:38499394 shmem:38495946 pagetables:33687177 bounce:0
Oct 31 23:52:25 db3 kernel: [6569272.882165] free:212612 free_pcp:0 free_cma:0
Oct 31 23:52:25 db3 kernel: [6569272.882172] Node 0 DMA free:13256kB min:0kB low:0kB high:0kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15976kB managed:15892kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Oct 31 23:52:25 db3 kernel: [6569272.882182] lowmem_reserve[]: 0 1882 193368 193368 193368
Oct 31 23:52:25 db3 kernel: [6569272.882188] Node 0 DMA32 free:768204kB min:316kB low:392kB high:472kB active_anon:8kB inactive_anon:32kB active_file:20kB inactive_file:48kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:2045556kB managed:1964868kB mlocked:0kB dirty:0kB writeback:44kB mapped:16kB shmem:12kB slab_reclaimable:729192kB slab_unreclaimable:35928kB kernel_stack:1920kB pagetables:415552kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Oct 31 23:52:25 db3 kernel: [6569272.882196] lowmem_reserve[]: 0 0 191486 191486 191486
Oct 31 23:52:25 db3 kernel: [6569272.882201] Node 0 Normal free:34260kB min:32432kB low:40540kB high:48648kB active_anon:58162056kB inactive_anon:2546400kB active_file:18254204kB inactive_file:18282192kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:199229440kB managed:196081724kB mlocked:0kB dirty:152124kB writeback:4685924kB mapped:58223800kB shmem:58229824kB slab_reclaimable:2362116kB slab_unreclaimable:1123984kB kernel_stack:11056kB pagetables:94580096kB unstable:22108kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Oct 31 23:52:25 db3 kernel: [6569272.882210] lowmem_reserve[]: 0 0 0 0 0
Oct 31 23:52:25 db3 kernel: [6569272.882215] Node 1 Normal free:34728kB min:32780kB low:40972kB high:49168kB active_anon:93882008kB inactive_anon:3141904kB active_file:28542276kB inactive_file:28627900kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:201326592kB managed:198178644kB mlocked:0kB dirty:199952kB writeback:6925996kB mapped:95773760kB shmem:95753948kB slab_reclaimable:2729328kB slab_unreclaimable:976028kB kernel_stack:6608kB pagetables:39753060kB unstable:71124kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Oct 31 23:52:25 db3 kernel: [6569272.882226] lowmem_reserve[]: 0 0 0 0 0
Oct 31 23:52:25 db3 kernel: [6569272.882230] Node 0 DMA: 0*4kB 1*8kB (U) 0*16kB 0*32kB 1*64kB (U) 1*128kB (U) 1*256kB (U) 1*512kB (U) 0*1024kB 2*2048kB (UM) 2*4096kB (M) = 13256kB
Oct 31 23:52:25 db3 kernel: [6569272.882248] Node 0 DMA32: 121*4kB (UME) 95*8kB (UME) 5337*16kB (UME) 4229*32kB (UME) 2523*64kB (UME) 624*128kB (UME) 237*256kB (UME) 83*512kB (UM) 51*1024kB (UM) 73*2048kB (UM) 0*4096kB = 768204kB
Oct 31 23:52:25 db3 kernel: [6569272.882268] Node 0 Normal: 8587*4kB (UM) 8*8kB (MH) 15*16kB (H) 8*32kB (H) 3*64kB (H) 1*128kB (H) 2*256kB (H) 1*512kB (H) 0*1024kB 0*2048kB 0*4096kB = 36252kB
Oct 31 23:52:25 db3 kernel: [6569272.882284] Node 1 Normal: 9063*4kB (UM) 0*8kB 7*16kB (H) 7*32kB (H) 5*64kB (H) 2*128kB (H) 0*256kB 1*512kB (H) 0*1024kB 0*2048kB 0*4096kB = 37676kB
Oct 31 23:52:25 db3 kernel: [6569272.882303] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Oct 31 23:52:25 db3 kernel: [6569272.882306] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Oct 31 23:52:25 db3 kernel: [6569272.882308] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Oct 31 23:52:25 db3 kernel: [6569272.882311] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Oct 31 23:52:25 db3 kernel: [6569272.882313] 61926313 total pagecache pages
Oct 31 23:52:25 db3 kernel: [6569272.882315] 3557 pages in swap cache
Oct 31 23:52:25 db3 kernel: [6569272.882318] Swap cache stats: add 169532062, delete 169528505, find 1411948632/1466410026
Oct 31 23:52:25 db3 kernel: [6569272.882319] Free swap = 121425844kB
Oct 31 23:52:25 db3 kernel: [6569272.882321] Total swap = 125001712kB
Oct 31 23:52:25 db3 kernel: [6569272.882323] 100654391 pages RAM
Oct 31 23:52:25 db3 kernel: [6569272.882325] 0 pages HighMem/MovableOnly
Oct 31 23:52:25 db3 kernel: [6569272.882327] 1594109 pages reserved
Oct 31 23:52:25 db3 kernel: [6569272.882328] 0 pages cma reserved
Oct 31 23:52:25 db3 kernel: [6569272.882330] 0 pages hwpoisoned
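Since the failed request was order=2 (4 contiguous 4kB pages, so a 16kB block), the per-order free lists are probably the interesting part. Here is a rough Python sketch for totalling one of those lines; the function name `free_by_block_size` is just something I made up, and it assumes the "count*sizekB" layout shown in the log above:

```python
import re

def free_by_block_size(buddy_line):
    """Return {block_size_kB: total_free_kB} for one per-order free-list line."""
    # Drop the trailing "= 36252kB" total so only "count*sizekB" entries remain.
    entries = buddy_line.split("=")[0]
    totals = {}
    # Each entry looks like "8587*4kB"; the flags in parentheses are ignored here.
    for count, size_kb in re.findall(r"(\d+)\*(\d+)kB", entries):
        totals[int(size_kb)] = int(count) * int(size_kb)
    return totals

line = ("Node 0 Normal: 8587*4kB (UM) 8*8kB (MH) 15*16kB (H) 8*32kB (H) "
        "3*64kB (H) 1*128kB (H) 2*256kB (H) 1*512kB (H) 0*1024kB 0*2048kB "
        "0*4096kB = 36252kB")

totals = free_by_block_size(line)
total_free_kb = sum(totals.values())
# An order=2 request needs a free block of 16kB or larger.
order2_plus_kb = sum(kb for size, kb in totals.items() if size >= 16)
print(total_free_kb, order2_plus_kb)  # prints: 36252 1840
```

So on Node 0 Normal only about 1.8MB of the 36MB free sits in blocks big enough for an order=2 request; the rest is in single 4kB and 8kB pieces.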
It's the first time it has happened in the last 3 months of using this kernel, so it's not a huge issue at the moment unless it starts to happen frequently, but it is concerning.
If this becomes an issue I could try
vm.min_free_kbytes = 1000000
and
vm.zone_reclaim_mode = 1
but I'm concerned they may affect performance, especially the zone_reclaim_mode parameter.
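Before touching either setting I'd want to record what the box is running with now. A minimal Python sketch, assuming the usual /proc/sys layout on Linux; `read_sysctl` and its `root` argument are my own names for illustration, not an existing tool:

```python
from pathlib import Path

def read_sysctl(name, root="/proc/sys"):
    """Read a sysctl such as 'vm.min_free_kbytes' from its /proc/sys file.

    Returns the value as a string, or None if the file does not exist
    (for example vm.zone_reclaim_mode on a kernel built without NUMA).
    """
    path = Path(root) / name.replace(".", "/")
    try:
        return path.read_text().strip()
    except FileNotFoundError:
        return None

for name in ("vm.min_free_kbytes", "vm.zone_reclaim_mode"):
    print(name, "=", read_sysctl(name))
```

Reading is harmless; actually changing the values would be done separately (sysctl -w as root, or an entry in /etc/sysctl.conf), and only after noting the old ones so they can be restored.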
Anyway, I attached the full OOM log.