Background
Today, while I was on a production Linux box going through logs to track down an issue, a DingTalk alert suddenly came in saying the service had crashed.
Investigation
The JVM had been started without the -XX:ErrorFile parameter, so there was no JVM crash log to look at. The server runs CentOS, which suggested another angle: use the system log, where the kernel records what happens to processes, to find out why the service died.
Running `cat /var/log/messages` turned up the following record of the failing process's life cycle:
```
May 13 13:46:19 xxxxxxxxxxxxxxxxxxxxxxx kernel: AliYunDun invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0 ①
May 13 13:46:19 xxxxxxxxxxxxxxxxxxxxxxx kernel: AliYunDun cpuset=/ mems_allowed=0
May 13 13:46:19 xxxxxxxxxxxxxxxxxxxxxxx kernel: CPU: 1 PID: 11084 Comm: AliYunDun Not tainted 3.x.x-x.x.x.x.x86_64 #1
May 13 13:46:19 xxxxxxxxxxxxxxxxxxxxxxx kernel: Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS rel-1.7.5-0-xxxxxx-20xxxxxx_164612-nilsson.home.kraxel.org xx/xx/xxxx
...
May 13 13:46:19 xxxxxxxxxxxxxxxxxxxxxxx kernel: Call Trace:
May 13 13:46:19 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff816a3db1>] dump_stack+0x19/0x1b
May 13 13:46:19 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff8169f1a6>] dump_header+0x90/0x229
May 13 13:46:19 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff810e939c>] ? ktime_get_ts64+0x4c/0xf0
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff8113d36f>] ? delayacct_end+0x8f/0xb0
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff81186394>] oom_kill_process+0x254/0x3d0
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff81185e3d>] ? oom_unkillable_task+0xcd/0x120
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff81185ee6>] ? find_lock_task_mm+0x56/0xc0
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff81186bd6>] out_of_memory+0x4b6/0x4f0
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff8169fcaa>] __alloc_pages_slowpath+0x5d6/0x724
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff8118cd85>] __alloc_pages_nodemask+0x405/0x420
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff811d1108>] alloc_pages_current+0x98/0x110
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff81182917>] __page_cache_alloc+0x97/0xb0
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff81184eb0>] filemap_fault+0x170/0x410
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffc01be156>] ext4_filemap_fault+0x36/0x50 [ext4]
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff811ad162>] __do_fault+0x52/0xe0
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff811ad60b>] do_read_fault.isra.44+0x4b/0x130
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff811b1f11>] handle_mm_fault+0x691/0xfa0
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff816a8fad>] ? __schedule+0x39d/0x8b0
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff816afff4>] __do_page_fault+0x154/0x450
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff816b03d6>] trace_do_page_fault+0x56/0x150
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff816afa6a>] do_async_page_fault+0x1a/0xd0
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff816ac578>] async_page_fault+0x28/0x30
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: Mem-Info:
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: active_anon:908086 inactive_anon:113 isolated_anon:0#012 active_file:1161 inactive_file:3251 isolated_file:27#012 unevictable:0 dirty:6 writeback:0 unstable:0#012 slab_reclaimable:14131 slab_unreclaimable:4393#012 mapped:372 shmem:188 pagetables:3329 bounce:0#012 free:22809 free_pcp:71 free_cma:0
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: Node 0 DMA free:15360kB min:276kB low:344kB high:412kB active_anon:384kB inactive_anon:0kB active_file:4kB inactive_file:4kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:24kB slab_unreclaimable:32kB kernel_stack:0kB pagetables:8kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:696 all_unreclaimable? yes
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: lowmem_reserve[]: 0 2814 3773 3773
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: Node 0 DMA32 free:53992kB min:50208kB low:62760kB high:75312kB active_anon:2728428kB inactive_anon:400kB active_file:4144kB inactive_file:15476kB unevictable:0kB isolated(anon):0kB isolated(file):108kB present:3129216kB managed:2884232kB mlocked:0kB dirty:24kB writeback:0kB mapped:1456kB shmem:648kB slab_reclaimable:40652kB slab_unreclaimable:9472kB kernel_stack:7824kB pagetables:10588kB unstable:0kB bounce:0kB free_pcp:360kB local_pcp:120kB free_cma:0kB writeback_tmp:0kB pages_scanned:19438 all_unreclaimable? yes
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: lowmem_reserve[]: 0 0 958 958
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: Node 0 Normal free:17048kB min:17096kB low:21368kB high:25644kB active_anon:904132kB inactive_anon:52kB active_file:496kB inactive_file:624kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1048576kB managed:981548kB mlocked:0kB dirty:0kB writeback:0kB mapped:28kB shmem:104kB slab_reclaimable:15848kB slab_unreclaimable:8068kB kernel_stack:2352kB pagetables:2720kB unstable:0kB bounce:0kB free_pcp:144kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:3131 all_unreclaimable? yes
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: lowmem_reserve[]: 0 0 0 0
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: Node 0 DMA: xxxx...
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: Node 0 DMA32: xxxx...
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: Node 0 Normal: xxxx...
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: 5490 total pagecache pages
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: 0 pages in swap cache
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: Swap cache stats: add 0, delete 0, find 0/0
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: Free swap  = 0kB
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: Total swap = 0kB
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: 1048446 pages RAM
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: 0 pages HighMem/MovableOnly
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: 78024 pages reserved
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 335] 0 335 21498 88 42 0 0 systemd-journal
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 357] 0 357 49101 93 28 0 0 lvmetad
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 366] 0 366 11047 223 21 0 -1000 systemd-udevd
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 448] 0 448 5402 70 16 0 0 irqbalance
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 450] 81 450 6104 100 18 0 -900 dbus-daemon
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 452] 997 452 133564 2422 56 0 0 polkitd
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 454] 0 454 6110 137 18 0 0 systemd-logind
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 461] 0 461 31559 163 19 0 0 crond
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 462] 0 462 6464 52 18 0 0 atd
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 487] 0 487 27511 32 10 0 0 agetty
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 488] 0 488 27511 33 11 0 0 agetty
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 502] 38 502 11282 188 28 0 0 ntpd
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 699] 0 699 26499 255 53 0 -1000 sshd
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 700] 0 700 140609 3183 91 0 0 tuned
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 3171] 0 3171 29478 157 12 0 0 wrapper
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 3173] 0 3173 639160 17634 93 0 0 java
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [18771] 0 18771 12428 375 22 0 0 ilogtail_0.16.1
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [18772] 0 18772 35548 1153 49 0 0 ilogtail_0.16.1
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [11045] 0 11045 8187 424 21 0 0 AliYunDunUpdate
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [11063] 0 11063 34061 2169 63 0 0 AliYunDun
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [28916] 0 28916 158555 797 165 0 0 rsyslogd
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [20779] 996 20779 24617 279 48 0 0 xxx_agentd
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [20780] 996 20780 24642 362 48 0 0 xxx_agentd
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [20781] 996 20781 24642 311 48 0 0 xxx_agentd
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [20782] 996 20782 24642 323 48 0 0 xxx_agentd
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [20783] 996 20783 24642 305 48 0 0 xxx_agentd
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [20784] 996 20784 24617 287 48 0 0 xxx_agentd
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [14212] 0 14212 37927 840 24 0 0 xxxxxx-collect
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 6680] 0 6680 74406 5748 60 0 0 filexxxx
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [28753] 0 28753 10448 216 13 0 0 aliyun-service
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [28775] 1000 28775 1303596 610152 1401 0 0 java
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [29015] 0 29015 36554 335 73 0 0 sshd
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [29020] 1000 29020 36554 325 71 0 0 sshd
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [29021] 1000 29021 28848 113 13 0 0 bash
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [29143] 1000 29143 288690 251514 517 0 0 vim
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: Out of memory: Kill process 28775 (java) score 630 or sacrifice child ②
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: Killed process 28775 (java) total-vm:5214384kB, anon-rss:2440608kB, file-rss:0kB, shmem-rss:0kB
```
At ① you can see that the AliYunDun agent invoked the OOM killer, which then went after the process with the highest OOM score. At ② the OOM killer actually kills that process: the Java process with PID 28775.
Thinking back, at around 13:46 I had just opened a roughly 2 GB log file (presumably the vim process in the table above, which is holding about 1 GB of resident memory). That is very likely what left AliYunDun unable to get the memory it asked for, triggering the OOM killer and costing us the Java process.
What is the OOM Killer?
The OOM killer is exactly what its name suggests: when the system runs out of memory, it kills the process that is using the most memory while being considered the least important, in order to free memory.
Why does the OOM Killer exist?
The answer lies in how Linux hands out memory: it follows an optimistic allocation strategy. A non-NULL address returned by malloc() is no guarantee that the memory is actually available (the malloc() man page calls this out as a nasty pitfall), because Linux lets programs request more memory than the system can actually back. This feature is called overcommit, and it exists as an optimization: not every program uses all the memory it asks for right away, and by the time it does, the system may have reclaimed memory elsewhere. But when a program finally touches memory that cannot be provided, the kernel has to protect itself from running out entirely, so it triggers the OOM killer, which selectively kills one or more processes to make room.
Linux lets you choose the overcommit policy, or even disable overcommit altogether.
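To make overcommit concrete, here is a minimal C sketch (illustrative only; the file name is mine, and what actually happens depends on the vm.overcommit_memory setting, which is 0 for the default heuristic, 1 for "always", 2 for "never", and on how much RAM and swap the machine has). With the default policy, the loop typically "allocates" far more than physical memory because nothing is touched; it is only writing to the pages that consumes real memory and can end with a visit from the OOM killer.

```c
/* overcommit_demo.c -- a minimal sketch of optimistic allocation. Behaviour
 * depends on vm.overcommit_memory and on available RAM + swap.
 * Build: gcc overcommit_demo.c -o overcommit_demo */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK (1024L * 1024 * 1024)   /* ask for 1 GiB at a time */

int main(void) {
    for (int i = 1; i <= 16; i++) {
        char *p = malloc(CHUNK);      /* usually just reserves virtual address space */
        if (p == NULL) {              /* with overcommit enabled this may never happen */
            printf("malloc refused the request after %d GiB\n", i - 1);
            return 1;
        }
        printf("reserved %d GiB of virtual memory, no physical pages touched yet\n", i);
        /* memset(p, 1, CHUNK);       touching the pages is what consumes real memory;
         *                            doing this for every chunk on a small machine
         *                            would eventually wake the OOM killer */
    }
    return 0;
}
```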
As long as overcommit is enabled, though, the OOM killer is unavoidable. Which process gets killed is decided by policies that the user can influence; under the hood the kernel compares the per-process value in /proc/<pid>/oom_score. The related policy files are:
- /proc/<pid>/oom_score_adj: added on top of the computed badness score. -1000 means the process will never be killed; +1000 makes it the most likely victim.
- /proc/<pid>/oom_adj: kept for compatibility with older systems; reads and writes are scaled into oom_score_adj, so some precision can be lost (a small conversion sketch follows this list).
- /proc/<pid>/oom_score: the final score, used to decide which process gets killed.
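The precision loss comes from squeezing the legacy range [-17, 15] into the new range [-1000, 1000] and back with integer arithmetic. The following is a rough, user-space approximation of that scaling (the exact conversion lives in the kernel, which also special-cases the extreme values; this sketch and its file name are mine):

```c
/* oom_adj_scaling.c -- rough sketch of how writes to the legacy oom_adj file are
 * scaled into oom_score_adj and read back; an approximation, not kernel code. */
#include <stdio.h>

/* Legacy range: [-17, 15]; new range: [-1000, 1000]. */
static int oom_adj_to_score_adj(int oom_adj) {
    return oom_adj * 1000 / 17;        /* -17 maps to exactly -1000 (never kill) */
}

static int score_adj_to_oom_adj(int score_adj) {
    return score_adj * 17 / 1000;
}

int main(void) {
    for (int adj = -17; adj <= 15; adj++) {
        int score = oom_adj_to_score_adj(adj);
        /* Round-tripping shows where the integer arithmetic loses information. */
        printf("oom_adj %3d -> oom_score_adj %5d -> oom_adj %3d\n",
               adj, score, score_adj_to_oom_adj(score));
    }
    return 0;
}
```

Running it shows, for example, that -16 becomes -941 and reads back as -15, which is exactly the kind of drift the compatibility note above refers to.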
Kernel implementation of the OOM kill
The OOM call chain:
```
__alloc_pages_nodemask -> __alloc_pages_slowpath -> __alloc_pages_may_oom
  -> out_of_memory -> __out_of_memory -> select_bad_process -> oom_kill_process
```
Its logic runs as follows (a simplified C sketch follows this list):
- First, blocking_notifier_call_chain walks the notifier chain registered by users; if one of the callbacks can handle the OOM, the OOM-killer path exits right away.
- If the process that triggered the OOM (current) has a pending SIGKILL or is already exiting, current itself is chosen.
- Check whether the kernel should panic on OOM; if it panics, none of the later code runs.
- If sysctl_oom_kill_allocating_task is set, current->mm is not NULL, current's oom_score_adj != OOM_SCORE_ADJ_MIN, and current may be killed, then current is killed directly to free memory.
- Otherwise, call select_bad_process to choose the best process p to kill.
- If p == NULL, i.e. there is nothing left to kill, the kernel panics; otherwise oom_kill_process is called to kill the chosen process p.
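Putting those steps together, the decision flow looks roughly like the user-space simulation below. This is a simplified sketch for readability, not the kernel source: the struct, helper functions and example data are stand-ins of my own, and locking, nodemask handling and the finer killability checks are left out.

```c
/* oom_flow_sketch.c -- user-space simulation of the out_of_memory() decision
 * flow described above. Types and helpers are stand-ins, not kernel code. */
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

struct task {                      /* stand-in for task_struct */
    const char *comm;
    bool has_pending_sigkill;
    bool exiting;
    bool has_mm;                   /* current->mm != NULL */
    int  oom_score_adj;            /* -1000 .. 1000 */
    long badness;                  /* precomputed score for the simulation */
};

#define OOM_SCORE_ADJ_MIN (-1000)

static bool notifier_chain_handled_oom(void) { return false; }  /* stand-in */
static bool sysctl_oom_kill_allocating_task = false;
static bool panic_on_oom = false;

static void oom_kill_process(struct task *t) {
    printf("killing %s\n", t->comm);
}

/* Step 5: pick the task with the highest badness (details in the next section). */
static struct task *select_bad_process(struct task *tasks, int n) {
    struct task *victim = NULL;
    for (int i = 0; i < n; i++)
        if (!victim || tasks[i].badness > victim->badness)
            victim = &tasks[i];
    return victim;
}

static void out_of_memory(struct task *current, struct task *tasks, int n) {
    if (notifier_chain_handled_oom())         /* 1. a registered callback freed memory */
        return;
    if (current->has_pending_sigkill || current->exiting) {  /* 2. current is dying anyway */
        oom_kill_process(current);
        return;
    }
    if (panic_on_oom) {                       /* 3. administrator asked for a panic */
        fprintf(stderr, "panic: out of memory\n");
        exit(1);
    }
    if (sysctl_oom_kill_allocating_task && current->has_mm
        && current->oom_score_adj != OOM_SCORE_ADJ_MIN) {     /* 4. kill the allocator */
        oom_kill_process(current);
        return;
    }
    struct task *p = select_bad_process(tasks, n);            /* 5. pick a victim */
    if (p == NULL) {                          /* 6. nothing killable: panic */
        fprintf(stderr, "panic: no killable process\n");
        exit(1);
    }
    oom_kill_process(p);
}

int main(void) {
    /* Rough stand-ins for the situation in the log: AliYunDun is allocating,
     * java has the highest badness score. */
    struct task tasks[] = {
        { "java",      false, false, true, 0, 630 },
        { "AliYunDun", false, false, true, 0,   5 },
    };
    out_of_memory(&tasks[1] /* current = the allocating task */, tasks, 2);
    return 0;
}
```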
select_bad_process works like this:
- It iterates over every process and thread with for_each_process_thread(g, p).
- For each thread, it first uses oom_scan_process_thread to check what kind of thread it is and rule out some special cases, then scores the threads that remain as candidates.
- Finally it returns the highest-scoring thread.
Note: the threads ruled out in the second step include:
- the init process
- kernel threads
- processes with no memory that could be freed
- processes that are not in the memcg that caused the OOM
- processes that are already being OOM-killed
- …
A user-space sketch of this scan follows.
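The sketch below shows the exclusion checks plus a badness calculation in the spirit of the kernel's oom_badness(), roughly "resident memory + page tables + swap, shifted by oom_score_adj as a fraction of total pages". The field names, file name and formula are approximations of my own, not the kernel code; the example numbers in main() are taken from the process table in the log above.

```c
/* select_bad_process_sketch.c -- illustrative only: field names and the scoring
 * formula only approximate the kernel's oom_badness(); they are not kernel code. */
#include <stdio.h>
#include <stdbool.h>
#include <stddef.h>

struct task {
    bool is_init;              /* pid 1 */
    bool is_kernel_thread;
    bool has_freeable_memory;  /* owns an address space whose memory a kill would free */
    bool in_oom_memcg;         /* belongs to the memcg that hit its limit */
    bool already_being_killed; /* an OOM kill is already in flight for it */
    long rss_pages;
    long pagetable_pages;
    long swap_pages;
    int  oom_score_adj;        /* -1000 .. 1000 */
};

/* The exclusion checks listed above. */
static bool oom_scan_skip(const struct task *t) {
    return t->is_init || t->is_kernel_thread || !t->has_freeable_memory
        || !t->in_oom_memcg || t->already_being_killed;
}

/* Roughly "how much memory would a kill give back", nudged by oom_score_adj
 * expressed as a fraction of total pages. */
static long badness(const struct task *t, long totalpages) {
    if (t->oom_score_adj == -1000)
        return 0;                                  /* never kill */
    long points = t->rss_pages + t->pagetable_pages + t->swap_pages;
    points += (long)t->oom_score_adj * totalpages / 1000;
    return points > 0 ? points : 1;
}

/* Walk every candidate and keep the highest-scoring one. */
static struct task *select_bad_process(struct task *tasks, int n, long totalpages) {
    struct task *chosen = NULL;
    long best = 0;
    for (int i = 0; i < n; i++) {
        if (oom_scan_skip(&tasks[i]))
            continue;
        long points = badness(&tasks[i], totalpages);
        if (points > best) {
            best = points;
            chosen = &tasks[i];
        }
    }
    return chosen;
}

int main(void) {
    /* Page counts taken from the process table in the log above. */
    struct task java = { false, false, true, true, false, 610152, 1401, 0, 0 };
    struct task vim  = { false, false, true, true, false, 251514,  517, 0, 0 };
    struct task all[] = { java, vim };
    long totalpages = 970422;   /* roughly this machine's managed memory, in pages */
    struct task *victim = select_bad_process(all, 2, totalpages);
    printf("victim rss=%ld pages, normalised score=%ld\n",
           victim->rss_pages, badness(victim, totalpages) * 1000 / totalpages);
    return 0;
}
```

Normalising the score the same way /proc/<pid>/oom_score does (points * 1000 / total pages) gives roughly 630 for the Java process, which lines up nicely with the "score 630" in the kill message above.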
How to keep a process from being killed
- Set /proc/<pid>/oom_score_adj to -1000 (a process can also do this for itself at startup; see the sketch after this list).
- Set /proc/<pid>/oom_adj to -17.
- Turn the OOM killer off entirely with `sysctl -w vm.overcommit_memory=2` plus `echo "vm.overcommit_memory=2" >> /etc/sysctl.conf`. This is not recommended for production: with overcommit disabled, allocations start failing outright and memory behaviour becomes hard to predict.
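A minimal sketch of the first option done from inside the process itself (file name mine; lowering oom_score_adj normally requires root or CAP_SYS_RESOURCE). For a JVM, the same value could instead be written to the Java process's /proc/<pid>/oom_score_adj from the startup script.

```c
/* oom_protect.c -- minimal sketch: mark the calling process as "never OOM-kill me".
 * Lowering oom_score_adj normally requires root (CAP_SYS_RESOURCE). */
#include <stdio.h>

int main(void) {
    FILE *f = fopen("/proc/self/oom_score_adj", "w");
    if (f == NULL) {
        perror("open /proc/self/oom_score_adj");
        return 1;
    }
    int ok = fprintf(f, "-1000\n") > 0;
    ok = (fclose(f) == 0) && ok;
    if (!ok) {
        perror("write /proc/self/oom_score_adj");
        return 1;
    }
    /* ... start the real service logic (or exec the real binary) here ... */
    return 0;
}
```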
Summary
- For service processes, take targeted precautions so that issues like this cannot drive the machine into an OOM situation.
- On the logging side, also cap the size of any single log file and split files that grow beyond it.
References