
An Analysis of a JVM Crash

WhatAKitty

Background

While I was on a production Linux box going through logs to track down an issue, a DingTalk alert suddenly came in saying the service had crashed.

Investigation

Since the JVM had been started without the -XX:ErrorFile option, there was no JVM crash log to examine. The server runs CentOS, so the fallback was to look for crash information in the Linux process-monitoring logs.
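For future incidents it is worth enabling the crash log up front. A minimal sketch (the log path and app.jar below are illustrative placeholders, not from the original setup):

# Write HotSpot fatal-error logs to a fixed location; %p expands to the crashing JVM's pid
java -XX:ErrorFile=/var/log/jvm/hs_err_pid%p.log -jar app.jar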

Running cat /var/log/messages produced the following record of the failing process's final moments:

May 13 13:46:19 xxxxxxxxxxxxxxxxxxxxxxx kernel: AliYunDun invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0   ①
May 13 13:46:19 xxxxxxxxxxxxxxxxxxxxxxx kernel: AliYunDun cpuset=/ mems_allowed=0
May 13 13:46:19 xxxxxxxxxxxxxxxxxxxxxxx kernel: CPU: 1 PID: 11084 Comm: AliYunDun Not tainted 3.x.x-x.x.x.x.x86_64 #1
May 13 13:46:19 xxxxxxxxxxxxxxxxxxxxxxx kernel: Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS rel-1.7.5-0-xxxxxx-20xxxxxx_164612-nilsson.home.kraxel.org xx/xx/xxxx
...
May 13 13:46:19 xxxxxxxxxxxxxxxxxxxxxxx kernel: Call Trace:
May 13 13:46:19 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff816a3db1>] dump_stack+0x19/0x1b
May 13 13:46:19 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff8169f1a6>] dump_header+0x90/0x229
May 13 13:46:19 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff810e939c>] ? ktime_get_ts64+0x4c/0xf0
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff8113d36f>] ? delayacct_end+0x8f/0xb0
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff81186394>] oom_kill_process+0x254/0x3d0
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff81185e3d>] ? oom_unkillable_task+0xcd/0x120
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff81185ee6>] ? find_lock_task_mm+0x56/0xc0
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff81186bd6>] out_of_memory+0x4b6/0x4f0
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff8169fcaa>] __alloc_pages_slowpath+0x5d6/0x724
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff8118cd85>] __alloc_pages_nodemask+0x405/0x420
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff811d1108>] alloc_pages_current+0x98/0x110
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff81182917>] __page_cache_alloc+0x97/0xb0
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff81184eb0>] filemap_fault+0x170/0x410
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffc01be156>] ext4_filemap_fault+0x36/0x50 [ext4]
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff811ad162>] __do_fault+0x52/0xe0
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff811ad60b>] do_read_fault.isra.44+0x4b/0x130
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff811b1f11>] handle_mm_fault+0x691/0xfa0
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff816a8fad>] ? __schedule+0x39d/0x8b0
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff816afff4>] __do_page_fault+0x154/0x450
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff816b03d6>] trace_do_page_fault+0x56/0x150
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff816afa6a>] do_async_page_fault+0x1a/0xd0
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff816ac578>] async_page_fault+0x28/0x30
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: Mem-Info:
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: active_anon:908086 inactive_anon:113 isolated_anon:0#012 active_file:1161 inactive_file:3251 isolated_file:27#012 unevictable:0 dirty:6 writeback:0 unstable:0#012 slab_reclaimable:14131 slab_unreclaimable:4393#012 mapped:372 shmem:188 pagetables:3329 bounce:0#012 free:22809 free_pcp:71 free_cma:0
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: Node 0 DMA free:15360kB min:276kB low:344kB high:412kB active_anon:384kB inactive_anon:0kB active_file:4kB inactive_file:4kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:24kB slab_unreclaimable:32kB kernel_stack:0kB pagetables:8kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:696 all_unreclaimable? yes
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: lowmem_reserve[]: 0 2814 3773 3773
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: Node 0 DMA32 free:53992kB min:50208kB low:62760kB high:75312kB active_anon:2728428kB inactive_anon:400kB active_file:4144kB inactive_file:15476kB unevictable:0kB isolated(anon):0kB isolated(file):108kB present:3129216kB managed:2884232kB mlocked:0kB dirty:24kB writeback:0kB mapped:1456kB shmem:648kB slab_reclaimable:40652kB slab_unreclaimable:9472kB kernel_stack:7824kB pagetables:10588kB unstable:0kB bounce:0kB free_pcp:360kB local_pcp:120kB free_cma:0kB writeback_tmp:0kB pages_scanned:19438 all_unreclaimable? yes
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: lowmem_reserve[]: 0 0 958 958
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: Node 0 Normal free:17048kB min:17096kB low:21368kB high:25644kB active_anon:904132kB inactive_anon:52kB active_file:496kB inactive_file:624kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1048576kB managed:981548kB mlocked:0kB dirty:0kB writeback:0kB mapped:28kB shmem:104kB slab_reclaimable:15848kB slab_unreclaimable:8068kB kernel_stack:2352kB pagetables:2720kB unstable:0kB bounce:0kB free_pcp:144kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:3131 all_unreclaimable? yes
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: lowmem_reserve[]: 0 0 0 0
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: Node 0 DMA: xxxx...
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: Node 0 DMA32: xxxx...
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: Node 0 Normal: xxxx...
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: 5490 total pagecache pages
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: 0 pages in swap cache
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: Swap cache stats: add 0, delete 0, find 0/0
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: Free swap = 0kB
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: Total swap = 0kB
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: 1048446 pages RAM
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: 0 pages HighMem/MovableOnly
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: 78024 pages reserved
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 335] 0 335 21498 88 42 0 0 systemd-journal
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 357] 0 357 49101 93 28 0 0 lvmetad
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 366] 0 366 11047 223 21 0 -1000 systemd-udevd
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 448] 0 448 5402 70 16 0 0 irqbalance
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 450] 81 450 6104 100 18 0 -900 dbus-daemon
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 452] 997 452 133564 2422 56 0 0 polkitd
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 454] 0 454 6110 137 18 0 0 systemd-logind
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 461] 0 461 31559 163 19 0 0 crond
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 462] 0 462 6464 52 18 0 0 atd
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 487] 0 487 27511 32 10 0 0 agetty
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 488] 0 488 27511 33 11 0 0 agetty
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 502] 38 502 11282 188 28 0 0 ntpd
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 699] 0 699 26499 255 53 0 -1000 sshd
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 700] 0 700 140609 3183 91 0 0 tuned
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 3171] 0 3171 29478 157 12 0 0 wrapper
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 3173] 0 3173 639160 17634 93 0 0 java
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [18771] 0 18771 12428 375 22 0 0 ilogtail_0.16.1
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [18772] 0 18772 35548 1153 49 0 0 ilogtail_0.16.1
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [11045] 0 11045 8187 424 21 0 0 AliYunDunUpdate
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [11063] 0 11063 34061 2169 63 0 0 AliYunDun
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [28916] 0 28916 158555 797 165 0 0 rsyslogd
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [20779] 996 20779 24617 279 48 0 0 xxx_agentd
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [20780] 996 20780 24642 362 48 0 0 xxx_agentd
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [20781] 996 20781 24642 311 48 0 0 xxx_agentd
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [20782] 996 20782 24642 323 48 0 0 xxx_agentd
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [20783] 996 20783 24642 305 48 0 0 xxx_agentd
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [20784] 996 20784 24617 287 48 0 0 xxx_agentd
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [14212] 0 14212 37927 840 24 0 0 xxxxxx-collect
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [ 6680] 0 6680 74406 5748 60 0 0 filexxxx
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [28753] 0 28753 10448 216 13 0 0 aliyun-service
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [28775] 1000 28775 1303596 610152 1401 0 0 java
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [29015] 0 29015 36554 335 73 0 0 sshd
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [29020] 1000 29020 36554 325 71 0 0 sshd
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [29021] 1000 29021 28848 113 13 0 0 bash
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: [29143] 1000 29143 288690 251514 517 0 0 vim
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: Out of memory: Kill process 28775 (java) score 630 or sacrifice child ②
May 13 13:46:20 xxxxxxxxxxxxxxxxxxxxxxx kernel: Killed process 28775 (java) total-vm:5214384kB, anon-rss:2440608kB, file-rss:0kB, shmem-rss:0kB

At ① you can see the AliYunDun log entry where the OOM killer is invoked; the kernel then goes after the Java process with the highest OOM score.

At ② you can see the OOM killer actually killing the java process with PID 28775.

Thinking it over: at around 13:46 I had just opened a 2 GB log file (the vim process near the bottom of the task dump above is holding roughly 250k pages of RSS). Very likely that was the cause: it left AliYunDun unable to get the memory it requested, which triggered the OOM killer and killed the process.
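To confirm this kind of kill after the fact, the kernel leaves a clear trail. Assuming a CentOS-style syslog layout as on this machine, something like:

# Kernel ring buffer, with human-readable timestamps
dmesg -T | grep -i 'killed process'
# Persistent syslog copy of the same messages
grep -iE 'oom-killer|killed process' /var/log/messages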

What is the OOM Killer

As the name suggests, the OOM killer is the mechanism that, when the system is out of memory, kills the process that is using the most memory while being deemed unimportant.

Why the OOM Killer exists

The problem stems from Linux's memory-allocation mechanism, which follows an optimistic allocation strategy.

That is, malloc() returning a non-NULL address does not guarantee the memory is actually available (the malloc() documentation itself describes this as a particularly nasty gotcha). Linux allows a program to request more space than the system can actually supply, a feature known as overcommit, done as a system optimization: not every program uses the memory it requests right away, and by the time it does, the system may already have reclaimed some memory elsewhere. But if the memory turns out to be unavailable at the moment it is actually touched, then to keep the system from exhausting memory the kernel triggers the OOM killer, which selectively kills one or even several processes to make room.

Linux lets you choose among overcommit policies, or even disable overcommit outright.
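The policy lives in the vm.overcommit_memory sysctl; a quick way to inspect it:

# 0 = heuristic overcommit (the default), 1 = always overcommit, 2 = strict accounting
sysctl vm.overcommit_memory
# Under policy 2, the commit limit is swap plus vm.overcommit_ratio percent of RAM
sysctl vm.overcommit_ratio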

But as long as an overcommit policy is enabled, the OOM killer comes with it. The OOM killer decides which processes to kill according to a set of user-tunable policies; under the hood it uses the /proc/<pid>/oom_score value to pick its victim. The relevant policies and their corresponding files are:

  • /proc/<pid>/oom_score_adj: added on top of the computed badness score; -1000 means the process will never be killed, +1000 makes it the most likely victim
  • /proc/<pid>/oom_adj: kept for compatibility with older kernels; reads and writes are scaled proportionally to and from oom_score_adj, so some precision can be lost
  • /proc/<pid>/oom_score: the final score used to decide which process gets killed (see the commands below)
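These files can be inspected and tuned directly from a shell; $PID below is a placeholder for the target process id:

cat /proc/$PID/oom_score                # current badness score
cat /proc/$PID/oom_score_adj            # current adjustment, -1000 .. 1000
echo -500 > /proc/$PID/oom_score_adj    # make this process a less attractive victim (needs root)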

Kernel implementation of the OOM kill

The OOM call chain:

__alloc_pages_nodemask -> __alloc_pages_slowpath -> __alloc_pages_may_oom -> out_of_memory ->
__out_of_memory -> select_bad_process -> oom_kill_process

Its logic runs as follows:

  1. First, blocking_notifier_call_chain walks the user-registered notifier chain; if one of the callbacks can handle the OOM, the OOM-killer path exits immediately.
  2. If the process that triggered the OOM (current) has a pending SIGKILL or is already exiting, current itself is selected.
  3. Check whether the kernel should panic on OOM; if it panics, none of the subsequent steps run.
  4. If sysctl_oom_kill_allocating_task is set, current->mm is non-NULL, current's oom_score_adj != OOM_SCORE_ADJ_MIN, and current may be killed, then kill current directly to free memory.
  5. Call select_bad_process to pick the best candidate process p to kill.
  6. If p == NULL, i.e. there is no process left to kill, the kernel panics outright; otherwise oom_kill_process is called to kill the selected process p.

What select_bad_process does:

  1. It iterates over every process and thread via for_each_process_thread(g, p).
  2. For each thread, it first uses oom_scan_process_thread to classify the thread, ruling out certain special cases, and then scores the remaining candidates.
  3. Finally, it returns the highest-scoring thread.

Note: the threads excluded in step 2 include (a rough userspace view of the ranking follows the list):

  • the init process
  • kernel threads
  • processes with no memory left to free
  • processes outside the memcg that triggered the OOM
  • processes that are already being OOM-killed
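For intuition, here is a rough userspace approximation of that selection, assuming a standard /proc layout; it simply ranks every live process by its current oom_score, highest (first victim) at the top:

# Print the top OOM-killer candidates: score, pid, command name
for d in /proc/[0-9]*; do
  score=$(cat "$d/oom_score" 2>/dev/null) || continue   # process may have exited
  echo "$score ${d##*/} $(cat "$d/comm" 2>/dev/null)"
done | sort -rn | head

On the machine above, the java process with PID 28775 would have topped this list with its score of 630.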

How to keep a process from being killed

  • set /proc/<pid>/oom_score_adj to -1000 (the first two options are sketched below)
  • set /proc/<pid>/oom_adj to -17
  • turn off the OOM killer altogether by disabling overcommit: sysctl -w vm.overcommit_memory=2 plus echo "vm.overcommit_memory=2" >> /etc/sysctl.conf; not recommended in production, since it can leave the system's memory behavior in a mess
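A concrete sketch of the first two options; the pgrep pattern myservice.jar is a hypothetical placeholder for however you locate the service:

# Find the service pid ('myservice.jar' is just an example match pattern)
PID=$(pgrep -f myservice.jar)
# Modern interface: -1000 means the process is never chosen (needs root)
echo -1000 > /proc/$PID/oom_score_adj
# Legacy interface on older kernels: -17 has the same effect
echo -17 > /proc/$PID/oom_adj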

Summary

  • For service processes, take targeted precautions so that an OOM situation, whatever its cause, does not take the service down.
  • For logging, cap the size of a single log file and split anything that grows past the cap.
