c c++

How to Inspect a Core Dump with gdb

Posted by blueskyson on December 9, 2020

A walkthrough using a sample program that triggers a segmentation fault.

By default, Ubuntu does not write core dump files, because the core file size limit is 0. To enable them for the current shell session, run:

$ ulimit -c unlimited

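ulimit limits how much of certain resources a user's processes may consume, including the maximum number of user processes, the number of open files, and the amount of virtual memory. ulimit -c sets the maximum size of a core file, and unlimited means no cap. To cap the size instead, replace unlimited with a number.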

Next, I wrote a test program that causes a segmentation fault when run:

$ cat myprogram.c
#include <stdio.h>
int main() {
    int *i;
    *i = 1;     // write through an uninitialized pointer (no malloc)
    return 0;
}

Compile it with the -g option to include debug information, then run it:

$ gcc -g myprogram.c -o myprogram
$ ./myprogram
Segmentation fault (core dumped)

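At this point a file named core should be left in the current directory; if so, see Method 1. If no such file appears, the system is probably not writing cores to the working directory. Newer Linux distributions often hand core dumps to systemd-coredump and manage them centrally with coredumpctl; in that case, see Method 2.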

Method 1

Now we want to find out exactly where the program went wrong, so start gdb and pass it the program name and the core file name:

$ gdb myprogram core
GNU gdb (Ubuntu 9.1-0ubuntu1) 9.1
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from myprogram...
[New LWP 3355]
Core was generated by `./myprogram'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000056139bf2f135 in main () at myprogram.c:4
4		*i = 1;
(gdb) q
$

Method 2

Install systemd-coredump, which provides coredumpctl:

$ sudo apt install systemd-coredump

Then use coredumpctl to list the core dumps recorded for myprogram (for example, with coredumpctl list myprogram):

TIME                          PID  UID  GID SIG     COREFILE EXE                                SIZE
Sun 2022-10-02 19:52:44 CST 61547 1000 1000 SIGSEGV present  /home/lin/Desktop/test/c/myprogram 18.1K

Then run coredumpctl debug myprogram to have gdb open the core dump:

$ coredumpctl debug myprogram
           PID: 61547 (myprogram)
           UID: 1000 (lin)
           GID: 1000 (lin)
        Signal: 11 (SEGV)
     Timestamp: Sun 2022-10-02 19:52:44 CST (1min 0s ago)
  Command Line: ./myprogram
    Executable: /home/lin/Desktop/test/c/myprogram
 Control Group: /user.slice/user-1000.slice/user@1000.service/app.slice/app-org.gnome.Terminal.slice/vte-spawn-f9eab066-b8f1-4ae9-8e21-14c78f25dbe0.scope
          Unit: user@1000.service
     User Unit: vte-spawn-f9eab066-b8f1-4ae9-8e21-14c78f25dbe0.scope
         Slice: user-1000.slice
     Owner UID: 1000 (lin)
       Boot ID: cbadc919c16149c8ae41b0a2ff2a94d4
    Machine ID: bd90775ca2c24d75929026d99304a7c2
      Hostname: Aspire7
       Storage: /var/lib/systemd/coredump/core.myprogram.1000.cbadc919c16149c8ae41b0a2ff2a94d4.61547.1664711564000000.zst (present)
     Disk Size: 18.1K
       Message: Process 61547 (myprogram) of user 1000 dumped core.
                
                Found module /home/lin/Desktop/test/c/myprogram with build-id: 95d32ba05adbce1c07d8f4a8c4eff5acf5a96cf8
                Found module linux-vdso.so.1 with build-id: 88536f03982667a41f2be85c4e61e7e8307e2700
                Found module ld-linux-x86-64.so.2 with build-id: 61ef896a699bb1c2e4e231642b2e1688b2f1a61e
                Found module libc.so.6 with build-id: 69389d485a9793dbe873f0ea2c93e02efaa9aa3d
                Stack trace of thread 61547:
                #0  0x000055983cee5135 n/a (/home/lin/Desktop/test/c/myprogram + 0x1135)
                #1  0x00007fc884842d90 __libc_start_call_main (libc.so.6 + 0x29d90)
                #2  0x00007fc884842e40 __libc_start_main_impl (libc.so.6 + 0x29e40)
                #3  0x000055983cee5065 n/a (/home/lin/Desktop/test/c/myprogram + 0x1065)

GNU gdb (Ubuntu 12.0.90-0ubuntu1) 12.0.90
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/lin/Desktop/test/c/myprogram...
[New LWP 61547]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `./myprogram'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000055983cee5135 in main () at myprogram.c:4
4	    *i = 1;
(gdb) q
$

Using the bt command to get a backtrace

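gdb located the fault without any trouble: line 4 of myprogram.c, inside main(). Nice! When a program is more complex and calls go many levels deep, bt (backtrace) lists the chain of function calls frame by frame, as in this example: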

$ cat myprogram.c
#include <stdio.h>
void func(){
    int *a;
    *a = 1;
}

int main() {
    func();
    return 0;
}

The fault occurs inside func(). Use bt in gdb to see the call chain:

Open the core dump with gdb or coredumpctl.

 . . .

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000055ab058f0135 in func () at myprogram.c:4
(gdb) bt
#0  0x000055ab058f0135 in func () at myprogram.c:4
#1  0x000055ab058f0150 in main () at myprogram.c:8

gdb traced the call back to where func() was invoked: line 8 of myprogram.c.