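Write an example program that triggers a segmentation fault.
By default, Ubuntu does not produce core dumps (the default core size limit is 0). To make the system generate one, run the following command: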
$ ulimit -c unlimited
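ulimit limits the resources available to a user's shell and the processes started from it, including max user processes, the maximum number of open files, and the maximum virtual memory. ulimit -c sets the maximum size of a core file, and unlimited means there is no upper bound. To cap the core size, replace unlimited with a number.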
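To see what ulimit -c actually changes, it helps to look at the soft and hard limits separately. The sketch below is my own illustration rather than part of the steps above: show_core_limits is a made-up helper name, and a POSIX-style shell such as bash or dash is assumed. The change is made inside a subshell so the calling shell keeps its own settings.

```shell
# Print the soft and hard core-size limits of the current shell.
# show_core_limits is a hypothetical helper name used only for this sketch.
show_core_limits() {
    echo "soft: $(ulimit -S -c)"
    echo "hard: $(ulimit -H -c)"
}

# Limits as they are right now.
show_core_limits

# Try a different soft limit inside a subshell; the parent shell is untouched.
# Lowering the soft limit (here to 0, meaning no core file) is always allowed.
( ulimit -S -c 0 && show_core_limits )
```

The limit only applies to the shell where it was set and to programs started from it, so a new terminal needs the command again. In bash the number is counted in 1024-byte blocks; other shells may use a different unit, so check the manual of the shell you use.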
Next, I wrote a test file that causes a segmentation fault when run:
$ cat myprogram.c
#include <stdio.h>
int main() {
    int *i;
    *i = 1; // Write through an uninitialized pointer (no malloc)
    return 0;
}
Compile with the -g debug option, then run the program:
$ gcc -g myprogram.c -o myprogram
$ ./myprogram
Segmentation fault (core dumped)
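A file named core should now be left in the current directory; see Method 1. If it is not there, your Linux release is probably newer and no longer works this way. Instead, core dumps are managed centrally and accessed through coredumpctl; see Method 2.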
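One way to tell which method applies before hunting for the file is to ask the kernel where it sends core dumps. On Linux this is controlled by /proc/sys/kernel/core_pattern: a plain name means a file is written, while a leading | means the dump is piped to a helper program. The following is a rough sketch under that assumption; core_dump_destination is a hypothetical helper name, and its optional argument exists only so the logic can be tried on a sample file.

```shell
# Report where the kernel delivers core dumps.
# core_dump_destination is a hypothetical helper name for this sketch.
# $1 (optional): file to read the pattern from, default is the real setting.
core_dump_destination() {
    pattern_file=${1:-/proc/sys/kernel/core_pattern}
    if [ ! -r "$pattern_file" ]; then
        echo "unknown: cannot read $pattern_file"
        return 0
    fi
    pattern=$(cat "$pattern_file")
    case "$pattern" in
        "|"*) echo "piped to helper: ${pattern#"|"}" ;;  # Method 2 is likely
        *)    echo "written to file: $pattern" ;;         # Method 1 is likely
    esac
}

core_dump_destination
```

If the result mentions a helper such as systemd-coredump, go to Method 2; if it shows a file name pattern, look for that file as in Method 1.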
Method 1
To find out where the program went wrong, start gdb and give it the program name and the core file name:
$ gdb myprogram core
GNU gdb (Ubuntu 9.1-0ubuntu1) 9.1
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from myprogram...
[New LWP 3355]
Core was generated by `./myprogram'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x000056139bf2f135 in main () at myprogram.c:4
4 *i = 1;
(gdb) q
$
Method 2
Install systemd-coredump:
$ sudo apt install systemd-coredump
Then use coredumpctl list to show the core dump records for myprogram:
$ coredumpctl list myprogram
TIME PID UID GID SIG COREFILE EXE SIZE
Sun 2022-10-02 19:52:44 CST 61547 1000 1000 SIGSEGV present /home/lin/Desktop/test/c/myprogram 18.1K
Then run coredumpctl debug myprogram to open the core dump in gdb:
$ coredumpctl debug myprogram
PID: 61547 (myprogram)
UID: 1000 (lin)
GID: 1000 (lin)
Signal: 11 (SEGV)
Timestamp: Sun 2022-10-02 19:52:44 CST (1min 0s ago)
Command Line: ./myprogram
Executable: /home/lin/Desktop/test/c/myprogram
Control Group: /user.slice/user-1000.slice/user@1000.service/app.slice/app-org.gnome.Terminal.slice/vte-spawn-f9eab066-b8f1-4ae9-8e21-14c78f25dbe0.scope
Unit: user@1000.service
User Unit: vte-spawn-f9eab066-b8f1-4ae9-8e21-14c78f25dbe0.scope
Slice: user-1000.slice
Owner UID: 1000 (lin)
Boot ID: cbadc919c16149c8ae41b0a2ff2a94d4
Machine ID: bd90775ca2c24d75929026d99304a7c2
Hostname: Aspire7
Storage: /var/lib/systemd/coredump/core.myprogram.1000.cbadc919c16149c8ae41b0a2ff2a94d4.61547.1664711564000000.zst (present)
Disk Size: 18.1K
Message: Process 61547 (myprogram) of user 1000 dumped core.
Found module /home/lin/Desktop/test/c/myprogram with build-id: 95d32ba05adbce1c07d8f4a8c4eff5acf5a96cf8
Found module linux-vdso.so.1 with build-id: 88536f03982667a41f2be85c4e61e7e8307e2700
Found module ld-linux-x86-64.so.2 with build-id: 61ef896a699bb1c2e4e231642b2e1688b2f1a61e
Found module libc.so.6 with build-id: 69389d485a9793dbe873f0ea2c93e02efaa9aa3d
Stack trace of thread 61547:
#0 0x000055983cee5135 n/a (/home/lin/Desktop/test/c/myprogram + 0x1135)
#1 0x00007fc884842d90 __libc_start_call_main (libc.so.6 + 0x29d90)
#2 0x00007fc884842e40 __libc_start_main_impl (libc.so.6 + 0x29e40)
#3 0x000055983cee5065 n/a (/home/lin/Desktop/test/c/myprogram + 0x1065)
GNU gdb (Ubuntu 12.0.90-0ubuntu1) 12.0.90
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/lin/Desktop/test/c/myprogram...
[New LWP 61547]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `./myprogram'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x000055983cee5135 in main () at myprogram.c:4
4 *i = 1;
(gdb) q
$
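If you would rather work with a plain core file as in Method 1, coredumpctl can also write the stored dump to a file with its dump command and the -o option. This is a minimal sketch, assuming systemd-coredump is installed; extract_latest_core is a hypothetical helper name, and the output file name is whatever you choose.

```shell
# Save the most recent core dump of a program to a regular file.
# extract_latest_core is a hypothetical helper name for this sketch.
# $1: program name to match, $2: output file for the core dump.
extract_latest_core() {
    name=$1
    out=$2
    if ! command -v coredumpctl >/dev/null 2>&1; then
        echo "coredumpctl not found; install systemd-coredump first" >&2
        return 127
    fi
    # "dump" selects the latest matching entry; -o names the output file.
    coredumpctl -o "$out" dump "$name"
}

# Example use (uncomment once a dump of myprogram has been recorded):
# extract_latest_core myprogram core && gdb myprogram core
```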
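Using the bt command for a backtrace
gdb successfully pointed me to the fault: line 4 of main() in myprogram.c. Nice!
When a program is more complex and calls go many levels deep, use bt (backtrace) to walk through the chain of function calls one frame at a time, as in the following example: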
$ cat myprogram.c
#include <stdio.h>
void func(){
    int *a;
    *a = 1;
}

int main() {
    func();
    return 0;
}
The fault occurs inside func(). Use bt in gdb to inspect it:
(Open the core dump with gdb or coredumpctl.)
. . .
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x000055ab058f0135 in func () at myprogram.c:4
(gdb) bt
#0 0x000055ab058f0135 in func () at myprogram.c:4
#1 0x000055ab058f0150 in main () at myprogram.c:8
gdb traces the fault back to the place where func() was called, which is line 8 of myprogram.c.
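The same backtrace can also be captured without an interactive session, which is handy for scripts or bug reports. The sketch below relies on gdb's -batch and -ex options; print_backtrace is a hypothetical helper name, and it assumes the core file sits on disk as in Method 1.

```shell
# Print the backtrace from a core file and exit, with no gdb prompt.
# print_backtrace is a hypothetical helper name for this sketch.
# $1: path to the program, $2: path to the core file.
print_backtrace() {
    program=$1
    core=$2
    if [ ! -f "$program" ] || [ ! -f "$core" ]; then
        echo "usage: print_backtrace PROGRAM COREFILE (both must exist)" >&2
        return 2
    fi
    # -batch exits after the commands given with -ex have run.
    gdb -batch -ex "bt" "$program" "$core"
}

# Example use (uncomment when ./myprogram and ./core are present):
# print_backtrace ./myprogram ./core
```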