1. Overview

In this tutorial, we’re going to see how to manage and configure core dumps. We’ll investigate kernel.core_pattern and then move on to using coredumpctl.

2. Introduction

A core dump is a file that the Linux kernel generates automatically when a program crashes. It contains the memory, register values, and call stack of the application at the moment of the crash.

3. Using Signals to Generate a Core Dump

In this section, we’ll learn how to terminate a program and force it to produce a core dump. For this, we’ll use the kill command, which sends a signal to a process. Some signals terminate the process and also produce a core dump.

If we look at the table in the signal(7) man page, we can see which signals terminate a program with a core dump. These are the signals whose Action is Core, as in this excerpt:

   Signal      Standard   Action   Comment
   ────────────────────────────────────────────────────────────────────────
   SIGABRT      P1990      Core    Abort signal from abort(3)
   SIGALRM      P1990      Term    Timer signal from alarm(2)

As an example, let’s start sleep in the background as a long-running program and then terminate it:

$ sleep 500 &
[1] 5464
$ kill -s SIGTRAP $(pgrep sleep)
[1]+  Trace/breakpoint trap (core dumped) sleep 500

The “core dumped” message indicates that a core dump was produced, and “Trace/breakpoint trap” identifies the signal as SIGTRAP.
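The same experiment can be scripted. The following is a minimal Python sketch, assuming a Linux system with sleep available on the PATH; the helper name crash_with_signal is ours and purely illustrative. It spawns sleep, sends it SIGTRAP, and reads the wait status to report which signal ended the process and whether the kernel flagged a core dump:

```python
import os
import signal


def crash_with_signal(sig=signal.SIGTRAP):
    """Spawn `sleep 500`, kill it with `sig`, return (signal_number, core_dumped)."""
    # posix_spawnp searches PATH for the program, much like a shell does.
    pid = os.posix_spawnp("sleep", ["sleep", "500"], os.environ)
    os.kill(pid, sig)
    # The raw wait status carries both the signal number and the core-dump flag.
    _, status = os.waitpid(pid, 0)
    if not os.WIFSIGNALED(status):
        raise RuntimeError("child was not terminated by a signal")
    return os.WTERMSIG(status), os.WCOREDUMP(status)


if __name__ == "__main__":
    signum, dumped = crash_with_signal()
    print(signal.Signals(signum).name, "core dumped" if dumped else "no core dump")
```

Whether the core-dump flag is set depends on the system’s core dump configuration, so the second value may differ between machines.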

Now that we have this framework in place, let’s see how core dumps are configured.

4. Configuring Core Dumps

There are two ways to configure a core dump. One is passing the core dump via a pipe, and the other is storing it in a file.

The main configuration parameter is kernel.core_pattern. This is applicable for both file and pipe-based core dumps. In addition to this configuration parameter, file-based dumps have a size restriction on them. We can configure this size using ulimit.

We’ll cover both configuration types in the following sections.

4.1. Redirect a Core Dump to a Pipe

Let’s see how to configure our system to produce a core dump via a pipe. First, we need an example program to extract the core dump from the pipe. After that, we’ll configure the kernel to provide the program name as an argument and core dump to our program.

Let’s write a program which will only produce a core dump if the crashing process is sleep:

#!/usr/bin/python3
# Filename: /tmp/core_dump_example.py
import shutil
import sys

# Expect sys.argv to have %e configured in kernel.core_pattern
process_filename = sys.argv[1]

if process_filename == "sleep":
    with open("/tmp/sleep_core_dump", "wb") as core_dump:
        # The kernel streams the core dump to stdin as raw bytes
        shutil.copyfileobj(sys.stdin.buffer, core_dump)

Here, we notice that the program checks its first argument and only writes the core dump if that argument is sleep. Let’s store this under /tmp/core_dump_example.py and give it executable permission.

Now, we’d like the OS to invoke our script whenever it’s producing a core dump. According to the man page for core, we can achieve this by configuring the kernel.core_pattern property with sysctl:

$ sudo sysctl -w kernel.core_pattern="|/tmp/core_dump_example.py %e"

The pipe at the beginning of the pattern indicates that the OS should pass the contents of the core dump to our script over stdin.

Notice the %e at the end. %e is a template that expands to the process name of the crashed application. There are many more templates available, described in the core man page.
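To make the two forms of kernel.core_pattern concrete, here is a small Python sketch that tells a pipe handler apart from a file template. The function names are hypothetical, and the parsing is a simplification of what the kernel does. It assumes the setting is readable at /proc/sys/kernel/core_pattern, which is the file behind the sysctl key on Linux:

```python
def classify_core_pattern(pattern):
    """Return ("pipe", program, args) or ("file", template) for a core_pattern value."""
    pattern = pattern.strip()
    if pattern.startswith("|"):
        # Everything after the pipe is a command line split on whitespace.
        program, *args = pattern[1:].split()
        return ("pipe", program, args)
    return ("file", pattern)


def read_core_pattern(path="/proc/sys/kernel/core_pattern"):
    """Read the live setting from procfs."""
    with open(path) as f:
        return f.read()
```

For the pattern we’ve just configured, classify_core_pattern would report a pipe handler with /tmp/core_dump_example.py as the program and %e as its only argument.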

Let’s try creating a core dump:

$ sleep 500 &
[1] 8828
$ kill -s SIGTRAP $(pgrep sleep)
[1]+  Trace/breakpoint trap (core dumped) sleep 500

Let’s check the signature of the file we’ve created using our Python script:

$ file /tmp/sleep_core_dump
/tmp/sleep_core_dump: ELF 64-bit LSB core file, x86-64, version 1 (SYSV), SVR4-style, from 'sleep 500', real uid: 1000, effective uid: 1000, real gid: 1000, effective gid: 1000, execfn: '/usr/bin/sleep', platform: 'x86_64'

By using file on the core dump, we can immediately see that the crashing program is /usr/bin/sleep. It also shows other information, such as the UID that started the process.
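The “ELF ... core file” part of that output comes from the ELF header at the start of the dump. As a rough illustration of a similar check, the Python sketch below inspects the first bytes of a file; is_elf_core is a hypothetical helper, and it covers only the magic number, the byte order, and the e_type field:

```python
import struct

ET_CORE = 4  # e_type value that marks an ELF core file


def is_elf_core(header):
    """Return True if the first 18 or more bytes look like an ELF core header."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        return False
    # EI_DATA (byte 5) selects the byte order: 1 = little-endian, 2 = big-endian.
    byte_order = {1: "<", 2: ">"}.get(header[5])
    if byte_order is None:
        return False
    # e_type is a 16-bit field right after the 16-byte e_ident array.
    (e_type,) = struct.unpack_from(byte_order + "H", header, 16)
    return e_type == ET_CORE
```

We could call it with the first 18 bytes of /tmp/sleep_core_dump, opened in binary mode.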

4.2. Redirect a Core Dump to a File

Now, let’s configure our system to produce a core dump file. To do this, we set kernel.core_pattern to our desired filename. Using the templates found in the core man page, we can decorate the core dump filename.

First, let’s set our core dump filename:

$ sudo sysctl -w kernel.core_pattern="/tmp/%e_core_dump.%p"

When the sleep application crashes, we would expect a file named sleep_core_dump.pid to appear under /tmp, where %e is replaced by the program name and %p by the program’s PID.
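The substitution can be imitated in a few lines of Python. This is only a sketch of the idea, not the kernel’s implementation: expand_core_pattern is a hypothetical helper, the caller supplies the values, and dropping specifiers that have no value is a choice made for this example:

```python
import re


def expand_core_pattern(template, values):
    """Expand %-specifiers in a core_pattern template using `values`.

    `values` maps a specifier letter (for example "e" or "p") to its value.
    "%%" becomes a literal "%". Specifiers missing from `values` are dropped.
    """
    def replace(match):
        letter = match.group(1)
        if letter == "%":
            return "%"
        return str(values.get(letter, ""))

    return re.sub(r"%(.)", replace, template)


print(expand_core_pattern("/tmp/%e_core_dump.%p", {"e": "sleep", "p": 1780}))
# prints /tmp/sleep_core_dump.1780
```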

Note that instead of an absolute path, we could give a filename. This would create a core dump file in the current working directory of the crashing process.

Next, we need to check any limits imposed using ulimit. Core dump files have a limit set by default. These limits set by ulimit do not affect the pipe-based core dump handlers.

The unit of the core file size limit depends on where we look at it, and it isn’t the filesystem block size. In /etc/security/limits.conf, the core item is expressed in KB, and Bash’s ulimit -c reports the limit in 1024-byte blocks (512-byte blocks in POSIX mode).

Let’s set our limit to 1280 KB (1.25 MB), as we don’t expect the examples to generate core dumps larger than that.

On many systems, the core file size limit defaults to 0, which disables file-based core dumps. To raise it, we add the following two lines to /etc/security/limits.conf:

baeldung_user hard core 1280
baeldung_user soft core 1280

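A hard limit is the ceiling enforced by the kernel, and a soft limit is the value actually in effect for a process. A user can raise a soft limit, but never above its corresponding hard limit. The new limits apply to sessions started after the change, so we’ll need to log in again; a reboot also works.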
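The soft and hard limits can also be inspected and adjusted from inside a process. Here is a minimal Python sketch using the standard resource module, which reports RLIMIT_CORE in bytes. The helper names and the example value are illustrative only, and the change affects just the calling process and its children:

```python
import resource


def core_limits():
    """Return the (soft, hard) core file size limits of this process, in bytes."""
    return resource.getrlimit(resource.RLIMIT_CORE)


def set_soft_core_limit(wanted_bytes):
    """Set the soft limit to `wanted_bytes`, capped at the hard limit."""
    _, hard = core_limits()
    if hard != resource.RLIM_INFINITY:
        # An unprivileged process cannot set the soft limit above the hard limit.
        wanted_bytes = min(wanted_bytes, hard)
    resource.setrlimit(resource.RLIMIT_CORE, (wanted_bytes, hard))
    return core_limits()


if __name__ == "__main__":
    print("before:", core_limits())
    print("after: ", set_soft_core_limit(1280 * 1024))
```

If the hard limit is 0, the soft limit stays at 0 as well, which is why the hard limit has to be raised through the system configuration first.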

Let’s check the size limit of core dump files after a reboot:

$ ulimit -c
1280

Great, that has taken effect. Let’s try to create a core dump:

$ sleep 500 &
[1] 9183
$ kill -s SIGTRAP $(pgrep sleep)
[1]+  Trace/breakpoint trap (core dumped) sleep 500
$ ls -lh /tmp/*_core_*
-rw------- 1 user user 372K Jun 26 23:31 /tmp/sleep_core_dump.1780

We’ve created a core dump file with the desired pattern.

5. Generating Core Dumps for Running Processes

Sometimes it might be useful to generate a core dump for a running process. GDB can capture a core dump of a running process from within a debugging session, and it also ships with gcore, a command-line utility that does the same without an interactive session.

Let’s try capturing a core dump using gcore:

$ sleep 500 &
[1] 3000
$ sudo gcore -o sleep 3000
0x00007f975eee630e in clock_nanosleep () from /lib/x86_64-linux-gnu/libc.so.6
warning: target file /proc/3000/cmdline contained unexpected null characters
warning: Memory read failed for corefile section, 4096 bytes at 0xffffffffff600000.
Saved corefile sleep.3000
[Inferior 1 (process 3000) detached]

We can see that a sleep process was started with a PID of 3000. Afterward, gcore attached itself to the sleep process, wrote the core dump to a file named sleep.3000, and detached. Once gcore has detached, the process continues running unaffected.

Note: gcore typically requires sudo to attach to a process, because many distributions restrict the ptrace system call. We could set kernel.yama.ptrace_scope to 0 using sysctl, which would allow gcore to attach to our own processes without sudo. However, this should be used with caution as it’s a security risk: any process could then use ptrace to examine the internals of other processes running as the same user.
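Before changing the setting, it helps to know what the current value means. The sketch below maps the values documented for the Yama security module to short descriptions. The helper names are hypothetical, and it assumes the value is exposed at /proc/sys/kernel/yama/ptrace_scope, which is only present when Yama is enabled:

```python
PTRACE_SCOPE_MEANING = {
    0: "classic: a process may attach to other processes running under the same UID",
    1: "restricted: a process may only attach to its own descendants",
    2: "admin-only: attaching requires CAP_SYS_PTRACE",
    3: "no attach: attaching is disabled and the setting cannot be changed until reboot",
}


def describe_ptrace_scope(value):
    """Translate a ptrace_scope value (int or text) into a short description."""
    return PTRACE_SCOPE_MEANING.get(int(value), "unknown")


def read_ptrace_scope(path="/proc/sys/kernel/yama/ptrace_scope"):
    """Read the live setting from procfs."""
    with open(path) as f:
        return f.read()
```

On a system with Yama enabled, describe_ptrace_scope(read_ptrace_scope()) gives a one-line summary of the current policy.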

6. Introduction to coredumpctl

In this section, we’ll introduce a utility called coredumpctl. In contrast to manually configuring a core dump, coredumpctl automatically manages core dumps. coredumpctl records the core dumps themselves and maintains a history of crashes.

In the following sections, we’ll assume that coredumpctl is already installed on the system.

6.1. Configuring coredumpctl

coredumpctl comes with a service called systemd-coredump. This service receives the core dump from the kernel, extracts metadata from it, and stores the result under /var/lib/systemd/coredump/.

We can check whether or not this service is configured by checking the kernel.core_pattern:

$ sysctl -n kernel.core_pattern
|/lib/systemd/systemd-coredump %P %u %g %s %t 9223372036854775808 %h

We’ve confirmed that kernel.core_pattern is set to use systemd-coredump. This tells the kernel to pass any information related to core dumps to systemd-coredump.

To try coredumpctl, we first need to generate a new core dump:

$ sleep 500 &
[1] 2826
$ kill -s SIGTRAP $(pgrep sleep)
[1]+  Trace/breakpoint trap (core dumped) sleep 500
$ coredumpctl
TIME                            PID   UID   GID SIG COREFILE  EXE
Sun 2020-06-28 18:52:59 BST    2826  1000  1000   5 present   /usr/bin/sleep

This is really cool. By trying out coredumpctl we can see that we have a history of crashes!
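If we want to post-process this history in a script, each row can be split into fields. The following Python sketch parses a row in the column layout shown above; it assumes that exact layout (newer systemd versions may print additional columns) and that the EXE path contains no spaces. CrashEntry and parse_coredumpctl_line are names made up for this example:

```python
from typing import NamedTuple


class CrashEntry(NamedTuple):
    time: str
    pid: int
    uid: int
    gid: int
    sig: int
    corefile: str
    exe: str


def parse_coredumpctl_line(line):
    """Parse one row of the TIME PID UID GID SIG COREFILE EXE table."""
    # Split from the right: TIME contains spaces, the other six columns do not.
    time, pid, uid, gid, sig, corefile, exe = line.strip().rsplit(None, 6)
    return CrashEntry(time.strip(), int(pid), int(uid), int(gid), int(sig), corefile, exe)
```

For the row above, the parsed entry has a pid of 2826 and a sig of 5, which is the number of SIGTRAP on Linux.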

6.2. Extracting a Core Dump File from History

To extract a core dump file for a specific crash, we can use the PID, the name of the executable, or the time of the crash. As an example, let’s save the core dump of sleep using its PID:

$ coredumpctl dump 2826 --output=core.dump
           PID: 2826 (sleep)
           UID: 1000 (user)
           ...
                Stack trace of thread 2826:
                #0  0x00007f7ec62f730e __GI___clock_nanosleep (libc.so.6 + 0xe030e)
                #1  0x00007f7ec62fceb7 __GI___nanosleep (libc.so.6 + 0xe5eb7)
           ...

In addition to the core dump file, we can see a short summary followed by a stack trace. This comes from the pre-processing of the core dump by systemd-coredump.

6.3. Running a Debug Session With coredumpctl

Let’s see how we can launch a debugging session by using the debug command:

$ coredumpctl debug 2826
...
Reading symbols from /usr/bin/sleep...
(No debugging symbols found in /usr/bin/sleep)
[New LWP 2959]
Core was generated by `sleep 500'.
Program terminated with signal SIGTRAP, Trace/breakpoint trap.
...
(gdb)

Notice how gdb was opened automatically with the core dump file loaded.

To inspect the crash, let’s type in “disassemble”. As a result, we can see the following disassembly:

Dump of assembler code for function __GI___clock_nanosleep:
   ...
   0x00007fb71b22c307 <+39>:    mov    eax,0xe6
   0x00007fb71b22c30c <+44>:    syscall
=> 0x00007fb71b22c30e <+46>:    mov    edx,eax

We can see “mov eax,0xe6” followed by a syscall instruction. Looking at the list of x86-64 syscalls, 230 (0xe6) is the clock_nanosleep syscall. The => marker sits on the instruction right after the syscall, so the process was waiting inside clock_nanosleep when the core dump was captured.

7. Conclusion

In this tutorial, we’ve explored how to configure core dumps using kernel.core_pattern. We then looked at coredumpctl, which makes it much easier to manage core dumps.