1. Introduction

Debugging is part of development. It’s usually done in an integrated development environment (IDE). However, post-release debugging is often also convenient, such as when using third-party modules.

In this tutorial, we’ll start by discussing what debugging is. Next, we consider debugger requirements and functionality. After that, we dive into the GNU Project Debugger along with some of its basic options. Finally, we include extra information, helpful during specific debugging sessions.

We tested the code in this tutorial on Debian 11 (Bullseye) with GNU Bash 5.1.4 and GNU Project Debugger 10.1. It is POSIX-compliant and should work in any such environment.

2. Debugging

What we usually mean by debugging is removing problems with code. Broadly speaking, problems or errors can be in syntax, logic, or execution. The first type is caught during parsing or compilation, while the others are runtime issues. Examples of each error type in order are missed closing brackets, infinite loops, and wrong file paths.
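To make the last two categories concrete, here is a minimal C sketch. The function names and the file path used to exercise it are hypothetical and serve only as an illustration. A syntax error, such as a missing closing bracket, would stop compilation, so it can’t appear in code that builds:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Intended behaviour: the sum of 1..n inclusive. */
int sum_to(int n) {
    int total = 0;
    for (int i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}

/* Logic error: '<' instead of '<=' silently drops the last term.
 * The compiler accepts this; only the result is wrong. */
int sum_to_buggy(int n) {
    int total = 0;
    for (int i = 1; i < n; i++) {
        total += i;
    }
    return total;
}

/* Execution error: the code is valid, but the environment may not
 * cooperate. Returns 0 on success or the errno value on failure. */
int try_open(const char *path) {
    assert(path != NULL);
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        return errno;
    }
    fclose(fp);
    return 0;
}
```

Here, sum_to_buggy compiles cleanly and only misbehaves at runtime, which is exactly the kind of problem a debugger helps to locate.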

Alternatively, debugging can also help with understanding and optimizing code paths. For example, there are loop, function call, memory, and other optimizations. This helps make execution faster and more efficient. For instance, debuggers can reveal the types of data that best fit a given program.

Importantly, both error checking and optimization can be done with many tools.

3. Debuggers

Debuggers are programs for debugging as described above. The choice of debugger can depend on the language, but most debuggers allow:

  • loading a target program under certain conditions
  • running and stopping the target based on specified rules
  • displaying stack frames and other data around the current operation
  • modifying code on the go

For fine control over all of the above, it’s best we have a full debugging symbols table.

3.1. Symbol Tables

Executable files have what’s called a symbol table. It contains addresses of the most important parts of the code. For example, in C programs, one such part is the main function – the program’s entry point.

In addition to the default system symbol table, compiled files can have a so-called debugging symbols table. Its role is to provide extra information for debuggers. The difference between these two tables is outside the scope of the current article. However, as will become clear, it’s much more convenient, if not necessary, to have both.

3.2. Breakpoints

Perhaps one of the most defining and useful functions of debuggers is the ability to execute code step by step when needed. One of the methods to achieve this is with breakpoints. Breakpoints represent places in the code where execution should stop and give up control to the debugger.

Specifically, we set the breakpoints before or during the target run. A breakpoint can relate to a file, line of code, the beginning of a function, an address, or other specific conditions.

This is the first place where a dedicated debugging symbols table comes in handy. It allows us to break the target into lines and objects of code in the original language. Otherwise, we would have to go through the code as disassembled machine instructions.
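As a rough sketch of the breakpoint kinds above, consider a small C target. The file name loop.c and the function sum_incremented are assumptions made for illustration. The commands in the trailing comment use the syntax of GDB, the debugger covered later in this tutorial:

```c
#include <assert.h>

int inc(int a) {
    return a + 1;
}

/* Calls inc() once per loop iteration, with a different argument each time. */
int sum_incremented(int n) {
    assert(n >= 0);
    int total = 0;
    for (int i = 1; i <= n; i++) {
        total += inc(i);
    }
    return total;
}

/*
 * Possible breakpoints for a debugging session, assuming this file is
 * saved as loop.c (a hypothetical name) and compiled with -g:
 *   break inc              stop at the beginning of a function
 *   break loop.c:12        stop at a file and line (the call to inc)
 *   break inc if a == 3    stop only when a condition holds
 */
```

With debugging symbols, all three forms refer to names and lines from the source. Without them, only addresses and exported symbol names would be available.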

3.3. Monitoring

Of course, after halting execution with a breakpoint, we’d probably like to see what the current state is. What this might mean is going through the chain of function calls (backtrace), as well as the current and surrounding environments (stack frames).
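To picture a chain of function calls, here is a minimal C sketch with three nested functions. The names inner, middle, and outer are hypothetical:

```c
#include <assert.h>

/* Innermost call: halting here leaves the frames of middle() and
 * outer() below it on the call stack. */
int inner(int x) {
    return x * 2;
}

/* Each call adds one stack frame holding its own copy of x. */
int middle(int x) {
    return inner(x + 1);
}

int outer(int x) {
    assert(x >= 0);
    return middle(x) + 1;
}
```

If execution halts inside inner, a backtrace would typically list inner, middle, and outer, from the innermost frame outwards, followed by whatever called outer. This assumes compilation without optimization, since an optimizing compiler may inline or merge such small functions.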

Another related powerful option with debuggers is to watch. It allows us to monitor how the program run modifies variables and state.

3.4. Modification

Finally, any changes we decide on can often be directly applied to the code. In practice, this means rewriting the source and being able to see the immediate results.

Of course, many languages come with an IDE, which includes a debugger. However, in addition, there are also external debugging programs.

4. The GNU Project Debugger

Probably the most famous third-party tool for post-release debugging is gdb (GNU Project Debugger), which is maintained by the GNU Project.

Although gdb works with many languages (12 at the time of writing), we’ll use C as a base for our examples. In particular, we’ll work with the C source target.c:

01 int inc(int a) {
02   return a+1;
03 }
04
05 int main(int argc, char** argv) {
06   for(int i=1; i < 5; i++) {
07     int a = 1;
08     a = inc(a);
09   }
10
11   return argc;
12 }

Let’s go through an example debugging session.

4.1. Compilation and Loading

Since we’re going to do post-release debugging, we should first compile our example. To that end, we’ll use gcc, the C compiler driver of the GNU Compiler Collection (GCC). To make full use of GDB, it’s best we pass the -g or -ggdb flag to gcc. Either ensures we generate a debugging symbols table suitable for GDB:

gcc -ggdb target.c -o target.o

Next, we load the target in gdb:

gdb target.o

Once the target program target.o is loaded, we can do some exploring.

4.2. Source Code

Being able to see the source code in the original language is vital while debugging. The most basic way of doing that is the list command:

(gdb) list 1,3
1       int inc(int a) {
2         return a+1;
3       }

Here we list the first three lines of code from our file. The list command allows us to specify files, lines, functions, and addresses.

Note that pressing Return on an empty line usually repeats the previous command. In some cases, the repeat behaves a little differently. For example, with list, pressing Return again discards the original arguments and shows the lines that follow the last ones printed.

Alternatively, we can use the text user interface (TUI). This GDB mode allows for:

  • mouse support
  • command bindings
  • single key shortcuts
  • arguably more convenient data display

In particular, the source code with the currently executed line and any set breakpoints are all available at a glance. To enter TUI, we just start gdb with the -tui flag or type tui enable at the (gdb) prompt. All commands we discuss are available in both TUI and normal mode.

4.3. Breakpoints

If we attempt to directly run our target in GDB, something like this would be the result:

(gdb) run
Starting program: /target.o
[Inferior 1 (process 666) exited with code 01]

Between the program starting and completing, there are no input and output or any possibility to interact with its process. This leaves us with few options apart from browsing the source code and passing arguments.

Let’s try to halt execution at the first call to inc with the break command:

(gdb) break inc
Breakpoint 1 at 0x112c: file target.c, line 2.
(gdb) run
Starting program: /target.o

Breakpoint 1, inc (a=1) at target.c:2
2         return a+1;

We just set a breakpoint. Breakpoints are places where the target should pause and give up control to the debugger. Of course, we can also delete breakpoints via delete. Without any arguments, delete removes all breakpoints. While the clear command has a similar function, it can specify files, functions, and line numbers to remove breakpoints from.

We are in control and halted on a breakpoint at the first line of inc. What now?

4.4. Information

Once execution halts, we often want to see what’s happening. The current code line is visible, but we might also want to see some context with list or directly in TUI mode.

Apart from source code, we could be interested in the call stack. To show the chain of function calls, we use backtrace:

(gdb) backtrace
#0  inc (a=1) at target.c:2
#1  0x0000555555555156 in main (argc=1, argv=0x7fffffffe5f8) at target.c:8

We see inc was called at line 8 of target.c. To show more information about a frame, we have several commands at our disposal:

(gdb) frame
#0  inc (a=1) at target.c:2
2         return a+1;
(gdb) info frame
Stack level 0, frame at 0x7fffffffe4f0:
 rip = 0x55555555512c in inc (target.c:2); saved rip = 0x555555555156
 called by frame at 0x7fffffffe510
 source language c.
 Arglist at 0x7fffffffe4e0, args: a=1
 Locals at 0x7fffffffe4e0, Previous frame's sp is 0x7fffffffe4f0
 Saved registers:
  rbp at 0x7fffffffe4e0, rip at 0x7fffffffe4e8

The frame command shows the current source line of the selected frame, as well as the function it belongs to. Furthermore, the info frame command displays verbose information about that frame. Both frame and info frame accept a frame number as their last argument.

Actually, info is a very versatile command. It comes in handy for both internal GDB values and execution information. For example, it can show us local variables via info locals, but we don’t have any at this point.

For showing particular object values and expression evaluations, we can also use print:

(gdb) print a
$1 = 1
(gdb) print a+666
$2 = 667
(gdb) print/x a+666
$3 = 0x29b

The x command works similarly but shows the contents of a memory address. Note that we can apply formats after a slash and use expressions as the arguments of both print and x.
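For comparison, the following C sketch mimics what print/x and x do, using plain library calls. The helper names format_hex and peek_byte are hypothetical, and this is only an analogy, not how GDB implements either command:

```c
#include <assert.h>
#include <stdio.h>

/* Formats a value similarly to print/x, e.g. 667 becomes "0x29b".
 * Note: "%#x" prints a plain "0" for zero, whereas GDB shows 0x0.
 * Returns the number of characters written, or -1 if buf is too small. */
int format_hex(int value, char *buf, size_t size) {
    assert(buf != NULL && size > 0);
    int n = snprintf(buf, size, "%#x", (unsigned int)value);
    return (n >= 0 && (size_t)n < size) ? n : -1;
}

/* Reads the single byte stored at an address, roughly what the
 * x command does when asked for one byte in hexadecimal (x/xb). */
unsigned char peek_byte(const void *addr) {
    assert(addr != NULL);
    return *(const unsigned char *)addr;
}
```

The distinction is the same as in GDB: the first helper formats the value of an expression, while the second interprets its argument as an address and reads the memory there.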

Let’s rewind a bit now.

4.5. Restart

Importantly, we can exit GDB with quit or just stop the current run with kill:

(gdb) kill
Kill the program being debugged? (y or n) y
[Inferior 1 (process 666) killed]

In addition, to start debugging with a temporary breakpoint in the very beginning, we use the start command:

(gdb) start
Temporary breakpoint 1 at 0x113c: file target.c, line 6.
Starting program: /target.o

Temporary breakpoint 1, main (argc=1, argv=0x7fffffffe5f8) at target.c:6
6         for(int i=1; i < 5; i++) {

Next, we might also want to configure future monitoring to keep track of and display values.

4.6. Watch and Display

If we enter TUI mode now, we see the whole source with breakpoints and the current line marked:

┌─target.c────────────────[...]
│   1           int inc(int a) {
│   2             return a+1;
│   3           }
│   4
│   5           int main(int argc, char** argv) {
│B+>6             for(int i=1; i < 5; i++) {
│   7               int a = 1;
│   8               a = inc(a);
│   9             }
│   10
│   11            return argc;
│   12          }
[...]

Since it’s a driver of the loop, let’s say variable i interests us. To monitor a particular object for changes, we use the watch command:

(gdb) watch i
Hardware watchpoint 2: i

When monitoring, any change to the object will act as a form of automatic breakpoint (or step). Alternatively, we can use rwatch or awatch to monitor for only reads of a particular object or both reads and changes.

In addition, we can show some information on each step via display:

(gdb) display i
1: i = 0
(gdb) display/x i
2: i = 0x0

The display command also supports the slash format specification and expressions as arguments.

Importantly, each display call adds a row to the list of displayed expressions. To remove one, we can use undisplay with the row number from above as an argument:

(gdb) display
1: i = 0
2: /x i = 0x0
(gdb) undisplay 2
(gdb) display
1: i = 0

Having completed the checks and configured monitoring, let’s continue our controlled execution.

4.7. Stepping

To resume after a stop, we have multiple commands at our disposal. The first one we’ll look at is continue:

(gdb) continue
Continuing.

Hardware watchpoint 2: i

Old value = 0
New value = 1
main (argc=1, argv=0x7fffffffe5f8) at target.c:6
6         for(int i=1; i < 5; i++) {
1: i = 1

In general, continue just resumes execution. Here, it proceeds until the next halt due to the watchpoint on i.

Note that our display value also shows up at the bottom. Let’s remove our watchpoints and displays to declutter:

(gdb) undisplay 1
(gdb) info watchpoints
Num     Type           Disp Enb Address            What
2       hw watchpoint  keep y                      i
        breakpoint already hit 1 time
(gdb) delete 2

In contrast to continue, step and next act as if a breakpoint is set on the next source line. The difference between them is that next skips over function calls, while step proceeds inside the called function with its stack frame:

(gdb) next
7           int a = 1;
(gdb) next
8           a = inc(a);
(gdb) next
6         for(int i=1; i < 5; i++) {
(gdb) next
7           int a = 1;
(gdb) next
8           a = inc(a);
(gdb) step
inc (a=1) at target.c:2
2         return a+1;

At the line a = inc(a), next takes us back to the for loop evaluation, while step takes us into the inc function. In other words, step enters the next stack frame, whereas next stays within the current one.

Importantly, continue, next, and step accept a number as their argument. For continue, it signifies the number of halts (breakpoints, watchpoints, etc.) to ignore and not stop at. For next and step, the number is just a repeat count – it simulates pressing return that many times.

Even from this relatively short walkthrough, it is obvious that we can easily get lost in GDB. There is a mechanism that helps with such occasions.

4.8. Checkpoints

Just like insurance, we can save the state of a debugging session at a given point. This is done via the checkpoint command, which forks the current target and suspends that fork:

(gdb) checkpoint
checkpoint 1: fork returned pid 666.
(gdb) next
3       }
(gdb) next
main (argc=1, argv=0x7fffffffe5f8) at target.c:6
6         for(int i=1; i < 5; i++) {
(gdb) checkpoint
checkpoint 2: fork returned pid 667.

In the snippet above, we initially create the first checkpoint with PID 666. After that, we take a couple of steps with next and create a second checkpoint with PID 667. Importantly, despite not sharing any data, both processes have the same address allocation.
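The fork semantics behind this can be sketched in plain C. This is only an illustration of how a forked copy behaves, assuming a POSIX system, and not GDB’s actual implementation. The function name child_write_is_isolated is hypothetical:

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks, lets the child overwrite a local variable, and checks that
 * the parent's copy is untouched. Returns 1 if the write stayed in
 * the child, 0 if not, and -1 if fork or waitpid failed. */
int child_write_is_isolated(int child_value) {
    assert(child_value >= 0 && child_value <= 255); /* must fit an exit status */
    int value = 1;

    pid_t pid = fork();
    if (pid == (pid_t)-1) {
        return -1;
    }
    if (pid == 0) {
        /* Child: same virtual address for 'value', but a private copy. */
        value = child_value;
        _exit(value);
    }

    /* Parent: collect the child's value through its exit status. */
    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status)) {
        return -1;
    }
    return WEXITSTATUS(status) == child_value && value == 1;
}
```

Since the child is a duplicate of the parent’s address space, variables sit at the same addresses in both processes, yet a write in one is invisible to the other. This is why a suspended fork can serve as a snapshot to return to.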

Now we are ready to restore to checkpoint 1:

(gdb) restart 1
Switching to process 666
#0  inc (a=1) at target.c:2
2         return a+1;

Information about the checkpoint state includes the current file, function, and line. Next, we ensure all checkpoints are still available:

(gdb) info checkpoints
  0 process 660 (main process) at 0x555555555160, file target.c, line 6
* 1 process 666 at 0x55555555512c, file target.c, line 2
  2 process 667 at 0x555555555160, file target.c, line 6

The asterisk marks the current checkpoint. Each row also shows the checkpoint’s process, address, file, and line.

After diving into the basic functions of GDB, let’s see some additional specific points, which may be of use.

5. Extras

In this section, we cover some potentially useful GDB specifics. For brevity, we use the code snippet from the previous section.

5.1. Help

With so many options, gdb can be formidable. Consequently, this mini subsection is devoted to help. The help command is a light in the vast dark forest that is gdb and debugging in general. For example, help break describes the break command. While help is not a tutorial, it’s our best ally when using the program.

This is especially important when using GDB without a debugging symbol table.

5.2. Disassembly

When starting gdb, if we had not used one of the -g flags to gcc during compilation, we would have received a warning: No debugging symbols found in target.o. Any subsequent operations that depend on the debugging symbols table would have prompted us to load one.

Furthermore, without this table, we can only debug in machine code instructions. Machine instruction debugging is universal, but usually a last-resort method. To see the Assembly equivalent of our C code, we can use the disassemble command in GDB. For simplicity, let’s apply it only to our inc function:

(gdb) disassemble inc
Dump of assembler code for function inc:
   0x0000000000001125 <+0>:     push   %rbp
   0x0000000000001126 <+1>:     mov    %rsp,%rbp
   0x0000000000001129 <+4>:     mov    %edi,-0x4(%rbp)
   0x000000000000112c <+7>:     mov    -0x4(%rbp),%eax
   0x000000000000112f <+10>:    add    $0x1,%eax
   0x0000000000001132 <+13>:    pop    %rbp
   0x0000000000001133 <+14>:    ret
End of assembler dump.

This is the code of the inc function in Assembly. If we run disassemble without arguments, GDB disassembles the function surrounding the program counter of the selected frame. When no debugging symbols are available, we would have to extract meaning out of lines like the above. To be sure, they are the machine instructions behind the C code.

To show the current instruction disassembled at each step, we can use display:

(gdb) start
[...]
6         for(int i=1; i < 5; i++) {
(gdb) display/i $pc
1: x/i $pc
=> 0x555555555143 <main+15>:    movl   $0x1,-0x4(%rbp)

Note the format specifier /i means instruction is being output, while $pc is the program counter, which stores the current instruction address.

Of course, when debugging machine instructions, we don’t have many of the comforts of higher-level languages. Among many others, this includes access to objects and variables by their original name.

However, we can still step through the code, but only by instructions. The commands for this purpose are similar to the ones we already looked at: starti, stepi, and nexti. They add the i (i.e., instruction) suffix to their source-level counterparts. Otherwise, their function is more or less equivalent.

5.3. Code Modification

Importantly, by default, code modifications can happen only by address and in binary form. GDB does not have an Assembler or any compiler built-in. This means that in order to change code, we have to modify machine instructions, rewriting directly:

(gdb) start
[...]
6         for(int i=1; i < 5; i++) {
(gdb) x/i $pc
=> 0x555555555143 <main+15>:    movl   $0x1,-0x4(%rbp)
(gdb) set *(unsigned char*)0x555555555143 = 0x90
(gdb) x/i $pc
=> 0x555555555143 <main+15>:    nop

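Here, we specify the address of the instruction and assign the value 0x90 (the opcode for the nop instruction) to its first byte. We should be very careful with such modifications, as they can easily break the code: the original movl spans several bytes, so overwriting only the first one leaves the remaining bytes to be decoded as different instructions.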
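To see why a one-byte patch is rarely enough, here is a hedged C sketch that works on a plain byte array rather than on live code. The helper names nop_out and count_leading_nops are hypothetical. The seven-byte encoding c7 45 fc 01 00 00 00 used to exercise it is an assumption about what x86-64 assemblers typically emit for movl $0x1,-0x4(%rbp), and we can check the actual bytes with x/7xb $pc:

```c
#include <assert.h>
#include <stddef.h>

#define NOP_OPCODE 0x90  /* single-byte x86 no-operation */

/* Overwrites every byte of one instruction with nop, so that no
 * leftover operand bytes get decoded as unrelated instructions. */
void nop_out(unsigned char *code, size_t len) {
    assert(code != NULL);
    for (size_t i = 0; i < len; i++) {
        code[i] = NOP_OPCODE;
    }
}

/* Counts how many bytes at the start of the buffer are nop. */
size_t count_leading_nops(const unsigned char *code, size_t len) {
    assert(code != NULL);
    size_t count = 0;
    while (count < len && code[count] == NOP_OPCODE) {
        count++;
    }
    return count;
}
```

In a GDB session, the equivalent would be one set command per byte of the instruction, each targeting the next address.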

5.4. Arguments

Many target programs will have command-line arguments. To add those, we can pass the arguments to run or use the set command:

(gdb) run
Starting program: /target.o
[Inferior 1 (process 665) exited with code 01]
(gdb) run 1 2
Starting program: /target.o 1 2
[Inferior 1 (process 666) exited with code 03]
(gdb) set args 1 2
(gdb) run
Starting program: /target.o 1 2
[Inferior 1 (process 666) exited with code 03]

Notice how the command line and exit codes change, because our sample source returns argc as its exit status, and argc counts the program name as well as the arguments. To check whether we have set any arguments, we can use show args.

5.5. Advanced Stepping

Let’s briefly discuss two additional stepping commands – until and finish. We can run through to a given line with until:

(gdb) start
[...]
6         for(int i=1; i < 5; i++) {
(gdb) until 9
main (argc=1, argv=0x7fffffffe5f8) at target.c:11
11        return argc;

Notice how we jumped through the for loop and directly to the return statement.

Similarly, we can use finish to jump, but this time outside the current function, by forcing GDB to run it to its end:

(gdb) break inc
Breakpoint 1 at 0x112c: file target.c, line 2.
(gdb) run
[...]
2         return a+1;
(gdb) finish
Run till exit from #0  inc (a=1) at target.c:2
0x000055555555515d in main (argc=1, argv=0x7fffffffe5f8) at target.c:8
8           a = inc(a);
Value returned is $1 = 2

Both until and finish are like special cases of continue.

5.6. Remote Debugging

Finally, we devote this short subsection to a very powerful GDB function: remote debugging. Remote debugging allows gdb to run on one machine, while its target runs on another, potentially on a different platform.

The way we do this is via something called a remote stub, which allows us control over a remote target. The default remote stub for GDB is gdbserver. Configuring remote debugging is outside the scope of this article, but suffice to say it’s an invaluable function.

6. Summary

In this tutorial, we explored debugging with the GNU Project Debugger. First, we showed how to compile a potential target and explore its source code. After that, we stepped through the code with gdb and displayed internal information. Next, we discussed checkpoints and some advanced GDB features.

In conclusion, we can say that GDB is a versatile tool with multiple options allowing fine control over debug operations.