1. Introduction

A process stores its runtime data in stack or heap memory. The stack is a region of bounded size that is set up when the process starts, and it holds function call frames and local variables. Heap memory, on the other hand, is dynamic: the process can request more of it from the operating system while it runs.

In an earlier article, we looked at what stack and heap memory are. In this tutorial, we’ll look at different ways to find the heap memory regions associated with a process, using a working example to build a deeper understanding.

2. Heap Memory

Heap memory fulfills our requirement for dynamic memory. There are situations where we don’t know how much memory we’re going to need to complete a particular task. For example, while receiving data over the network or reading a file, we often can’t know in advance how much memory will be needed. In these scenarios, we allocate memory dynamically from the heap.

To allocate memory dynamically, C provides a family of allocation functions, such as malloc, calloc, and realloc. When a process calls any of these functions, the memory is reserved from the heap area. When the process is done with the memory, it releases it by calling free.
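As a minimal sketch of this pattern, the function below reads a stream of unknown length into a buffer that starts small and doubles with realloc whenever it fills up. The name read_all and the initial capacity of 64 bytes are assumptions made for illustration, not part of any library:

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads an entire stream into a heap buffer that grows as needed.
 * Returns a NUL-terminated buffer the caller must free(), or NULL
 * if memory runs out. read_all is a hypothetical helper name. */
char *read_all(FILE *in, size_t *out_len) {
    size_t cap = 64, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        /* Fill the free space, keeping one byte for the terminator. */
        size_t n = fread(buf + len, 1, cap - len - 1, in);
        len += n;
        if (len < cap - 1)
            break;                      /* short read: EOF or error */

        /* The buffer is full, so double its capacity. */
        char *bigger = realloc(buf, cap * 2);
        if (bigger == NULL) {
            free(buf);
            return NULL;
        }
        buf = bigger;
        cap *= 2;
    }
    buf[len] = '\0';
    if (out_len != NULL)
        *out_len = len;
    return buf;
}
```

The caller owns the returned buffer and releases it with free once it’s done.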

3. Memory Information

Linux exposes kernel and process information through the proc file system. Process-specific information lives under a directory named after the process ID (PID), and the /proc/<PID>/maps file describes the memory mappings of that process.

Let’s try to read this file:

$ cat /proc/26769/maps 
5621ef64e000-5621ef64f000 r-xp 00000000 08:01 2903662                    /home/bluelake/heap
5621ef84e000-5621ef84f000 r--p 00000000 08:01 2903662                    /home/bluelake/heap
5621ef84f000-5621ef850000 rw-p 00001000 08:01 2903662                    /home/bluelake/heap
5621f063f000-5621f0660000 rw-p 00000000 00:00 0                          [heap]
7fc6936f2000-7fc6938d9000 r-xp 00000000 08:01 1185173                    /lib/x86_64-linux-gnu/libc-2.27.so
7fc6938d9000-7fc693ad9000 ---p 001e7000 08:01 1185173                    /lib/x86_64-linux-gnu/libc-2.27.so
7fc693ad9000-7fc693add000 r--p 001e7000 08:01 1185173                    /lib/x86_64-linux-gnu/libc-2.27.so
7fc693add000-7fc693adf000 rw-p 001eb000 08:01 1185173                    /lib/x86_64-linux-gnu/libc-2.27.so
7fc693adf000-7fc693ae3000 rw-p 00000000 00:00 0 
7fc693ae3000-7fc693b0a000 r-xp 00000000 08:01 1185145                    /lib/x86_64-linux-gnu/ld-2.27.so
7fc693cf0000-7fc693cf2000 rw-p 00000000 00:00 0 
7fc693d0a000-7fc693d0b000 r--p 00027000 08:01 1185145                    /lib/x86_64-linux-gnu/ld-2.27.so
7fc693d0b000-7fc693d0c000 rw-p 00028000 08:01 1185145                    /lib/x86_64-linux-gnu/ld-2.27.so
7fc693d0c000-7fc693d0d000 rw-p 00000000 00:00 0 
7fff17093000-7fff170b4000 rw-p 00000000 00:00 0                          [stack]
7fff17165000-7fff17168000 r--p 00000000 00:00 0                          [vvar]
7fff17168000-7fff1716a000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

We can see the data is organized in different columns:

  • start memory address-end memory address
  • permissions
    • r: read permitted
    • w: write permitted
    • x: execution permitted
    • p: private (copy-on-write) mapping; an s in this position marks a shared mapping
  • offset: the offset into the backing file, if the region is backed by a file
  • device: major and minor number of the device that holds the backing file
  • inode: the inode number of the associated file
  • path: the path to the associated file
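To make the column layout concrete, here’s one way a single row could be parsed in C with sscanf. The map_entry struct and the parse_maps_line function are hypothetical names introduced for this sketch, and the 255-character path limit is an arbitrary choice:

```c
#include <stdio.h>

/* One row of /proc/<PID>/maps. Names are hypothetical, for illustration. */
struct map_entry {
    unsigned long start, end;          /* address range */
    char perms[5];                     /* for example "rw-p" */
    unsigned long offset;              /* offset into the backing file */
    unsigned int dev_major, dev_minor; /* device of the backing file */
    unsigned long inode;               /* 0 for anonymous mappings */
    char path[256];                    /* empty when the row has no path */
};

/* Parses a single maps line. Returns 0 on success, -1 on malformed input. */
int parse_maps_line(const char *line, struct map_entry *e) {
    e->path[0] = '\0';
    int n = sscanf(line, "%lx-%lx %4s %lx %x:%x %lu %255[^\n]",
                   &e->start, &e->end, e->perms, &e->offset,
                   &e->dev_major, &e->dev_minor, &e->inode, e->path);
    /* The path column is optional, so 7 or 8 converted fields are valid. */
    return (n >= 7) ? 0 : -1;
}
```

Rows without a path leave the path field as an empty string, which makes anonymous mappings easy to tell apart from file-backed ones.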

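In the output, we can see a row with the text [heap] in the path column. This is the main heap region, from which small allocations are typically served. There are also rows with no path at all. These are anonymous mappings, which aren’t backed by any file. Large dynamic allocations show up as rows like these, although not every anonymous mapping belongs to the heap.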
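As a rough sketch, a program could locate the [heap] row by scanning the maps text line by line. The helper below is hypothetical (find_heap_region isn’t a library function) and takes an already opened stream, so it works with a live maps file or a saved copy:

```c
#include <stdio.h>
#include <string.h>

/* Scans maps-formatted text for the row whose path column is "[heap]".
 * Returns 0 and fills start/end on success, -1 if no such row exists.
 * find_heap_region is a hypothetical helper name. */
int find_heap_region(FILE *maps, unsigned long *start, unsigned long *end) {
    char line[512];
    while (fgets(line, sizeof line, maps) != NULL) {
        if (strstr(line, "[heap]") == NULL)
            continue;                   /* not the heap row */
        if (sscanf(line, "%lx-%lx", start, end) == 2)
            return 0;
    }
    return -1;
}
```

For the current process, we could pass it the stream returned by fopen("/proc/self/maps", "r"). A return value of -1 means no [heap] row was present in the text.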

4. Working Example

Next, let’s use a C program to examine the memory regions. It allocates memory from the heap. Then, using the gdb and the pmap commands, we’ll analyze where these allocations reside in memory.

Let’s have a look at our C program, heap.c:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    printf("Hit enter to allocate 1KB\n");
    getchar();
    char *p1k = malloc(1024);          /* small request */
    if (p1k == NULL) {
        perror("malloc");
        return 1;
    }
    printf("Copy hello1\n");
    strcpy(p1k, "hello1");

    printf("Hit enter to allocate 1GB\n");
    getchar();
    char *p1g = malloc(1073741824);    /* 1GB request */
    if (p1g == NULL) {
        perror("malloc");
        free(p1k);
        return 1;
    }
    printf("Copy hello2\n");
    strcpy(p1g, "hello2");
    getchar();                         /* pause so we can inspect memory */

    free(p1g);
    free(p1k);
    printf("Memory deallocated\n");
    getchar();                         /* pause again before exiting */
    return 0;
}

Looking at this code, we can see that it first allocates 1KB of memory and copies the string ‘hello1’ into it. After that, it allocates 1GB of memory and copies ‘hello2’ into it. Finally, it frees both allocations. Between these operations, it waits for user input by calling getchar(), which gives us time to inspect the changes happening in memory.

Let’s compile and run it:

$ gcc heap.c -o heap
$ ./heap
Hit enter to allocate 1KB

While the program waits for input, we can attach gdb to the running process from a second terminal:

$ gdb heap `pgrep heap`
...
Attaching to program: /home/bluelake/heap, process 26021
...
(gdb) info proc mappings
...
          Start Addr           End Addr       Size     Offset objfile
...
      0x560095aa4000     0x560095ac5000    0x21000        0x0 [heap]

After running the info proc mappings command in gdb, we can see output similar to the contents of the /proc/26021/maps file. The row marked [heap] is the main heap region, and rows without any objfile are anonymous mappings, which is where large dynamic allocations end up.

Now, let’s run the pmap command and compare the results:

$ pmap `pgrep heap`
..
0x560095aa4000    132K rw---   [ anon ]
..

Here, we get a similar output, the only difference being that the memory region is marked as anonymous instead of heap. Other than that, the start address and the size match. With this, we’ve pinpointed where the heap memory resides.
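The 132K reported by pmap is simply the end address minus the start address from the gdb output: 0x560095ac5000 - 0x560095aa4000 = 0x21000 bytes, and 135168 / 1024 = 132KB. A small helper (region_size_kb is a name made up for this sketch) captures the arithmetic:

```c
/* Size of a mapping in KB, given its start address and its
 * end address (the end address is exclusive, as in the maps file). */
unsigned long region_size_kb(unsigned long start, unsigned long end) {
    return (end - start) / 1024;
}
```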

4.1. Allocating 1KB of Memory

Next, let’s hit the enter key in the terminal again:

Hit enter to allocate 1KB

Copy hello1
Hit enter to allocate 1GB

With the above output, our program should’ve allocated 1KB of memory and copied the string hello1 into it.

Now, using gdb, let’s search for that string in the heap memory region:

(gdb) find 0x560095aa4000,0x560095ac4fff,'h','e','l','l','o','1'
0x560095aa4a80
1 pattern found.
(gdb) print (char*)0x560095aa4a80
$1 = 0x560095aa4a80 "hello1"

We’ve used the find command in gdb to locate the string in the heap address range. We’ve given the start and end addresses and the string to search for.

As we can see, it has found the pattern we searched for, and we confirm this using the print command.
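Conceptually, this kind of search walks the address range byte by byte and compares each position against the pattern. The following sketch shows the idea on ordinary process memory. The find_bytes helper is hypothetical and isn’t gdb’s implementation; it treats the end address as inclusive, matching the way we passed 0x560095ac4fff above:

```c
#include <stddef.h>
#include <string.h>

/* Returns a pointer to the first occurrence of the pat_len bytes at pat
 * inside [start, end] (end inclusive), or NULL if there is none.
 * find_bytes is a hypothetical name used for this illustration. */
const unsigned char *find_bytes(const unsigned char *start,
                                const unsigned char *end,
                                const void *pat, size_t pat_len) {
    if (pat_len == 0 || start > end)
        return NULL;

    size_t span = (size_t)(end - start) + 1;   /* bytes in the range */
    for (size_t i = 0; i + pat_len <= span; i++) {
        /* Compare the pattern against the bytes starting at offset i. */
        if (memcmp(start + i, pat, pat_len) == 0)
            return start + i;
    }
    return NULL;
}
```

A match that would run past the end of the range is not reported, so the whole pattern has to fit inside the range we pass in.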

4.2. Allocating 1GB of Memory

Earlier, from the gdb output, we saw that the heap region is 132KB in size. Let’s see what happens when we request more memory than that.

Let’s hit enter to allocate 1GB of memory and copy the text ‘hello2’ into it:

Copy hello2

Now, let’s check the output from pmap:

00007fc6536f1000 1048580K rw---   [ anon ]

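Here, we can see a newly created anonymous memory section of roughly 1GB (1048580K is 1GB plus one 4KB page). The request wasn’t served from the [heap] region. With glibc, malloc handles requests above a threshold (128KB by default) by asking the kernel for a separate anonymous mapping through mmap instead of growing the heap region.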
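The mapping is 1048580K rather than exactly 1048576K because the allocator needs a little room for its own bookkeeping next to the user data, and mappings are made in whole pages. A simplified model, assuming a 16-byte header and 4KB pages (typical for x86-64 Linux, but an assumption here and not glibc’s exact code), reproduces the number:

```c
#include <stddef.h>

/* Simplified model of how much address space a large request maps:
 * the request plus a bookkeeping header, rounded up to whole pages.
 * The header and page sizes are passed in because they are assumptions,
 * not values read from the allocator. */
size_t mapped_size(size_t request, size_t header, size_t page) {
    return (request + header + page - 1) / page * page;
}
```

With these assumptions, mapped_size(1073741824, 16, 4096) gives 0x40001000 bytes, which is 1048580K and matches the pmap output.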

Let’s verify the output from gdb:

          Start Addr           End Addr       Size     Offset objfile
      0x7fc6536f1000     0x7fc6936f2000 0x40001000        0x0 
(gdb) find 0x7fc6536f1000,0x7fc6936f1fff,'h','e','l','l','o','2'
0x7fc6536f1010
1 pattern found.
(gdb) p (char*) 0x7fc6536f1010
$2 = 0x7fc6536f1010 "hello2"

After running the gdb commands, we get a similar output, and it shows that we found the pattern in that address range.

Finally, let’s free all the memory we’ve allocated by pressing enter again. After that, the 1GB mapping no longer appears in the pmap output, and gdb shows the same: freeing the large block returns its mapping to the operating system. We press enter once more to exit the process.

With that, we’ve gone through a complete life cycle of allocating and releasing memory from the heap.

5. Conclusion

In this article, we saw how to find the heap memory regions of a process. We also examined memory allocations in detail using a sample program and compared the results from the gdb and pmap commands.