Peak Memory Usage of a Linux Process

1. Overview

Linux, by design, tries to use all of the available physical memory as efficiently as possible. At times, however, resource limits cause erratic behavior on a server, and this usually shows up as high CPU or high memory usage. We can often avoid such surprises by observing the memory utilization of a process.

In this tutorial, we’ll learn to implement a few tips and tricks using some well-known Linux commands, to help us recognize the peak memory usage of a process.

2. Traditional Commands to Monitor Memory

For the most part, commands like top, htop, and atop give us an overview of all processes. In specific cases, they may also be used to monitor a particular process. Here, we’re focusing on checking a process to identify its peak memory utilization.

We can start by looking at the top output to get an overview of what all the processes are using.

Let’s say we need to zero in on a specific process, and we know its process id (PID):

$ top -p 7
top - 10:25:53 up 19 min,  0 users,  load average: 0.52, 0.58, 0.59
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
%Cpu(s): 16.0 us,  6.2 sy,  0.0 ni, 77.1 id,  0.0 wa,  0.7 hi,  0.0 si,  0.0 st
MiB Mem :   3961.1 total,    665.1 free,   3072.0 used,    224.0 buff/cache
MiB Swap:  12288.0 total,  11806.3 free,    481.7 used.    758.5 avail Mem
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
    7 user1     20   0   18076   3632   3528 S   0.0   0.1   0:00.19 bash

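Now, let’s look at the memory usage of all bash processes. Since top is interactive by default, we run it in batch mode (-b) for a single iteration (-n 1) so that its output can be piped:

$ top -b -n 1 | grep bash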
    7 user1     20   0   18076   3632   3528 S   0.0   0.1   0:00.19 bash
   73 root      20   0   18076   3604   3540 S   0.0   0.1   0:00.17 bash
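
A single snapshot doesn’t tell us the peak, so one option is to take several batch-mode samples and keep the largest RES value. The following is a minimal sketch, not a definitive implementation: the helper name max_res is hypothetical, and it assumes top’s default column layout (PID in column 1, RES in column 6) with RES shown in KiB without a unit suffix.

```shell
# max_res PID: read `top -b` output on stdin and print the largest RES value
# seen for PID. Assumption: default column layout, RES in KiB with no suffix.
max_res() {
  awk -v pid="$1" '
    # Header lines never match because their first field is not the PID.
    $1 == pid && $6 + 0 > max { max = $6 + 0; shown = $6 }
    END { print (shown == "" ? 0 : shown) }
  '
}

# Hypothetical usage: sample PID 7 once per second, 60 times.
# top -b -d 1 -n 60 -p 7 | max_res 7
```

The result is only as good as the sampling interval: a short spike between two samples goes unnoticed.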

htop is similar to top but shows more data about each process. Its Command column makes it easy to recognize the process path.

atop is another command in the same family as top and htop. Its advantage is that it can record its output to a file.

Consider an issue that recurs within a particular time window. To keep a record, we can schedule a cron job that writes the output to a file, so we can play it back later. We’ll use atop -w if we need to record the output to a file:

$ atop -w filename

And we’ll use atop -r if we need to play back the output from that file:

$ atop -r filename
  PID      SYSCPU       USRCPU        VGROW        RGROW       RUID           EUID           ST       EXC        THR       S       CPUNR        CPU       CMD        1/1
   73       0.17s        0.03s       481.1G        3604K       root           root           N-         -          1       S           0         0%       bash
    7       0.10s        0.09s       591.0G        3632K       user1          user1          N-         -          1       S           0         0%       bash
    1       0.15s        0.00s       376.9G         316K       root           root           N-         -          2       S           0         0%       init
   71       0.04s        0.00s       716.4G        2908K       root           root           N-         -          1       S           0         0%       sudo
   72       0.03s        0.00s       820.0G        2092K       root           root           N-         -          1       S           0         0%       su
  109       0.01s        0.00s         1.0T        2136K       root           root           N-         -          1       R           0         0%       atop
    6       0.00s        0.01s       376.9G         224K       root           root           N-         -          1       S           0         0%       init

These three commands are well suited to continuous monitoring. In our case, they let us watch the memory usage of a process as it grows, so we can spot the peak before the process reaches a configured limit.

3. grep One-Liner

The /proc virtual filesystem is a directory hierarchy of files that represent the current state of the Linux kernel. It also includes information on every currently running process.

Here’s a one-liner that determines the peak memory usage of a process with the process id (PID) 113:

$ grep ^VmPeak /proc/113/status 
VmPeak: 2252 kB

We can also look for the VmHWM field (the peak resident set size, or “high water mark”) to measure RAM usage. VmPeak is the peak virtual memory size, which covers everything the process has mapped, while VmHWM is the peak amount of physical RAM it actually occupied.
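
To read both peak fields in one step, a small helper can filter the status file. This is a sketch only; the function name peak_from_status is hypothetical, and it takes the file path as an argument so it can be pointed at any /proc/<pid>/status file.

```shell
# peak_from_status FILE: print the VmPeak and VmHWM lines (name, value, unit)
# from a file laid out like /proc/<pid>/status.
peak_from_status() {
  awk '$1 == "VmPeak:" || $1 == "VmHWM:" { print $1, $2, $3 }' "$1"
}

# Example: peaks of the current shell.
# peak_from_status "/proc/$$/status"
```

Kernel threads have no Vm* fields at all, so for them the helper prints nothing.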

As we know, /proc is a virtual filesystem, so reading from its files is not the same as reading from a normal filesystem. The kernel generates the contents on every read, and there’s no on-disk copy or cache flushing involved, so the entries of a process vanish as soon as it has exited and been reaped.

With this in mind, imagine that we read a file line by line and the process exits before we reach the next line. In this case, the remaining information may have already been removed. Often that’s acceptable, since we may not need information about a process that no longer exists. Otherwise, the solution is to either handle the missing file gracefully or read the entire file into a buffer first and then parse it.
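
Both precautions can be combined in a small polling loop. The sketch below uses a hypothetical helper, watch_hwm, which captures the whole status file into a variable before parsing it and treats a failed read as “the process is gone”. It’s meant as an illustration under those assumptions rather than a robust monitor.

```shell
# watch_hwm PID [INTERVAL]: poll /proc/PID/status until the process goes away,
# then print the last VmHWM value seen (in kB), or "unknown" if none was seen.
watch_hwm() {
  pid=$1
  interval=${2:-1}
  last=""
  # Capture the whole file first; a failed read means the process is gone.
  while snapshot=$(cat "/proc/$pid/status" 2>/dev/null); do
    hwm=$(printf '%s\n' "$snapshot" | awk '$1 == "VmHWM:" { print $2 }')
    # No VmHWM line, for example a zombie that hasn't been reaped yet.
    [ -n "$hwm" ] || break
    last=$hwm
    sleep "$interval"
  done
  echo "${last:-unknown}"
}

# Hypothetical usage: watch PID 113, sampling every 0.5 seconds.
# watch_hwm 113 0.5
```

Because VmHWM is itself a high-water mark maintained by the kernel, the last value read before the process disappears is a good approximation of its peak RAM usage.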

4. GNU time

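Let’s see how GNU time helps in this context by going through a few examples.

Suppose we want to know the peak memory usage of the top process. In this case, it’s the “Maximum resident set size” that tells us so. GNU time writes its report to standard error, so we redirect it into the pipe, and we run top in batch mode for a single iteration so that it exits on its own:

$ /usr/bin/time -v top -b -n 1 2>&1 | grep "Maximum resident set size"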
Maximum resident set size (kbytes): 2252

GNU time also supports a format option, -f. Apart from memory usage (%M), the %P specifier reports the percentage of CPU the job received, an unrelated statistic that depends on the scheduler and is therefore quite variable:

$ /usr/bin/time -f "%P %M" top
2% 2248

In bash, we need to specify the full path, such as /usr/bin/time, because the bash built-in time keyword doesn’t support the -f option:

$ /usr/bin/time -f '%M' top 
2248
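
When the measured command prints output of its own, it’s convenient to keep the report separate. The sketch below relies on GNU time’s -o option, which writes the report to a file; the wrapper name max_rss_kb is hypothetical, and the path /usr/bin/time is assumed as in the examples above.

```shell
# max_rss_kb CMD [ARGS...]: run CMD under /usr/bin/time and print only its
# maximum resident set size in kilobytes. The command's own output is discarded.
max_rss_kb() {
  out=$(mktemp) || return 1
  # -f '%M' selects the max RSS; -o sends the report to a file instead of stderr.
  /usr/bin/time -f '%M' -o "$out" "$@" >/dev/null 2>&1
  # On a non-zero exit, time writes an extra status line first, so take the last line.
  tail -n 1 "$out"
  rm -f "$out"
}

# Hypothetical usage:
# max_rss_kb ls -lR /usr
```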

We can also create an alias, or set the TIME environment variable that GNU time uses as its default format, to report average and maximum memory information:

alias time="$(which time) -f '\t%E real,\t%U user,\t%S sys,\t%K amem,\t%M mmem'"
export TIME='\t%E real,\t%U user,\t%S sys,\t%K amem,\t%M mmem'

Here, %K is the average total (data+stack+text) memory use of the process, and %M is its maximum resident set size, both in kilobytes. On Linux, %K is commonly reported as 0 because the kernel doesn’t maintain the underlying counters.

4.1. Arguments and Considerations

Let’s look at some known issues with the time command.

The first issue is that time may be broken for memory reporting on some Linux systems. When we determine “Maximum resident set size” using /usr/bin/time -v ls, we may see that it returns 0. This happens because time derives most of its information from the wait3(2) system call.

Systems that don’t have a wait3(2) call that returns status information use the times(2) system call instead. However, it provides much less information than wait3(2). So, such systems report the majority of the resources as 0.

On a system where only short commands show 0, trying a more CPU-intensive command may do the trick.

The second issue is that upon calling time -v, the output “bash: -v: command not found” means bash has intercepted time and used its own built-in time keyword. Using the full path, /usr/bin/time -v, solves the problem.

Thirdly, older versions of GNU time have a bug: they report 4x the actual memory usage. time 1.7-24 on Ubuntu 14.04 and Fedora’s time package from version 1.7-3 incorporate a fix to the memory reporting.

4.2. Built-in time vs. GNU time

As we noted above, in bash, we need to specify the full path to GNU time, for example, /usr/bin/time. Alternatively, we can use command time -v (on BSD and macOS, whose time isn’t GNU time, the equivalent flag is -l).

It’s important to note that the word command here is not a placeholder; it’s a shell built-in. Running command time is different from running just time: it makes the shell skip its own time keyword and look up a binary called time in the PATH instead.

5. Valgrind One-Liner

Valgrind is a framework that provides instrumentation to user-space binaries. It ships with several tools that can be used to profile and analyze program performance.

Massif, one of the Valgrind tools, measures the heap memory used by a specific program. A simple Valgrind Massif one-liner to determine the peak memory usage of the top process looks like this:

$ valgrind --tool=massif --pages-as-heap=yes --massif-out-file=massif.out top; grep mem_heap_B massif.out | sed -e 's/mem_heap_B=\(.*\)/\1/' | sort -g | tail -n 1
==746== Massif, a heap profiler
==746== Copyright (C) 2003-2017, and GNU GPL'd, by Nicholas Nethercote
==746== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==746== Command: top
==746==
==746== error calling PR_SET_PTRACER, vgdb might block
top - 21:47:03 up 23 min,  0 users,  load average: 0.52, 0.58, 0.59
Tasks:   8 total,   1 running,   6 sleeping,   1 stopped,   0 zombie
%Cpu(s):  2.8 us,  1.9 sy,  0.0 ni, 95.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :   3961.1 total,    408.3 free,   3328.8 used,    224.0 buff/cache
MiB Swap:  12288.0 total,  12014.8 free,    273.2 used.    501.7 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
    1 root      20   0    8940    308    268 S   0.0   0.0   0:00.20 init
    6 root      20   0    8940    220    180 S   0.0   0.0   0:00.01 init
    7 user1     20   0   18080   3456   3348 S   0.0   0.1   0:00.27 bash
   72 root      20   0   18924   2900   2800 S   0.0   0.1   0:00.10 sudo
   73 root      20   0   18048   2080   2056 S   0.0   0.1   0:00.07 su
   74 root      20   0   18076   3608   3516 S   0.0   0.1   0:00.41 bash
  366 root      20   0   18444   1784   1324 T   0.0   0.0   0:00.01 top
  746 root      20   0   70528  26284   3072 R   0.0   0.6   0:01.24 massif-amd64-li

==746==
15691776

The output 15691776 is the peak memory usage of the top process, in bytes.

By default, Massif measures only heap memory. However, if we wish to measure all the memory used by a program, we can use --pages-as-heap=yes.
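
The grep, sed, sort, and tail stages of the one-liner can also be folded into a single awk pass over the output file. This is just an alternative sketch with a hypothetical function name, massif_peak; it assumes the file contains one mem_heap_B=<bytes> line per snapshot, which is what the pipeline above relies on too.

```shell
# massif_peak FILE: print the largest mem_heap_B value (in bytes) found in a
# Massif output file, or 0 if the file has no such line.
massif_peak() {
  awk -F= '
    # Compare numerically, but keep the original text so large values print exactly.
    $1 == "mem_heap_B" && $2 + 0 > max { max = $2 + 0; peak = $2 }
    END { print (peak == "" ? 0 : peak) }
  ' "$1"
}

# Usage, after the valgrind run shown above:
# massif_peak massif.out
```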

Massif outputs profiling data to a file. By default, it’s named massif.out.<pid>, while above we set the name to massif.out with --massif-out-file. The ms_print tool graphs this profiling data to show memory consumption over the execution of a program. It also shows detailed information about the sites responsible for allocation at points of peak memory allocation. We can graph the data from the massif.out file with the command:

ms_print massif.out

6. Other Contemporary Tools: Heaptrack and Busybox

Heaptrack is a KDE tool that has both GUI and text interfaces. It can show peak memory usage as flame graphs, and it’s faster than Valgrind because it does less checking.

We can also estimate peak memory by tracking the amount of PSS (proportional set size) in /proc/[pid]/smaps, or by using pmap. Heaptrack itself can be attached to an already running process:

heaptrack --pid $(pidof <your application>)

heaptrack output is written to /tmp/heaptrack.APP.PID.gz.
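
For the PSS approach mentioned above, the per-mapping Pss values in smaps have to be added up. A minimal sketch with a hypothetical helper, pss_kb, could look like this. It reports the PSS at the moment of reading, so finding a peak means calling it repeatedly and keeping the maximum.

```shell
# pss_kb FILE: sum the "Pss:" entries (in kB) of a file laid out like
# /proc/<pid>/smaps. Related fields such as Pss_Dirty are deliberately ignored.
pss_kb() {
  awk '$1 == "Pss:" { sum += $2 } END { print sum + 0 }' "$1"
}

# Hypothetical usage (reading another user's smaps typically needs privileges):
# pss_kb /proc/113/smaps
```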

Busybox combines tiny versions of many common UNIX utilities into a single small executable. It provides minimalist replacements for most of the utilities we usually find in GNU Coreutils, util-linux, and others.

Busybox is pre-installed in most Linux distros, including Debian and Ubuntu. It can stand in for commands that aren’t installed on a given system. Likewise, to know peak memory usage, we can use the busybox time implementation with the -v argument. Like GNU time, it writes its report to standard error, so we redirect it into the pipe:

$ busybox time -v uname -r 2>&1 | grep "Maximum resident set size"
Maximum resident set size (kbytes): 1792

Its output is similar to GNU time output.

7. Conclusion

In this article, we discussed commands and tools to identify peak memory usage of a process.

These commands help us observe a process’s memory utilization in real time so we can take the necessary action. Additionally, they can help us gauge the effect of efforts to decrease an application’s memory use.
