1. Introduction

A running Linux system hosts many processes at once. The Linux command-line interface (CLI) gives developers and system administrators several tools to identify and manage specific processes.

In this tutorial, we’ll look at efficient ways to identify specific Python processes running on our system using the Linux CLI.

2. Quick Overview of Linux CLI and Python

A Linux system manages many processes, each of which is an executing instance of an application or script.

Python is a widely used language for scripting and application development. Consequently, several Python scripts often run concurrently on one system, so we need a way to pinpoint and manage a specific one.

Therefore, it’s useful to identify specific Python processes through the Linux CLI, be it for troubleshooting, resource optimization, or understanding what the system is doing.

3. Using the ps Command

The ps command reports information about running processes. Combined with a filter such as grep, it can single out the Python processes.

Now, let’s see the usage of the ps command:

$ ps -ef | grep -i '[p]ython'
root        1196       1  0 Aug02 ?        00:00:01 /usr/libexec/platform-python -s /usr/sbin/firewalld --nofork --nopid
nginx      50630       1  0 Aug02 ?        00:05:25 /opt/genops-auth/.env/bin/python3 /opt/genops-auth/server/sasserv.py start
nginx      50662       1  0 Aug02 ?        00:00:58 /opt/genops-search/.env/bin/python /opt/genops-search/falcon/falcon.py start
...
... output truncated ...
...

First, ps -ef lists every process on the system (-e) in full format (-f). The pipe (|) passes the output of ps as input to the next command. The grep command then searches that input for lines matching the pattern '[p]ython', and its -i option makes the search case-insensitive. Lastly, the square brackets around the p form a one-character class: the pattern still matches the text python, but the command line of the grep process contains the literal text [p]ython, which the pattern doesn’t match, so grep never lists itself.
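As a minimal sketch of why the bracket trick works, we can feed grep two sample lines instead of live ps output. Both lines are made up for illustration: the first imitates the command line of the grep process, and the second imitates a hypothetical Python process:

```shell
# Print two sample lines and filter them with the same pattern as above.
# Both lines are illustrative, not taken from a real system.
bracket_demo() {
    printf '%s\n' \
        "grep -i [p]ython" \
        "/usr/bin/python3 /tmp/app.py" |
        grep -i '[p]ython'
}

bracket_demo
```

Only the second line is printed, because the first one contains the text [p]ython rather than python.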

3.1. Multithreading Python Processes

To see how Python applications behave in a multi-threaded environment, we can add the -L option, which makes ps print one row per thread:

$ ps -eLf | grep -i '[p]ython'
UID          PID    PPID     LWP  C NLWP STIME TTY          TIME CMD
nginx      50672       1   50672  0    1 Aug02 ?        00:02:46 /opt/genops/.env/bin/python /opt/genops/.env/bin/gunicorn -c /etc/genopsflow/genopsflow.conf.py
nginx      50733   50672   50733  0    8 Aug02 ?        00:00:16 /opt/genops/.env/bin/python /opt/genops/.env/bin/gunicorn -c /etc/genopsflow/genopsflowcache.conf.py
...
... output truncated ...
...

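Here, the PPID column shows that the process with PID 50672 is the parent of the process with PID 50733. In addition, the LWP column holds the thread ID of each row, and the NLWP column shows the number of threads in the process: one for PID 50672 and eight for PID 50733. In this way, the command helps to identify the Python processes along with their child processes and threads. The header row is shown here and in the following examples for readability; grep actually filters it out because it doesn’t contain the text python.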
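To condense the per-thread listing, a small helper can print each PID once together with its thread count. The sketch below assumes the column layout of ps -eLf shown above (PID in the second field, NLWP in the sixth); the function name threads_per_pid is our own:

```shell
# Read `ps -eLf` style lines on stdin and print "PID NLWP" once per process.
# Assumes PID is field 2 and NLWP is field 6, as in the listing above.
threads_per_pid() {
    awk '$2 ~ /^[0-9]+$/ && !seen[$2]++ { print $2, $6 }'
}

# Demonstration with two rows based on the sample output (commands shortened)
printf '%s\n' \
    "nginx 50672 1 50672 0 1 Aug02 ? 00:02:46 python gunicorn" \
    "nginx 50733 50672 50733 0 8 Aug02 ? 00:00:16 python gunicorn" |
    threads_per_pid
```

On a live system, we could run ps -eLf | grep -i '[p]ython' | threads_per_pid instead of the printf demonstration.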

3.2. Sorting Python Processes

We can also list Python processes in descending order of memory usage:

$ ps aux --sort=-%mem | grep -i '[p]ython'
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
nginx      50277  0.3 18.2 9447928 2960576 ?     Ssl  Aug02  84:13 /opt/genops/.env/bin/python /opt/genops/.env/bin/gunicorn -c /etc/genopsflow/genopsflow.conf.py
nginx      50703  0.0  2.8 9434416 460180 ?      Sl   Aug02  19:40 /opt/genops-search/.env/bin/python /opt/genops-search/falcon/falcon.py
...
... output truncated ...
...

This command is useful when we suspect that a Python process is consuming most of the memory.

Alternatively, we can sort Python processes by their CPU usage:

$ ps aux --sort=-%cpu | grep -i '[p]ython'
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
nginx      50274  0.5  0.7 2999820 120704 ?      Ssl  Aug02 111:17 /opt/genops-auth/.env/bin/python3 /opt/genops-auth/server/sasserv.py
nginx      50277  0.3 18.2 9447928 2960576 ?     Ssl  Aug02  84:13 /opt/genops/.env/bin/python /opt/genops/.env/bin/gunicorn -c /etc/genopsflow/genopsflow.conf.py
...
... output truncated ...
...

Similarly, this lets us pinpoint the Python processes that use the most CPU.

This approach narrows the ps output to Python processes and their threads, ordered by resource usage, so we can identify and manage them without wading through unrelated entries.
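Because grep drops the header row of ps, the column names are lost in the filtered output. One possible workaround, sketched below with a hypothetical helper named keep_header_and_python, is to let awk pass the first line through unconditionally and filter the rest:

```shell
# Pass the first line (the ps header) through and keep only the rows that
# mention python, case-insensitively. The brackets stop awk from matching
# its own command line, exactly as with grep.
keep_header_and_python() {
    awk 'NR == 1 || tolower($0) ~ /[p]ython/'
}

# Demonstration on made-up input; on a live system we could instead run:
#   ps aux --sort=-%mem | keep_header_and_python
printf '%s\n' \
    "USER PID %CPU %MEM COMMAND" \
    "root 1 0.0 0.1 /sbin/init" \
    "nginx 50277 0.3 18.2 /opt/app/bin/python server.py" |
    keep_header_and_python
```

The demonstration prints the header and the python row, and drops the /sbin/init row.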

4. Using the pgrep Command

The pgrep utility searches the running processes for those that match a specific pattern. Below, we reuse the pattern [p]ython from the previous section. The square brackets create a one-character class, so the pattern matches the text python. Unlike the ps and grep pipeline, pgrep never reports itself as a match, so the brackets are harmless but not required here, and a plain python pattern gives the same result.

The -a option displays the full command line of each matching process, while the -f option tells pgrep to match the pattern against the full command line rather than only the process name:

$ pgrep -af '[p]ython'
1196    /usr/libexec/platform-python -s /usr/sbin/firewalld --nofork --nopid
50630    /opt/genops-auth/.env/bin/python3 /opt/genops-auth/server/sasserv.py start
50662    /opt/genops-search/.env/bin/python /opt/genops-search/falcon/falcon.py start
...
... output truncated ...
...

Here, each line shows the process ID (PID) and the command line of a process that matches the pattern. In this way, pgrep finds and displays the running processes that contain python in their command lines, and it always excludes the pgrep process itself.
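pgrep also combines well with ps when we want more columns than pgrep itself prints. The following sketch assumes the procps versions of both tools; the function name python_details and the chosen columns are illustrative:

```shell
# python_details [PATTERN]: show selected ps columns for every process whose
# full command line matches PATTERN (defaults to "python").
python_details() {
    local pattern=${1:-python} pids
    # -d, joins the PIDs with commas, the list format that ps -p accepts
    pids=$(pgrep -d, -f "$pattern") || {
        echo "no process matches '$pattern'" >&2
        return 1
    }
    ps -o pid,ppid,%cpu,%mem,args -p "$pids"
}
```

Calling python_details without arguments lists every process with python in its command line, while python_details gunicorn would narrow the listing to command lines that contain gunicorn.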

5. Searching in the /proc Path

The /proc path is a special directory in Linux and some other Unix-like operating systems that provides an interface to kernel data structures and information about running processes. It’s a virtual filesystem that lets users and programs read details about the system’s operation, hardware, and more. In particular, it contains numbered directories named after the PIDs of running processes. Inside each PID directory, files such as “cmdline” store the command-line arguments that launched the process:

$ ls /proc/57062
status        coredump_filter    cmdline        maps        mountstats    stack        syscall
clear_refs    cwd        fdinfo        mem        environ        limits        sched
...
... output truncated ...
... 

Generally, looking up a process under /proc gives direct access to comprehensive details about it, such as its command-line parameters and memory usage. This helps in monitoring, troubleshooting, and understanding the behavior of the process.

Besides, the kernel generates the contents of /proc on every read, so the values always reflect the current state of the system, which helps in diagnosing issues quickly.
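For instance, the status file seen in the listing above holds human-readable fields about a process. The sketch below picks out a few of them; the helper name proc_summary is ours, and the field names are the ones the Linux kernel writes to the status file:

```shell
# proc_summary PID: print the name, state, resident memory, and thread count
# of a process from /proc/PID/status. Kernel threads have no VmRSS line.
proc_summary() {
    grep -E '^(Name|State|VmRSS|Threads):' "/proc/$1/status"
}

proc_summary "$$"   # inspect the current shell as a harmless demonstration
```

Passing the PID of a Python process instead of $$ shows the same fields for that process.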

Now, let’s check how to extract the process details from the /proc directory:

$ cat get_python_process_info.sh
#!/bin/bash
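# Collect the PID of every process whose full command line contains "python"
mapfile -t pids < <(pgrep -f python)
for pid in "${pids[@]}"; do
    # Skip this script itself (its file name contains "python") and processes that have exited
    [[ $pid -eq $$ || ! -r /proc/$pid/cmdline ]] && continue
    printf '%d: ' "$pid"
    # cmdline separates arguments with NUL bytes; convert them to spaces
    tr '\0' ' ' < "/proc/$pid/cmdline"
    echo
done

Next, we can search through the Python tasks and find which Python program each task is running:

$ bash get_python_process_info.sh
386257: /opt/cyops-auth/.env/bin/python3 /opt/cyops-auth/server/csdassrv.py start
...
... output truncated ...
...

Now, let’s understand each line of our script:

  • mapfile -t pids < <(pgrep -f python) uses pgrep to find Python tasks and stores their PIDs in an array called pids. Since pgrep prints one PID per line, mapfile captures all of them, whereas a single read would keep only the first line.
  • for pid in "${pids[@]}"; do starts a loop, going through each PID.
  • [[ $pid -eq $$ || ! -r /proc/$pid/cmdline ]] && continue skips the PID of the script itself ($$), which matches because the script’s file name contains python, as well as any process that has exited in the meantime.
  • printf '%d: ' "$pid" prints the PID.
  • tr '\0' ' ' < "/proc/$pid/cmdline" reads the “cmdline” file of each task and converts the NUL bytes that separate the arguments into spaces.
  • Lastly, echo adds a line break after each command line.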
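
Replacing every NUL byte with a space makes it impossible to tell where one argument ends if an argument itself contains spaces. As a variation, the sketch below prints one argument per line instead. The helper name cmdline_args is hypothetical, and it assumes that no argument contains a newline:

```shell
# cmdline_args PID: print each command-line argument of the process on its own
# numbered line. Assumes no argument contains a newline character.
cmdline_args() {
    tr '\0' '\n' < "/proc/$1/cmdline" |
        awk '{ printf "argv[%d] = %s\n", NR - 1, $0 }'
}

cmdline_args "$$"   # inspect the current shell as a harmless demonstration
```

For a PID found with pgrep, the lines after argv[0] show exactly which script and options the Python interpreter received.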

6. Conclusion

In this article, we first used ps together with grep to locate Python processes, inspect their threads, and sort them by memory and CPU usage. Then, we saw how pgrep finds processes by pattern without ever matching itself.

Lastly, we learned how the /proc path exposes real-time details of each process, such as its command line. Together, these tools make it straightforward to identify and manage Python processes from the Linux CLI.