1. Overview

Linux is a general-purpose operating system that serves a wide variety of use cases, from daily desktop use, where responsiveness and user-friendly interfaces are paramount, to powerful servers that crunch vast amounts of data. These different use cases require a trade-off between the latency and throughput of the system.

As there’s no one-size-fits-all configuration, the kernel developers make it possible to configure the kernel to alter its latency and throughput characteristics.

In this tutorial, we’ll learn about several kernel configurations for tuning the latency and throughput of the operating system.

2. Latency and Throughput

The latency metric measures the time between the initiation of an action and its response. Throughput, on the other hand, measures the amount of work completed in a given time frame. Together, the latency and throughput metrics characterize the performance of a system. However, these two metrics are often at odds with each other, presenting a fundamental trade-off when designing or tuning a system.

When we reduce the latency, the throughput of the system suffers. This is because reducing latency requires more frequent context switching to make the system feel responsive, and with more context switching, the kernel spends more CPU cycles on scheduling rather than on actual processing. Similarly, when we want to improve throughput, we have to sacrifice latency by reducing the frequency of task scheduling. This increases the number of CPU cycles spent processing the task at hand rather than on scheduling and context switching.
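A quick back-of-envelope calculation illustrates the trade-off. The 5-microsecond cost per context switch below is a hypothetical figure chosen purely for illustration; real costs vary with hardware and workload:

```shell
# Sketch: time spent on context switching per second at two scheduling rates.
# switch_cost_us is a hypothetical per-switch cost, for illustration only.
switch_cost_us=5

for rate in 250 1000; do
  overhead_us=$((rate * switch_cost_us))  # microseconds lost to switching each second
  echo "$rate switches/s -> ~$overhead_us us of overhead per second"
done
```

Quadrupling the switching rate quadruples the overhead, which is exactly the throughput cost we pay for a more responsive system.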

Optimizing the Linux kernel for low latency or high throughput depends on the specific workload we want the system to run. For example, low latency is crucial for audio processing and financial trading, where the system must react quickly to signals. Conversely, tasks that fully utilize the CPU, such as video encoding, benefit from high throughput configurations. In these cases, minimizing context switching allows the CPU to focus more on the intensive work of encoding the video.

3. Low Latency Configuration in Linux Kernel

Several kernel configurations change the latency characteristics of the system. For example, the Preemption Model, the Timer Frequency, and the threaded IRQ flag all alter the kernel's behavior and, with it, its latency performance.

Notably, these configurations are baked into the kernel during compilation. Therefore, we need to recompile the Linux kernel to change the configuration.
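Before recompiling, we can check how the running kernel was built. A minimal sketch, assuming the distribution ships the build configuration in /boot or exposes it via /proc/config.gz (CONFIG_IKCONFIG_PROC); not every distribution provides both:

```shell
# Sketch: inspect the running kernel's build-time configuration.
config="/boot/config-$(uname -r)"

if [ -r "$config" ]; then
  grep -E '^CONFIG_(PREEMPT|HZ)' "$config"
elif [ -r /proc/config.gz ]; then
  zcat /proc/config.gz | grep -E '^CONFIG_(PREEMPT|HZ)'
else
  echo "kernel configuration not available on this system"
fi
```

The CONFIG_PREEMPT and CONFIG_HZ symbols printed here correspond to the Preemption Model and Timer Frequency options discussed in the following subsections.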

3.1. Preemption Config

In the Linux kernel, the Preemption Model configuration controls the preemption mode. By default, the kernel uses the traditional preemption mode, PREEMPT_NONE. In this mode, kernel code proceeds without interruption until it returns to user space, where preemption becomes permissible. Consequently, the lack of preemption points in the kernel code means that longer delays are possible with this mode.

Next, the PREEMPT_VOLUNTARY mode introduces additional preemption points into the kernel code. This gives the kernel code more opportunities to relinquish the CPU for rescheduling than PREEMPT_NONE does.

Finally, the third mode, PREEMPT, improves upon PREEMPT_VOLUNTARY by making most of the kernel code preemptible, except for the most critical sections.

In summary, PREEMPT_NONE offers the poorest latency performance, followed by PREEMPT_VOLUNTARY. The PREEMPT mode provides the best latency performance among the three modes.
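For example, selecting the fully preemptible model through make menuconfig results in Kconfig symbols like the following in the kernel's .config file. This is an excerpt, not a complete configuration; the symbol names are the upstream Kconfig names:

```
# Excerpt of a .config selecting the low-latency preemption model
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
CONFIG_PREEMPT=y
```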

3.2. Timer Frequency

In Linux, the timer consistently fires interrupt requests to the processor at a fixed interval. Then, the timer interrupt handler performs various housekeeping activities on the system, including task scheduling. Therefore, the timer interrupt frequency directly affects how frequently the kernel performs context switching to reschedule the tasks.

The kernel configuration Timer Frequency controls the frequency of the timer interrupt. By default, the kernel uses the value HZ_250, which stands for 250 Hz. This means the timer interrupt fires 250 times, spread evenly over each second.

To optimize the latency at the cost of throughput, we can change this configuration value to HZ_1000. With this change, the kernel runs the timer interrupt handling code once every 1/1000th of a second, four times as frequently as the default. As a result, the kernel performs context switching up to four times as often in a given period.
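The effect on the tick interval is easy to compute from the two frequency values discussed above:

```shell
# Sketch: interval between timer interrupts at the default (HZ_250)
# and low-latency (HZ_1000) settings, in microseconds.
for hz in 250 1000; do
  echo "HZ=$hz -> tick every $((1000000 / hz)) us"
done
```

At 250 Hz a tick arrives every 4000 µs; at 1000 Hz, every 1000 µs, so a newly runnable task waits at most a quarter as long for the next scheduling opportunity.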

3.3. Threaded IRQ

In Linux, an interrupt request (IRQ) is a signal that hardware sends to the processor, requesting immediate attention. Subsequently, the processor invokes an interrupt handler function to manage the request. Typically, interrupt handling is divided into two parts: the top half and the bottom half.

The top half, often referred to as the hard IRQ, executes immediately. Additionally, the hard IRQ has to complete within a strictly defined duration. Due to this time constraint, the processing is generally minimal: only the most critical handling is done in the top half. Consequently, the interrupt handler defers non-critical processing by scheduling it for later execution. This non-critical processing forms the bottom half of the IRQ handling.

Generally, the bottom half of interrupt handling uses the softirq (software interrupt request) mechanism. The difference between softirqs and hard IRQs is that the kernel schedules softirqs for work that can be deferred. Importantly, softirqs execute the handling code in an atomic context instead of a process context, which means the execution cannot be preempted. When execution cannot be preempted, latency suffers.
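On a running system, we can observe softirq activity per CPU through the /proc/softirqs file. A minimal sketch; the file exists on mainline Linux kernels, and the guard is only for environments where it's unavailable:

```shell
# Sketch: show the first few per-CPU softirq counters
# (rows such as TIMER, NET_RX, and TASKLET).
if [ -r /proc/softirqs ]; then
  head -n 5 /proc/softirqs
else
  echo "/proc/softirqs not available on this system"
fi
```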

To further improve the system latency, we can set the IRQ_FORCED_THREADING kernel option. This configuration forces the use of a threaded mechanism for the bottom half of all IRQ handling, except for handlers marked with IRQF_NO_THREAD. With the threaded approach, the kernel executes the handling in a process context, which makes the bottom half of the IRQ handling preemptible.
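When IRQ threading is in effect, each threaded handler appears as a kernel thread named after the irq/&lt;number&gt;-&lt;device&gt; pattern. A minimal sketch to look for them; on a kernel running without threaded handlers, the list may simply be empty:

```shell
# Sketch: list kernel threads created for threaded IRQ handlers.
# Their names follow the irq/<number>-<device> pattern, e.g. irq/24-eth0.
threads=$(ps -e -o comm= 2>/dev/null | grep '^irq/' || true)

if [ -n "$threads" ]; then
  echo "$threads"
else
  echo "no threaded IRQ handler threads found (or ps unavailable)"
fi
```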

4. Conclusion

In this tutorial, we’ve learned that we can configure the Linux kernel to trade off latency against throughput. We started by expanding on the latency and throughput metrics in the context of the Linux kernel. Additionally, we’ve seen that they’re at odds with each other, and that the trade-off we make depends on the use case.

Subsequently, we’ve looked at the different kernel configurations that can alter the latency and throughput characteristics of a system. Firstly, we’ve seen that the PREEMPT mode offers the best latency characteristics. Then, the HZ_1000 configuration increases the task scheduling rate. Finally, the IRQ_FORCED_THREADING flag ensures that eligible bottom-half handling of an IRQ is preemptible.