1. Overview
As a multitasking operating system, Linux shares its resources between processes. One of these resources is CPU time. Usually, user processes run under time-sharing scheduling, while some kernel tasks use real-time scheduling. However, we can change the scheduling policies to meet our needs.
In this tutorial, we’ll learn how to manage these policies. Then, we’ll walk through some scheduling examples.
2. The Basics of Scheduling
First, let’s emphasize that the main goal of real-time scheduling is to run tasks in a predictable manner, so that we can meet the requirements of real-time systems. Accordingly, the real-time policies rely on a static priority, which takes a value from 1 through 99. A runnable thread of higher priority always takes precedence over one of lower priority. On the other hand, all time-sharing threads have the same priority of zero. So, every runnable real-time task obtains the CPU before any time-sharing thread.
Basically, we can work with three policies:
- SCHED_FIFO – First In First Out real-time policy – threads of the same priority are queued in order of arrival. The first thread in the queue obtains the CPU and keeps it until it blocks, yields, or is preempted by a thread of higher priority
- SCHED_RR – simple round-robin real-time scheduling, which extends the FIFO scheme. Threads of the same priority receive the CPU in turn, each for a limited time slice
- SCHED_OTHER – time-sharing scheduling, implemented as the Completely Fair Scheduler (CFS)
It’s worth noting that Linux scheduling is preemptive, which means that the kernel can take the CPU away from a thread and give it back when that thread’s turn comes.
Finally, as we’ll be using top, let’s mention that it maps positive real-time priorities to negative values. So, the higher the process priority, the lower the value in the PR column shown by top. For example, priority 2 appears as -3, while the highest priority, 99, is displayed as rt.
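The mapping used by top can be sketched as a small shell function. This is only an illustration: the name top_pr is hypothetical, and the formula (PR equals -1 minus the priority, with 99 printed as rt) is an assumption inferred from the top outputs shown later in this tutorial.

```shell
# top_pr PRIORITY: print the PR value that top is expected to show for a
# real-time thread with the given priority (1-99). Hypothetical helper.
top_pr() {
    prio=$1
    if [ "$prio" -lt 1 ] || [ "$prio" -gt 99 ]; then
        echo "priority must be in the 1-99 range" >&2
        return 1
    fi
    if [ "$prio" -eq 99 ]; then
        echo rt                 # top prints "rt" instead of -100
    else
        echo $(( -1 - prio ))
    fi
}

top_pr 2     # prints -3
top_pr 99    # prints rt
```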
3. The chrt Command
With the chrt command, we can examine or set a process’s scheduling attributes. In addition, we can start a new process with a given priority and scheduling policy.
First, let’s see the command’s syntax:
chrt [options] -p [priority] PID
chrt [options] priority command argument ...
So, the first variant allows us to manipulate properties of the running process by means of its PID, while the second enables running a command.
Next, let’s list the scheduling policies and their corresponding priority ranges with the -m option:
$ chrt -m
SCHED_OTHER min/max priority : 0/0
SCHED_FIFO min/max priority : 1/99
SCHED_RR min/max priority : 1/99
# ...
To select the SCHED_OTHER, SCHED_FIFO, or SCHED_RR policy, we should use the -o, -f, or -r option, respectively.
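Before calling chrt, a script might want to check that a priority is legal for the chosen policy. The following sketch hard-codes the ranges from the chrt -m output above. The function name check_priority is hypothetical, and the ranges are assumed to match the system in use.

```shell
# check_priority FLAG PRIORITY: succeed if PRIORITY is legal for the policy
# selected by the chrt option FLAG (-o, -f or -r). Hypothetical helper that
# hard-codes the ranges printed by `chrt -m`.
check_priority() {
    case $1 in
        -o)    min=0; max=0 ;;      # SCHED_OTHER
        -f|-r) min=1; max=99 ;;     # SCHED_FIFO, SCHED_RR
        *)     echo "unknown policy option: $1" >&2; return 2 ;;
    esac
    [ "$2" -ge "$min" ] && [ "$2" -le "$max" ]
}

if check_priority -r 2; then echo "ok: -r 2"; fi
if ! check_priority -o 5; then echo "rejected: -o 5"; fi
```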
3.1. Changing Properties of a Running Process
For our first example, let’s start a stress-ng process for two minutes:
$ stress-ng --cpu 1 --timeout 120s
Then, let’s find the PID of the running stressor:
$ ps -eo command,stat,pid | grep ^stress-ng-cpu
stress-ng-cpu [run] R+ 7575
Now, we’ll use the -r option to change the process scheduling policy to round-robin and set its priority to 2:
$ sudo chrt -r -p 2 7575
Finally, let’s confirm the change:
$ chrt -p 7575
pid 7575's current scheduling policy: SCHED_RR
pid 7575's current scheduling priority: 2
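The same information is also exposed in /proc/PID/stat, where proc(5) documents field 40 as the real-time priority and field 41 as the policy number. As a rough sketch, a script could decode one line of that file as follows. The function name sched_from_stat is hypothetical, and only the three policies discussed here are given names.

```shell
# sched_from_stat LINE: print the policy name and real-time priority found
# in one line of /proc/<pid>/stat. Hypothetical helper.
sched_from_stat() {
    line=$1
    # The command name (field 2) may contain spaces, so drop everything
    # up to the last ") " and count the remaining fields from field 3.
    rest=${line##*") "}
    set -- $rest
    [ "$#" -ge 39 ] || { echo "unexpected stat format" >&2; return 1; }
    shift 37                # field 40 becomes $1, field 41 becomes $2
    rt_priority=$1
    case $2 in
        0) policy=SCHED_OTHER ;;
        1) policy=SCHED_FIFO ;;
        2) policy=SCHED_RR ;;
        *) policy="policy($2)" ;;
    esac
    echo "$policy $rt_priority"
}

# Decode the current shell, if /proc is available.
if [ -r "/proc/$$/stat" ]; then
    sched_from_stat "$(cat "/proc/$$/stat")"
fi
```

For the stressor from the example above, we would expect the output SCHED_RR 2 after the chrt call.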
3.2. Running a Command
Next, let’s start a process with the FIFO policy, indicated by the -f option, and the highest priority, 99:
$ sudo chrt -f 99 stress-ng --cpu 1 --timeout 90s
Now, assuming that the PID of the stressor is 5278, let’s display the top data for this process:
$ top -p 5278
# ...
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
5278 root rt 0 45780 5900 3528 R 95,0 0,0 0:42.33 stress-+
So, rt in the PR column tells us that the process is real-time scheduled with the highest priority.
4. Simulating a Real-Time Machine
Now let’s simulate a one-processor machine to explore various aspects of real-time scheduling. To do so, we’re going to create a one-processor CPU set with the help of the cset command:
$ sudo cset shield --sysset=ts-set --userset=rt-set --cpu=3 --kthread=on
So now we have the system set called ts-set for time-sharing tasks and the user set named rt-set for real-time ones. The --cpu=3 option assigns CPU number 3 to the user set. In addition, we set the kthread option to move as many kernel threads as possible out of the newly created rt-set.
Subsequently, let’s move the current Bash shell process into the shield. In this way, all threads started from the corresponding terminal will stay in the user set:
$ sudo cset shield --sysset=ts-set --userset=rt-set --shield --pid=$$
cset: --> shielding following pidspec: 4508
cset: done
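As an extra check, we can look at the CPUs that the shell is allowed to run on. The sketch below reads the Cpus_allowed_list entry of a /proc/PID/status file. The function name allowed_cpus is hypothetical, and the expected value assumes the shield on CPU 3 created above.

```shell
# allowed_cpus STATUS_FILE: print the Cpus_allowed_list value found in a
# /proc/<pid>/status file. Hypothetical helper.
allowed_cpus() {
    awk -F':[ \t]*' '$1 == "Cpus_allowed_list" { print $2 }' "$1"
}

# For a shell that sits inside the shield, this should print 3.
if [ -r "/proc/$$/status" ]; then
    allowed_cpus "/proc/$$/status"
fi
```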
In the next step, let’s change the policy to round-robin and set the priority to 2. Then, any process which runs in this terminal will inherit these attributes:
$ sudo chrt -r -p 2 $$
Finally, let’s check the shielded processes:
$ cset shield --sysset=ts-set --userset=rt-set --shield --verbose
cset: "rt-set" cpuset of CPUSPEC(3) with 1 task running
USER PID PPID SPPr TASK NAME
-------- ----- ----- ---- ---------
joe 4508 2610 Sr_2 bash
cset: done
So, we can learn from the SPPr column that the Bash process is sleeping now (S), its policy is round-robin (r), and its priority equals 2.
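To make the SPPr column easier to read in scripts, we could decode it with a helper like the one below. It's a sketch under the assumption that the column always looks like the values in this tutorial (Sr_2, Rf_3, Sf99, Roth). The function name decode_sppr is hypothetical, and other cset versions may format the field differently.

```shell
# decode_sppr VALUE: split an SPPr value such as Sr_2 into the process
# state, the scheduling policy, and the priority. Hypothetical helper.
decode_sppr() {
    field=$1
    state=${field%"${field#?}"}     # first character, e.g. S or R
    rest=${field#?}                 # remaining characters, e.g. r_2
    case $rest in
        oth) policy=SCHED_OTHER; prio=0 ;;
        f*)  policy=SCHED_FIFO;  prio=${rest#f} ;;
        r*)  policy=SCHED_RR;    prio=${rest#r} ;;
        *)   echo "unrecognized SPPr value: $field" >&2; return 1 ;;
    esac
    prio=${prio#_}                  # drop the padding underscore
    echo "state=$state policy=$policy priority=$prio"
}

decode_sppr Sr_2    # prints state=S policy=SCHED_RR priority=2
```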
5. Running Tasks With Round-Robin Scheduling
As we have the real-time machine simulator ready in the terminal, let’s start the stress-ng hogs from inside it:
$ stress-ng --cpu 2 --timeout 90s
Then, let’s see the processes in the real-time set rt-set:
$ cset shield --sysset=ts-set --userset=rt-set --shield --verbose
cset: "rt-set" cpuset of CPUSPEC(3) with 4 tasks running
USER PID PPID SPPr TASK NAME
-------- ----- ----- ---- ---------
joe 4508 2610 Sr_2 bash
joe 9355 4508 Sr_2 stress-ng --cpu 2 --timeout 90s
joe 9356 9355 Rr_2 stress-ng-cpu [run]
joe 9357 9355 Rr_2 stress-ng-cpu [run]
cset: done
So, we have two running instances of stress-ng-cpu, with the inherited round-robin policy and the same priority 2.
Next, let’s get the top output for these very processes:
$ top -p 9356,9357
# ...
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
9357 joe -3 0 45784 5880 3500 R 48,7 0,0 0:12.24 stress-ng
9356 joe -3 0 45784 5880 3500 R 46,7 0,0 0:12.25 stress-ng
We can see that each process gets roughly half of the single CPU’s time, as the two take turns on it.
6. Priorities and Preempting
Now let’s see how priorities work. First, let’s change the scheduling policy of our simulator shell to FIFO, with priority 2, by typing this command in the simulator’s terminal:
$ sudo chrt -f -p 2 $$
Then, let’s start a single-CPU stressor to simulate a long-running task:
$ stress-ng --cpu 1 --timeout 240s
We can observe in the top output that it consumes almost 100% of the shield’s only processor:
$ top -p 9235
# ...
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
9235 joe -3 0 45780 5996 3624 R 95,0 0,0 0:51.22 stress-ng
Next, let’s open a fresh terminal and start another hog outside the shield:
$ stress-ng --cpu 1 --timeout 120s
Assuming that its PID is 9256, let’s switch it to the FIFO policy with priority 3 and move it into the shield:
$ sudo chrt -f -p 3 9256
$ sudo cset shield --sysset=ts-set --userset=rt-set --shield --pid 9256
Afterwards, let’s list the shield’s dwellers:
$ cset shield --sysset=ts-set --userset=rt-set --shield --verbose
cset: "rt-set" cpuset of CPUSPEC(3) with 4 tasks running
USER PID PPID SPPr TASK NAME
-------- ----- ----- ---- ---------
joe 4508 2610 Sf_2 bash
joe 9234 4508 Sf_2 stress-ng --cpu 1 --timeout 240s
joe 9235 9234 Rf_2 stress-ng-cpu [run]
joe 9256 9255 Rf_3 stress-ng-cpu [run]
cset: done
So, both stressors are now reported as running and using the FIFO policy. However, the top command shows that only the stressor of PID 9256 gets the CPU:
$ top -p 9235,9256
# ...
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
9256 joe -4 0 45780 5900 3516 R 95,0 0,0 0:40.77 stress-ng
9235 joe -3 0 45780 5996 3624 R 0,0 0,0 2:08.26 stress-ng
So, we’ve seen how the process of higher priority takes over the CPU. At the same time, the lower-priority task is preempted (i.e., kicked off the CPU). However, its status remains running (R), as preemption doesn’t put the task to sleep.
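The selection rule at work here can be illustrated with a toy model. The sketch below isn't kernel code: it only picks, from a list of runnable tasks, the one with the highest priority, and lets the task listed first win a tie, which mimics the FIFO ordering within one priority level. The function name pick_next and the PID:PRIORITY input format are hypothetical.

```shell
# pick_next "PID:PRIORITY ...": print the PID that a strict-priority
# scheduler would run next. Toy model with a hypothetical input format.
pick_next() {
    best_pid=
    best_prio=0
    for task in $1; do
        pid=${task%%:*}
        prio=${task##*:}
        # Only a strictly higher priority replaces the current choice,
        # so the earliest task wins among equal priorities.
        if [ "$prio" -gt "$best_prio" ]; then
            best_pid=$pid
            best_prio=$prio
        fi
    done
    echo "$best_pid"
}

pick_next "9235:2 9256:3"   # prints 9256
```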
7. Room for Time-Sharing
Now, let’s examine a situation that’s a bit different. First, let’s set the priority of the simulator to the highest possible value, 99. So, let’s issue the command in the simulator’s terminal:
$ sudo chrt -f -p 99 $$
Next, we’ll start a stressor in the rt-set shield as before:
$ stress-ng --cpu 1 --timeout 120s
Then, let’s start another task outside the simulator and add it to rt-set, but without changing its scheduling policy with chrt. Here, we assume that the PID of the new stressor is 5745:
$ stress-ng --cpu 1 --timeout 120s
$ sudo cset shield --sysset=ts-set --userset=rt-set --shield --pid 5745
Finally, let’s take a look at the top output for these processes:
$ top -p 5745,5774
# ...
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
5774 joe rt 0 45780 5896 3512 R 95,0 0,0 2:11.35 stress-+
5745 joe 20 0 45780 5988 3616 R 5,0 0,0 2:09.65 stress-+
In contrast to the preempted task from the previous example, the second process now gets a share of the CPU time, even though it competes with a real-time task of the highest priority. To explain that, let’s check the processes’ details:
$ cset shield --sysset=ts-set --userset=rt-set --shield --verbose
cset: "rt-set" cpuset of CPUSPEC(3) with 4 tasks running
USER PID PPID SPPr TASK NAME
-------- ----- ----- ---- ---------
joe 4508 5448 Sf99 bash
joe 5745 5744 Roth stress-ng-cpu [run]
joe 5773 5449 Sf99 stress-ng --cpu 1 --timeout 240s
joe 5774 5773 Rf99 stress-ng-cpu [run]
cset: done
We can observe that the process of PID 5745 runs with the SCHED_OTHER policy. Nevertheless, the kernel provides it with some CPU time, because it limits the CPU bandwidth available to real-time tasks. In detail, the /proc/sys/kernel/sched_rt_period_us file holds the length of the period over which this limit is applied. Its default value is 1000000 microseconds (i.e., 1s). Then, in the /proc/sys/kernel/sched_rt_runtime_us file, we can read how much of this period real-time tasks may use. By default, it’s 950000 microseconds (i.e., 0.95s). So, the remaining 0.05s is left for time-sharing tasks, which matches the 5% of CPU time shown by top.
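The arithmetic behind these two files can be sketched in a few lines of shell. The function name rt_share is hypothetical. The treatment of -1 follows the kernel convention that a runtime of -1 removes the limit.

```shell
# rt_share PERIOD_US RUNTIME_US: print the percentage of each period that
# real-time tasks may use. A runtime of -1 means no limit. Hypothetical helper.
rt_share() {
    period=$1
    runtime=$2
    if [ "$runtime" -lt 0 ]; then
        echo 100
    else
        echo $(( runtime * 100 / period ))
    fi
}

rt_share 1000000 950000     # default values: prints 95

# On a Linux system, the live values can be read from /proc.
p=/proc/sys/kernel/sched_rt_period_us
r=/proc/sys/kernel/sched_rt_runtime_us
if [ -r "$p" ] && [ -r "$r" ]; then
    rt_share "$(cat "$p")" "$(cat "$r")"
fi
```

With the defaults, real-time tasks may use 95% of each period, which leaves 5% for time-sharing tasks.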
8. Conclusion
In this article, we learned about real-time scheduling in Linux. First, we briefly looked through the different scheduling policies. Then, we used the chrt command to manipulate processes’ policies and priorities.
Next, we created a one-processor cpuset to highlight specific aspects of real-time scheduling. We saw how two processes of the same priority share the CPU under round-robin scheduling. Following this, we demonstrated the preemption of a lower-priority process in the FIFO scheme. Finally, we discussed how real-time and time-sharing tasks share the same CPU.