1. Overview

The cron daemon schedules tasks. It lets us control when the tasks start, but little else.

In particular, cron has no way to prevent two executions of the same task from overlapping. If we want to avoid concurrent execution of our tasks, we need to implement that protection ourselves within the task’s script.

In this tutorial, we’ll learn two ways of preventing task overlapping, based on process detection and using a .pid file.

2. Working Example

For the purpose of this tutorial, we’ll use a bash script as the task. We’ll build it up, a technique at a time, so it will prevent a second instance from running.

2.1. Task Script

First, let’s prepare the script:

#!/usr/bin/env bash

DURATION=$1
do_the_action () {
  date +"PID: $$ Action started at %H:%M:%S, ETA: $DURATION seconds"
  sleep "$DURATION"
  date +"PID: $$ Action finished at %H:%M:%S"
}

do_the_action

Here we’re using the sleep command to simulate the task duration. The first parameter – $1 – will allow us to define how long the task executes.

To help understand what’s going on during execution, the date command outputs time-stamped messages. To help understand which process is which, we’re outputting the process PID from the bash variable $$.

2.2. Scheduling the Task

Now, let’s set up a cron task to run this script every minute, using a duration of 55 seconds. The task should end before the next instance comes up.

We’ll put our script in /tmp/action1.sh, make it executable with chmod +x, and then create a crontab record to run our task and log the output to a log file:

$ echo '*/1 * * * * /tmp/action1.sh 55 >> /tmp/action1.log' | crontab

We should observe that piping to the crontab command replaces the whole crontab of the current user. This is not a problem for our tutorial, but may not be suitable for use in production.
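Where existing entries must be preserved, one option is to merge the new record into the current table before reinstalling it. The following is a minimal sketch under that assumption; append_cron_entry is a hypothetical helper name, not part of cron, and it only prints the merged table, so nothing is installed until its output is piped to crontab:

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of cron): print the existing crontab text
# followed by the new entry, unless an identical line is already present.
append_cron_entry () {
  local existing="$1" entry="$2"
  if [ -n "$existing" ]; then
    printf '%s\n' "$existing"
  fi
  # --fixed-strings: no regex interpretation; --line-regexp: whole-line match
  if ! printf '%s\n' "$existing" | grep --quiet --fixed-strings --line-regexp -- "$entry"; then
    printf '%s\n' "$entry"
  fi
}

# Intended usage (commented out because it modifies the user's crontab):
# append_cron_entry "$(crontab -l 2>/dev/null)" \
#   '*/1 * * * * /tmp/action1.sh 55 >> /tmp/action1.log' | crontab -
```

Because the helper skips an entry that is already present, running it twice should not duplicate the record.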

After letting this task run for several minutes, we can check the /tmp/action1.log file and note that there’s no overlapping:

$ tail -f /tmp/action1.log
PID: 21764 Action started at 13:41:01, ETA: 55 seconds
PID: 21764 Action finished at 13:41:56
PID: 21770 Action started at 13:42:01, ETA: 55 seconds
PID: 21770 Action finished at 13:42:56

2.3. Simulate Overlapping Tasks

Now let’s increase the duration to 70 seconds to create the scenario we want to fix:

$ echo '*/1 * * * * /tmp/action1.sh 70 >> /tmp/action1.log' | crontab

This time we can clearly see the issue: the next task starts before the previous one finishes.

$ tail -f /tmp/action1.log
PID: 21881 Action started at 13:43:01, ETA: 70 seconds
PID: 21886 Action started at 13:44:01, ETA: 70 seconds
PID: 21881 Action finished at 13:44:11

To prevent this collision, we need to implement code to detect the previous instance and abort an overlapping invocation of the script.

3. Detecting Running Instance by Process

3.1. Identifying the Task Process

In our example, the script uses bash. While it’s running, we can find it by using pgrep:

$ pgrep --list-full bash
19125 bash
21172 bash /tmp/action1.sh 70
21187 bash /some/other/script

Here we use the --list-full option to get the list of all bash processes along with their command lines.

From the output above, we can see that process 21172 seems to be an instance of our task, as its command line contains the name of our script.

We can narrow the search down by using an additional grep with our exact script name:

$ pgrep --list-full bash | grep '/tmp/action1.sh'
21172 bash /tmp/action1.sh 70

It’s helpful to try these methods out on the command line before adding them to our script.

However, if we added this method to our script as it is, it would detect every invocation of the script, including the current instance. We need to exclude this instance from the result. Another grep does the job: grep -v "^$$ " drops every line that starts with the current PID followed by a space, which removes the current invocation from the list.

3.2. Implementing the Code

Now, let’s add our detector to a function in the script:

previous_instance_active () {
  pgrep --list-full bash | grep -v "^$$ " | grep --quiet '/tmp/action1.sh'
}

We should note the addition of the --quiet option to the last grep command. This prevents detection output from appearing in the log file, while the exit status of grep still tells us whether a match was found.
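As a small illustration of relying on the exit status alone, here is a sketch that feeds grep a sample line instead of real pgrep output. The sample text and the match_result variable are illustrative assumptions:

```shell
#!/usr/bin/env bash

# Sample text standing in for pgrep output (illustrative values only).
sample='21172 bash /tmp/action1.sh 70'

# grep --quiet prints nothing; only its exit status drives the condition.
if printf '%s\n' "$sample" | grep --quiet '/tmp/action1.sh'; then
  match_result="found"
else
  match_result="not found"
fi

echo "$match_result"
```

This is the same mechanism that lets a function ending in grep --quiet be used directly as the condition of an if statement.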

Next, we modify the task script to exit if the previous invocation is detected:

if previous_instance_active
then 
  date +"PID: $$ Previous instance is still active at %H:%M:%S, aborting ... "
else 
  do_the_action
fi

3.3. Testing the Result

After applying these changes, we can check the /tmp/action1.log file again to see the detection works as expected:

$ tail -f /tmp/action1.log
...
PID: 11529 Action started at 14:18:01, ETA: 70 seconds
PID: 11531 Previous instance is still active at 14:19:01, aborting ... 
PID: 11529 Action finished at 14:19:11
PID: 11545 Action started at 14:20:01, ETA: 70 seconds

Although this method seems to work well, it may not be reliable enough for every use case.

In practice, the script may not know its unique path on the file system and so might have to search for a substring containing its name. If another bash process happened to contain the same substring in its command line, we’d get a false positive. We may need a more robust solution.
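When the full path is known, an exact comparison can at least reduce such false positives. The sketch below assumes the listing format shown earlier (PID, then bash, then the script path as the first argument) and a path without spaces; other_instance_listed is a hypothetical helper name:

```shell
#!/usr/bin/env bash

# Hypothetical helper: read "PID command-line" lines on stdin and succeed
# only if a process other than own_pid has script_path as its exact
# first argument (third whitespace-separated field).
other_instance_listed () {
  local own_pid="$1" script_path="$2"
  awk -v pid="$own_pid" -v script="$script_path" '
    $1 != pid && $3 == script { found = 1 }
    END { exit (found ? 0 : 1) }
  '
}

# Illustrative listing: the last line merely contains our path as a substring.
sample_listing='19125 bash
21172 bash /tmp/action1.sh 70
21187 bash /opt/backup/tmp/action1.sh.bak'

# Seen from PID 21172, a substring grep would match the .bak line,
# while the exact comparison rejects it.
if printf '%s\n' "$sample_listing" | other_instance_listed 21172 /tmp/action1.sh; then
  echo "another instance is listed"
else
  echo "no other instance is listed"
fi
```

In the task script, the helper would read from pgrep --list-full bash and receive $$ as its first argument.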

4. Detecting Running Instance by .pid File

4.1. Use a File to Communicate the State

With this method, we use the .pid file technique. The running task records its PID in a file, and the next task instance checks for the file left by a previous incarnation to determine if another instance is running. We’ll introduce a PIDFILE variable and two procedures for handling the .pid file:

PIDFILE="/tmp/action1.pid"

create_pidfile () {
  echo $$ > "$PIDFILE"
}

remove_pidfile () {
  [ -f "$PIDFILE" ] && rm "$PIDFILE"
}

We should call these from around the code that performs the action (within the conditional logic that aborts when there’s an instance already running):

create_pidfile 
do_the_action
remove_pidfile 

4.2. Reading Messages from the Previous Instance

Let’s rewrite our previous_instance_active function to use this method:

previous_instance_active () {
  local prevpid
  if [ -f "$PIDFILE" ]; then
    prevpid=$(cat "$PIDFILE")
    kill -0 "$prevpid" 2>/dev/null
  else 
    false
  fi
}

By using kill -0 here, we reinforce the .pid file technique with an extra check. Signal 0 doesn’t actually send anything; kill only reports whether the process exists and can be signaled. If it fails, then we know the PID recorded in the file no longer belongs to a running process.
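To see this check in isolation, here is a short sketch that probes a throwaway background process before and after terminating it. The pid_is_alive helper and the probe_pid variable are hypothetical names used only for illustration:

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeed if a process with the given PID exists and
# we are allowed to signal it. Signal 0 only performs the checks.
pid_is_alive () {
  kill -0 "$1" 2>/dev/null
}

sleep 30 &                              # throwaway background process
probe_pid=$!

pid_is_alive "$probe_pid" && echo "PID $probe_pid is running"

kill "$probe_pid"                       # terminate it ...
wait "$probe_pid" 2>/dev/null || true   # ... and reap it so the PID is gone

pid_is_alive "$probe_pid" || echo "PID $probe_pid is gone"
```

We should keep in mind that kill -0 also fails when the process exists but belongs to a user we can’t signal, so the check is most meaningful when all instances run under the same account.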

4.3. Avoiding a Stale .pid File

The file-based detection is a lot more reliable than using pgrep, but it may still not be robust enough.

Let’s think about the case when a failure occurs in the middle of task execution. The task process disappears, but the .pid file remains. Although we’re checking whether that PID is active, the system can reuse PIDs for other processes. This might result in a false detection.
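One way to narrow that gap is to confirm that the recorded PID still belongs to our script. The sketch below is Linux-specific, as it assumes the /proc file system is available; pid_matches_command is a hypothetical helper name:

```shell
#!/usr/bin/env bash

# Hypothetical helper (Linux only): succeed if the process with the given
# PID exists and its command line contains the expected text.
pid_matches_command () {
  local pid="$1" expected="$2"
  [ -r "/proc/$pid/cmdline" ] || return 1
  # Arguments in /proc/PID/cmdline are NUL-separated; turn them into spaces.
  tr '\0' ' ' < "/proc/$pid/cmdline" | grep --quiet --fixed-strings -- "$expected"
}

# Example: the current shell is not running a script with this made-up name.
pid_matches_command "$$" "no-such-script.sh" || echo "PID $$ is not no-such-script.sh"
```

Inside previous_instance_active, this check could follow kill -0, for instance as pid_matches_command "$prevpid" "/tmp/action1.sh", so that a reused PID belonging to an unrelated program is not mistaken for our task.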

Therefore, we want to increase the chances of the .pid file being removed when the process quits. We can do that with the trap builtin, which binds the execution of code to signals. Bash also provides a special EXIT pseudo-signal, which makes the cleanup code run when the script exits, whether it finishes normally or fails part-way through. A process killed with SIGKILL can’t run any trap, so the file may still be left behind in that case.
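The following sketch shows the mechanism on its own, using a temporary file in place of the .pid file and a subshell in place of the task. The demo_file and demo_status names are illustrative assumptions:

```shell
#!/usr/bin/env bash

demo_file=$(mktemp)        # stands in for the .pid file
demo_status=0

(
  # Bind the cleanup to the EXIT pseudo-signal of this subshell.
  trap 'rm -f "$demo_file"' EXIT
  echo "working with $demo_file"
  exit 3                   # simulate a failure part-way through the task
) || demo_status=$?

if [ ! -e "$demo_file" ]; then
  echo "cleanup ran, exit status was $demo_status"
fi
```

The cleanup runs even though the subshell exits with an error, and the original exit status is preserved.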

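To add this removal-on-exit functionality, we only have to modify the lines that wrap the action. They stay inside the else branch, so that an aborting instance never removes the .pid file of the instance that’s still running: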

trap remove_pidfile EXIT
create_pidfile
do_the_action

We do not call the remove_pidfile procedure explicitly anymore, but instead, make bash perform that invocation as the script terminates.
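Putting the pieces of this section together, the whole task script might look like the sketch below. It’s an assembly of the snippets above rather than a definitive implementation, with two assumptions of our own: the duration defaults to 1 second when no argument is given, and remove_pidfile uses rm -f:

```shell
#!/usr/bin/env bash

# Assumption: default to a 1-second run when no duration is given.
DURATION="${1:-1}"
PIDFILE="/tmp/action1.pid"

do_the_action () {
  date +"PID: $$ Action started at %H:%M:%S, ETA: $DURATION seconds"
  sleep "$DURATION"
  date +"PID: $$ Action finished at %H:%M:%S"
}

create_pidfile () {
  echo $$ > "$PIDFILE"
}

remove_pidfile () {
  rm -f "$PIDFILE"
}

previous_instance_active () {
  local prevpid
  [ -f "$PIDFILE" ] || return 1
  prevpid=$(cat "$PIDFILE")
  # Succeeds only if the recorded PID is a process we can signal.
  kill -0 "$prevpid" 2>/dev/null
}

if previous_instance_active
then
  date +"PID: $$ Previous instance is still active at %H:%M:%S, aborting ... "
else
  # Install the trap only on this branch, so an aborting instance never
  # deletes the .pid file that belongs to the instance still running.
  trap remove_pidfile EXIT
  create_pidfile
  do_the_action
fi
```

There’s still a short window between checking for the file and creating it, so two instances started at almost the same moment could both proceed. With cron starting the task at most once a minute, that window is unlikely to matter, but it’s worth knowing about.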

5. Conclusion

It’s common to want single instances of certain processes. However, when scheduling those with the cron daemon, we have to implement some sort of mutex ourselves. The main challenge in solving that problem is finding a way to correctly detect active instances and abort the overlapping execution.

In this tutorial, we saw two ways of detecting an existing instance of a script. We implemented both process detection and a file-based mechanism. We also compared the reliability of the methods.

All the approaches described in this tutorial use standard Linux utilities, available in most Linux distributions.