1. Overview

Filtering log file entries based on a date range is an important task in system administration. Log files provide a record of system events and errors that can be useful for troubleshooting and monitoring. We can use the timestamped entries in a log file to filter content within a specific date range.

In this tutorial, we’ll explore how to filter log file entries using Bash scripts based on a single date range or multiple ones.

2. Sample Task

Let’s suppose we want to filter entries in the /var/log/auth.log file. Inspecting the log file, we observe entries with timestamps ranging from Sep 17 till the current date:

$ sudo cat /var/log/auth.log
Sep 17 06:22:17 debian login[652]: pam_unix(login:session): session opened for user sysadmin by LOGIN(uid=0)
...
Sep 23 08:34:24 debian sudo: pam_unix(sudo:session): session opened for user root by (uid=0)

Our sample task is two-fold:

  1. extract log entries with timestamps falling within the last hour from the current time
  2. extract log entries for Sep 19-21 between 10:00:00 and 10:05:00

For the first objective, we specify a single start and end time, while for the second, we specify multiple start and end times, one pair for each of the three days.

Let’s see how we can solve this task.

3. Extracting Logs Between a Single Start and End Date

To extract log entries from /var/log/auth.log spanning the last hour till the current time, we specify the start and end times as Unix timestamps using the date command:

$ start_time="$(date -d 'now - 1 hour' +'%s')"
$ end_time="$(date -d now +'%s')"

The -d option specifies a date such as now for the current time or now - 1 hour for 1 hour ago. The +'%s' format, in turn, prints the date as Unix time, i.e., the number of seconds since the epoch. Thus, we can compare dates numerically.
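As a minimal sketch of why Unix time is convenient, the snippet below converts two made-up timestamps to seconds and checks whether one lies within the hour before the other. It assumes GNU date, since -d behaves differently in other date implementations. The fixed dates and the in_window variable are illustrative only, and -u simply pins the interpretation to UTC so the result doesn't depend on the local time zone:

```shell
#!/usr/bin/env bash
# Convert a fixed, made-up reference time to Unix time (-u: interpret as UTC).
end_time="$(date -u -d '2023-09-23 08:35:00' +'%s')"
# One hour is 3600 seconds, so the window start is plain integer arithmetic.
start_time=$((end_time - 3600))
# A sample entry timestamp, converted the same way.
entry_time="$(date -u -d '2023-09-23 08:00:00' +'%s')"

# Numeric comparison works because all three values are integers.
if [ "$entry_time" -ge "$start_time" ] && [ "$entry_time" -le "$end_time" ]; then
    in_window=yes
else
    in_window=no
fi
echo "start=$start_time end=$end_time entry=$entry_time in_window=$in_window"
```

Here, the entry at 08:00:00 lies inside the window from 07:35:00 to 08:35:00, so in_window is set to yes.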

Next, we extract the timestamps from /var/log/auth.log using the cut command:

$ sudo cat /var/log/auth.log | cut -d ' ' -f 1-3
...
Sep 23 08:35:13
Sep 23 08:35:13

The -d option used with cut specifies the delimiter, whereas the -f option specifies which fields to extract by position. Here, fields 1 through 3 hold the month, the day, and the time.
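One caveat is worth a quick sketch. It assumes the log uses the traditional syslog timestamp layout, in which a single-digit day is padded with an extra space. Because cut counts every space as a delimiter, the padding creates an empty field and pushes the time out of fields 1-3. The sample lines below are made up, and squeezing repeated spaces with tr -s is just one possible workaround:

```shell
#!/usr/bin/env bash
# Hypothetical sample lines; the second has a space-padded single-digit day.
line1='Sep 23 08:35:13 debian sudo: example entry'
line2='Sep  3 08:35:13 debian sudo: example entry'

# Plain cut treats each space as a delimiter, so the empty field is counted.
plain1="$(echo "$line1" | cut -d ' ' -f 1-3)"
plain2="$(echo "$line2" | cut -d ' ' -f 1-3)"

# Squeezing runs of spaces into one keeps the time in field 3.
squeezed2="$(echo "$line2" | tr -s ' ' | cut -d ' ' -f 1-3)"

printf '[%s]\n' "$plain1" "$plain2" "$squeezed2"
```

For the padded line, plain cut returns only the month and day, while the squeezed variant returns the full timestamp.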

After this, we can use a while loop to read and extract the timestamp from each log entry before converting it to Unix time. Then, we compare these Unix timestamps against the start and end times specified earlier.

The filter_log.sh script implements the entire procedure:

$ cat filter_log.sh
#!/usr/bin/env bash
start_time="$(date -d 'now - 1 hour' +'%s')"
end_time="$(date -d now +'%s')"

sudo cat /var/log/auth.log | while read -r line; do
    time_stamp=$(echo "$line" | cut -d ' ' -f 1-3)
    time_stamp=$(date -d "$time_stamp" +'%s')
    if [ "$time_stamp" -ge "$start_time" ] && [ "$time_stamp" -le "$end_time" ]; then
        echo "$line"
    fi
done

To summarize, the script carries out several steps:

  1. specify a shebang directive in the first line of the script
  2. set the start and end times as Unix timestamps using the date command
  3. use a while loop to iterate over the lines of the /var/log/auth.log file
  4. read a line using the read command
  5. extract the timestamp from the line using the cut command
  6. convert the timestamp to Unix time using the date command
  7. use the test built-in to compare the timestamp to the start and end times: if it falls within the specified range, print the line
  8. go to step 3 and iterate over the next line, or stop if the end of the file is reached

Next, we grant the script execute permissions using chmod:

$ chmod +x filter_log.sh

Finally, we run the script:

$ ./filter_log.sh
Sep 23 07:45:01 debian CRON[29156]: pam_unix(cron:session): session opened for user root by (uid=0)
...
Sep 23 08:35:41 debian sudo: pam_unix(sudo:session): session opened for user root by (uid=0)

Notably, the extracted log entries fall within the last hour from the current time.
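To try the same logic without root access or a live log, we can wrap the loop in a function that reads from standard input. The following is only a sketch: the filter_range name, the sample entries, and the chosen window are all hypothetical, and GNU date is assumed. Since the sample timestamps carry no year, date interprets both them and the window boundaries in the current year:

```shell
#!/usr/bin/env bash
# filter_range START END: print stdin lines whose timestamp is in [START, END].
# START and END are Unix timestamps; the function name is illustrative.
filter_range() {
    local start_time="$1" end_time="$2" line time_stamp
    while IFS= read -r line; do
        # Squeeze padded spaces, then take month, day, and time.
        time_stamp=$(echo "$line" | tr -s ' ' | cut -d ' ' -f 1-3)
        # Skip lines whose leading fields are not a parsable date.
        time_stamp=$(date -d "$time_stamp" +'%s' 2>/dev/null) || continue
        if [ "$time_stamp" -ge "$start_time" ] && [ "$time_stamp" -le "$end_time" ]; then
            echo "$line"
        fi
    done
}

# Made-up entries: two inside the window, two outside.
sample='Sep 23 07:10:00 debian sshd[1]: too early
Sep 23 07:45:01 debian CRON[2]: inside window
Sep 23 08:35:41 debian sudo: inside window
Sep 23 09:00:00 debian sshd[3]: too late'

start="$(date -d 'Sep 23 07:35:00' +'%s')"
end="$(date -d 'Sep 23 08:36:00' +'%s')"
result="$(printf '%s\n' "$sample" | filter_range "$start" "$end")"
echo "$result"
```

With a real log, the same function could be fed via sudo cat /var/log/auth.log | filter_range "$start_time" "$end_time".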

4. Extracting Logs Between Multiple Start and End Dates

We can set up a script named multi_range_filter.sh to extract log entries spanning Sep 19-21 between 10:00:00 and 10:05:00:

$ cat multi_range_filter.sh
#!/usr/bin/env bash
start_times=()
end_times=()
for day in {19..21}; do
    start_times+=("$(date -d "Sep $day 2023 10:00:00" +'%s')")
    end_times+=("$(date -d "Sep $day 2023 10:05:00" +'%s')")
done

n="${#start_times[@]}"
sudo cat /var/log/auth.log | while read -r line; do
    time_stamp=$(echo "$line" | cut -d ' ' -f 1-3)
    time_stamp=$(date -d "$time_stamp" +'%s')
    for index in $(seq 0 $((n-1))); do
        start_time="${start_times[$index]}"
        end_time="${end_times[$index]}"
        if [ "$time_stamp" -ge "$start_time" ] && [ "$time_stamp" -le "$end_time" ]; then
            echo "$line"
        fi
    done
done

The script implements a series of steps:

  1. specify a shebang directive in the first line of the script
  2. initialize the start_times and end_times variables as arrays
  3. use a for loop to fill in each array with three values by varying the day variable
  4. save the number of elements of the start_times array in a variable named n
  5. use a while loop to iterate over the lines of the /var/log/auth.log file
  6. read a line using the read command
  7. extract the timestamp from the line using the cut command
  8. convert the timestamp to Unix time using the date command and save the result in the time_stamp variable
  9. use a for loop to iterate an index variable from 0 to n-1
  10. define the start_time and end_time variables as elements from the start_times and end_times arrays based on the index variable
  11. compare the time_stamp variable to the start_time and end_time variables: if it falls in between, print the corresponding line
  12. go to step 9 and iterate over the next value of the index variable, or continue to the next step if there are no more values to iterate over
  13. go to step 5 and iterate over the next line, or stop if the end of the file is reached

Notably, we introduce an inner for loop to iterate over each pair of corresponding start and end dates. Therefore, as we read each line of the log file, we check if its Unix timestamp falls within any of the three date ranges. If so, we print the line.
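The range check can also be isolated in a small helper, which makes it easy to test on its own. The sketch below is a variation rather than the script above: the in_any_range name and the small integer ranges are hypothetical stand-ins for real Unix timestamps, and the helper returns on the first match, so a line would be reported only once even if two ranges overlapped:

```shell
#!/usr/bin/env bash
# Hypothetical ranges in seconds; the second overlaps the first on purpose.
start_times=(1000 1200 5000)
end_times=(1300 1500 5300)

# in_any_range TS: succeed if TS lies within at least one [start, end] pair.
in_any_range() {
    local ts="$1" index
    for ((index = 0; index < ${#start_times[@]}; index++)); do
        if [ "$ts" -ge "${start_times[index]}" ] && [ "$ts" -le "${end_times[index]}" ]; then
            return 0    # the first match is enough, so stop scanning
        fi
    done
    return 1
}

for ts in 900 1250 5100 6000; do
    if in_any_range "$ts"; then
        echo "$ts: inside a range"
    else
        echo "$ts: outside all ranges"
    fi
done
```

In the log-filtering loop, the call in_any_range "$time_stamp" && echo "$line" would then replace the inner for loop.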

Finally, we can run and test the script:

$ ./multi_range_filter.sh
Sep 19 10:00:01 debian CRON[19718]: pam_unix(cron:session): session opened for user sysadmin by (uid=0)
Sep 19 10:00:01 debian CRON[19718]: pam_unix(cron:session): session closed for user sysadmin
Sep 20 10:00:01 debian CRON[16730]: pam_unix(cron:session): session opened for user sysadmin by (uid=0)
Sep 20 10:00:01 debian CRON[16730]: pam_unix(cron:session): session closed for user sysadmin
Sep 21 10:00:01 debian CRON[2051]: pam_unix(cron:session): session opened for user sysadmin by (uid=0)
Sep 21 10:00:02 debian CRON[2051]: pam_unix(cron:session): session closed for user sysadmin

We see that the timestamps of the extracted log entries fall within the three date ranges.

5. Conclusion

In this tutorial, we explored how to filter the entries of a log file based on a date range or a series of ranges.

In particular, we used a scripting approach that relies on extracting the timestamps in the log file and converting them to Unix time. The timestamps are then compared numerically to those defining the date range or ranges, and log entries are printed only if their timestamps fall within one of them.