1. Overview

Text transformation is part and parcel of data analysis activity. One such scenario is where we’ve occurrences of epoch timestamps in the dataset and want to convert it to a human-readable date format.

In this tutorial, we’ll learn how to convert timestamps in a column to a comprehensible date format.

2. Scenario Setup

Let’s take a look at the sample data.txt file with a tabular structure:

$ cat data.txt
1681322596 a1 b1
1649786596 a2 b2
1618250596 a3 b3

We must notice that the first column contains timestamps which are convenient to use within computer programs but not so much useful for the human eye.

3. Using awk

In this section, we’ll use GNU awk to transform the timestamps to corresponding date values.

3.1. Using strfstime() Time Function

We can use the stftime() function available in GNU awk to convert a timestamp to a date string:

strftime (format, timestamp)

Let’s start by learning how we can convert a single timestamp using the stftime() function:

$ date +%s | tee >(awk '{$0=strftime("%Y-%m-%d %H:%M:%S", $0); print($0);}')
1681232216
2023-04-11 16:56:56

We must note that we used the date command to get the current timestamp, followed by the tee command to show it before sending it to the awk command.

Now, let’s use the logic for processing the data.txt file:

$ awk '{$1=strftime("%Y-%m-%d %H:%M:%S", $1); print $0}' data.txt
2023-04-12 18:03:16 a1 b1
2022-04-12 18:03:16 a2 b2
2021-04-12 18:03:16 a3 b3

Great! We’ve got this right.

3.2. Calling an External Program

We can also call an external program and capture its output within an awk script. As a result, we can leverage the date command for our use case.

First, let’s see how we can use the date command for timestamp-to-date conversion:

$ timestamp=$(date +%s); echo $timestamp; date -d@$timestamp +"%F %H:%M:%S"
1681325421
2023-04-12 18:50:21

Next, let’s use this within an awk command:

$ date +%s | \
tee \
>(awk '{cmd = "date -d@"$0 " +\"%F %H:%M:%S\""; cmd | getline result; close(cmd); print(result)}')
1681325991
2023-04-12 18:59:51

Finally, let’s write an awk script to process the timestamps in the data.txt file:

$ awk '{
    cmd="date -d@"$1 " +\"%F %H:%M:%S\"";
    cmd | getline $1;
    close(cmd);
    print($0)
}' data.txt
2023-04-12 18:03:16 a1 b1
2022-04-12 18:03:16 a2 b2
2021-04-12 18:03:16 a3 b3

4. Using sed

Despite being a powerful text-processing utility, sed doesn’t have built-in time functions for timestamp-to-date conversion. So, we need to call an external program within the sed script to solve our use case.

Let’s begin by learning to convert a single timestamp by calling the date command from within the sed script:

$ date +%s | tee >(sed -n -E -e "s/(.*)/date -d@\\1 +\"%F %H:%M:%S\"/ep")
1681233473
2023-04-11 17:17:53

We must note that we used the e function to execute the replacement string as an instruction set. Further, let’s also remember that we need to escape symbols such as \1 (first substitution group) and double quotes.

Now, we can write our sed script for converting timestamps in the data.txt file:

$ sed -n -E \
> -e "s/^([0-9]*)(.*)/date -d@\\1 +\"%F %H:%M:%S\" | tr -d '\n'; echo \"\\2\";/ep" \
> data.txt
2023-04-12 18:03:16 a1 b1
2022-04-12 18:03:16 a2 b2
2021-04-12 18:03:16 a3 b3

Over here, we’ve used two substitution groups, namely \1 and \2, for the first and rest of the columns, respectively. Furthermore, we’ve used the tr command to remove the trailing newline from the output of the preceding date command.

5. Using Bash Script

We can use Bash arrays and iterations to parse the tabular contents from the data.txt file. Furthermore, we can use the date command to convert the timestamps to date strings and concatenate them with the rest of the fields.

Let’s look at the timestamp-to-date.sh script in its entirety:

$ cat timestamp-to-date.sh
#!/bin/bash

readarray -t rows < data.txt
TIME_COL=1

for row in "${rows[@]}"
do
    col_count=$(echo $row | wc -w)
    cur_col=1
    while :
    do
        cell_val=$(echo -e $row | cut -d" " -f$cur_col)
        if [[ $cur_col == $TIME_COL ]]
        then
            cell_val=$(date -d@${cell_val} +"%F %H:%M:%S")
        fi
        printf "%s " "$cell_val"
        cur_col=$[$cur_col+1]
        [[ $cur_col > "$col_count" ]] && break
    done
    
    echo ""
done
exit 0

We must note that we’re using the cut command to extract individual column values.

Now, let’s see the script in action:

$ ./timestamp-to-date.sh 
2023-04-12 18:03:16 a1 b1 
2022-04-12 18:03:16 a2 b2 
2021-04-12 18:03:16 a3 b3

The result looks perfect.

6. Conclusion

In this article, we explored multiple utilities such as awk, sed, and Bash script to convert timestamps to date strings.