Renaming Files Based on File Attributes

1. Introduction

Renaming files based on attributes of those files enables us to organize data in an efficient manner. Sysadmins use this method for various kinds of archival and migration operations.

In this tutorial, we’ll explore several ways to rename files using shell commands. In particular, we’ll see examples of renaming a file based on several attributes:

date
owner
author

We’ll use an Ubuntu 22.04 environment for running the examples.

2. Creation and Modification Date

To begin with, let’s find the creation date and modification date of a file and then rename the file based on those pieces of information.

2.1. File Dates

One way to print the date attributes of a file is the stat command:

$ stat --format "%W" disk.json
1689992509
$ stat --format "%Y" disk.json
1689992767

As illustrated above, we use the stat command to get the %W creation and %Y modification time of the file. Both times are in seconds since the UNIX epoch.
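On filesystems or kernels that don't record a birth time, GNU stat prints 0 for %W. As a minimal sketch, assuming GNU coreutils, we can fall back to the modification time in that case. The file_epoch function name is our own and purely illustrative:

```shell
# file_epoch is a hypothetical helper, not part of the original tutorial.
# It prints the birth time of a file in seconds since the epoch,
# or the modification time when the birth time is unknown.
file_epoch() {
    local file=$1 birth
    birth=$(stat --format "%W" -- "$file") || return 1
    case $birth in
        # GNU stat prints 0 for %W when the birth time is unknown;
        # anything empty or non-numeric is treated the same way
        ''|0|*[!0-9]*) stat --format "%Y" -- "$file" ;;
        *)             printf '%s\n' "$birth" ;;
    esac
}
```

For example, file_epoch disk.json prints a single timestamp that we can pass on to other commands.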

So, let’s use the date command to convert this date format to a more human-readable string:

$ date --date=@1694498929 "+%Y_%m_%d-%H-%M-%S"
2023_09_12-11-38-49

Specifically, we use --date and the @ prefix to specify the seconds since the UNIX epoch, while +%Y_%m_%d-%H-%M-%S specifies the output format of the date:

%Y_%m_%d prints the year, month, and day with underscore separators
%H-%M-%S prints the hours, minutes, and seconds with dash separators
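By default, date formats the timestamp in the local time zone, so the same timestamp can produce different strings on different machines. The following sketch, assuming GNU date, pins the output to UTC with the --utc option. The epoch_to_stamp name is a hypothetical helper of our own:

```shell
# epoch_to_stamp is a hypothetical helper: seconds since the epoch -> UTC stamp
epoch_to_stamp() {
    date --utc --date="@$1" "+%Y_%m_%d-%H-%M-%S"
}

# The creation timestamp that stat reported for disk.json earlier
epoch_to_stamp 1689992509   # prints 2023_07_22-02-21-49
```

Dropping --utc restores the local-time behavior used in the rest of the examples.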

In summary, we explored the usage of the date and stat commands to get the file attributes.

2.2. Renaming the File

Now, combining the above commands, we can rename files based on their creation date using the mv command:

$ mv ind.json ind_$(date --date=@$(stat --format "%W" ind.json) "+%Y_%m_%d-%H-%M-%S").json
$ mv disk.json disk_$(date --date=@$(stat --format "%W" disk.json) "+%Y_%m_%d-%H-%M-%S").json
$ ls
ind_2023_07_20-05-13-27.json disk_2023_07_22-07-51-49.json

As we can see above, the command performs several actions:

stat --format "%W" ind.json gets the creation time of the file as a UNIX timestamp
date --date=@$(stat --format "%W" ind.json) "+%Y_%m_%d-%H-%M-%S" converts the timestamp to a human-readable string in the local time zone, which becomes part of the filename
mv renames the file

Finally, we check via ls that both files have their new names.

In summary, we used the creation date of a file to generate a new filename and rename the file.
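The one-liners above repeat the filename by hand and would overwrite an existing target. As a hedged sketch, assuming GNU stat, date, and mv, the same steps can be wrapped in a function that splits off the extension and refuses to clobber existing files. The rename_with_date name is hypothetical:

```shell
# rename_with_date is a hypothetical helper: name.ext -> name_<stamp>.ext
rename_with_date() {
    local file=$1 dir base stem ext epoch stamp
    dir=$(dirname -- "$file")
    base=$(basename -- "$file")
    # Split at the last dot; dotfiles and names without a dot keep no extension
    case $base in
        ?*.*) stem=${base%.*}; ext=.${base##*.} ;;
        *)    stem=$base;      ext= ;;
    esac
    epoch=$(stat --format "%W" -- "$file") || return 1
    # %W is 0 when the birth time is unknown; fall back to %Y in that case
    [ "$epoch" -gt 0 ] 2>/dev/null || epoch=$(stat --format "%Y" -- "$file")
    stamp=$(date --date="@$epoch" "+%Y_%m_%d-%H-%M-%S")
    # -n (no-clobber) refuses to overwrite an existing target
    mv -n -- "$file" "$dir/${stem}_${stamp}${ext}"
}
```

With this in place, rename_with_date ind.json and rename_with_date disk.json perform the same renames as the two mv commands.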

3. Author of a PDF File

To get the author of a PDF file and rename the file based on that, we’ll use the pdfinfo command:

$ pdfinfo pdf_sample.pdf
Title: PDF Bookmark Sample
Author: Accelio Corporation
[...]

Here, we see the Author line. Let’s extract it via grep:

$ pdfinfo pdf_sample.pdf | grep "Author" 
Author: Accelio Corporation

Now, we remove the field name via awk:

$ pdfinfo pdf_sample.pdf | 
          grep "Author" | 
          awk -F':' -e '{ gsub(/^\s+|\s+$/,"",$2); print $2 }'
Accelio Corporation

Then, we replace spaces with underscores via tr to ensure compatibility of the name with more tools:

$ pdfinfo pdf_sample.pdf | 
          grep "Author" | 
          awk -F':' -e '{ gsub(/^\s+|\s+$/,"",$2); print $2 }' |
          tr -s ' ' '_'
Accelio_Corporation
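Spaces aren't the only characters that can cause trouble in a filename: a metadata value might also contain slashes or colons. As an illustrative sketch, not part of the original pipeline, we can use the -c (complement) and -s (squeeze) options of tr to replace every run of characters outside a conservative set with a single underscore. The slugify name is hypothetical:

```shell
# slugify is a hypothetical helper: keep letters, digits, dot, underscore and
# dash, and replace every run of other characters with a single underscore
slugify() {
    # printf '%s' avoids a trailing newline, which tr would also turn into "_"
    printf '%s' "$1" | tr -cs 'A-Za-z0-9._-' '_'
    printf '\n'
}

slugify 'Accelio Corporation'   # prints Accelio_Corporation
```

Whether such a strict character set is appropriate depends on the tools that consume the filenames later.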

Finally, we rename the file by encapsulating the above via command substitution and check the result with ls:

$ mv pdf_sample.pdf $(
  pdfinfo pdf_sample.pdf |
    grep "Author" |
    awk -F':' -e '{ gsub(/^\s+|\s+$/,"",$2); print $2 }' |
    tr -s ' ' '_'
  )_pdf_sample.pdf
$ ls
Accelio_Corporation_pdf_sample.pdf

As we can see above, the final command performs several actions to rename the file based on the author of the PDF file:

pdfinfo pdf_sample.pdf | grep "Author" extracts the Author line from the output of the pdfinfo command
awk -F':' -e '{ gsub(/^\s+|\s+$/,"",$2); print $2 }' selects the second colon-separated field and trims its leading and trailing whitespace
tr -s ' ' '_' substitutes the intermediate spaces with an _ underscore
mv renames the file
ls confirms the new filename

To emphasize, we used a pipeline of commands to extract the author of the PDF file, create a file name, and finally, rename the file.
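The grep and awk steps can also be folded into a single filter. The following is a sketch under our own assumptions: author_slug is a hypothetical function that reads pdfinfo-style output on standard input, so we can try it without a PDF file at hand. Because it strips only the Author: label, an author name that itself contains a colon stays intact:

```shell
# author_slug is a hypothetical filter: it reads pdfinfo-style output on stdin
# and prints the author with spaces replaced by underscores
author_slug() {
    sed -n 's/^Author:[[:space:]]*//p' |   # keep only the value of the Author line
        head -n 1 |                        # use the first match only
        sed 's/[[:space:]]*$//' |          # drop trailing whitespace
        tr -s ' ' '_'
}

# Possible usage, assuming pdfinfo (poppler-utils) is installed:
#   prefix=$(pdfinfo pdf_sample.pdf | author_slug)
#   [ -n "$prefix" ] && mv -n -- pdf_sample.pdf "${prefix}_pdf_sample.pdf"
```

The check for a non-empty prefix in the usage comment guards against PDF files that have no Author field.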

4. Owner of a Media File

To demonstrate, we use the exiftool command to get the name of a media file owner and rename the file based on that:

$ exiftool canon_400.jpg | head
ExifTool Version Number : 12.40
File Name : canon_400.jpg
Directory : .
File Size : 9.0 KiB
File Modification Date/Time : 2023:09:17 13:11:21+05:30
Owner Name : Jean-Pierre Grignon
[...]

Here, we see the Owner Name line. Again, we extract it with grep:

$ exiftool canon_400.jpg |
    grep "Owner"
Owner Name : Jean-Pierre Grignon

Next, we extract the name alone by splitting out the field via cut:

$ exiftool canon_400.jpg | grep "Owner" | cut -f2 -d ':'
 Jean-Pierre Grignon
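One caveat of -f2 is that it returns only the text between the first and second colon, which truncates values that contain colons themselves, such as the date line in the exiftool output. A short sketch of the difference between -f2 and the open-ended range -f2-, using that line as sample input:

```shell
line='File Modification Date/Time     : 2023:09:17 13:11:21+05:30'

# -f2 keeps only the text between the first and second colon
second_field=$(printf '%s\n' "$line" | cut -d ':' -f2)

# -f2- keeps everything after the first colon
rest_of_line=$(printf '%s\n' "$line" | cut -d ':' -f2-)

printf '[%s]\n' "$second_field" "$rest_of_line"
# prints:
# [ 2023]
# [ 2023:09:17 13:11:21+05:30]
```

For the Owner Name line, both forms give the same result because the value contains no colon.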

Then, we remove any leading whitespace:

$ exiftool canon_400.jpg | 
            grep "Owner" | 
            cut -f2 -d ':' | 
            sed -e 's#^ \+##g'
Jean-Pierre Grignon

Similarly, let’s convert the spaces to underscores:

$ exiftool canon_400.jpg | 
            grep "Owner" | 
            cut -f2 -d ':' | 
            sed -e 's#^ \+##g' | 
            tr -s ' ' '_'
Jean-Pierre_Grignon

Finally, we rename the file and verify:

$ mv canon_400.jpg $(
  exiftool canon_400.jpg |
    grep "Owner" |
    cut -f2 -d':' |
    sed -e 's#^ \+##g' |
    tr -s ' ' '_'
  )_canon_400.jpg
$ ls
Jean-Pierre_Grignon_canon_400.jpg

In summary, the pipeline performs several actions to rename the file based on the owner of the media file:

exiftool canon_400.jpg | grep "Owner" extracts the Owner Name line from the output of the exiftool command
cut -f2 -d':' extracts the second field from the line, using the : colon as the field separator
sed -e 's#^ \+##g' trims spaces from the beginning of the string
tr -s ' ' '_' substitutes the intermediate spaces with an _ underscore
mv renames the file
ls shows the new filename

In particular, we combined the commands cut, sed, and tr to generate the file name. Finally, we applied the name with mv.
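As an alternative sketch, a single awk program can replace the grep, cut, sed, and tr steps. The owner_slug name is hypothetical, and the function reads exiftool-style Tag : value lines on standard input, so it works on saved output as well:

```shell
# owner_slug is a hypothetical filter: it reads exiftool-style output on stdin
# and prints the Owner Name value with whitespace runs replaced by underscores
owner_slug() {
    awk '
        /^Owner Name[ \t]*:/ {
            value = substr($0, index($0, ":") + 1)   # everything after the first colon
            gsub(/^[ \t]+|[ \t]+$/, "", value)       # trim both ends
            gsub(/[ \t]+/, "_", value)               # inner whitespace -> underscore
            print value
            exit                                     # stop after the first match
        }'
}

# Possible usage, assuming exiftool is installed:
#   prefix=$(exiftool canon_400.jpg | owner_slug)
#   [ -n "$prefix" ] && mv -n -- canon_400.jpg "${prefix}_canon_400.jpg"
```

Matching the full Owner Name label also avoids picking up other lines that merely contain the word Owner.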

5. Copying a Collection of Files

We’ll use the find command to get a collection of files based on criteria and copy the files:

$ find /var/log/cups/ -name "*.gz" -printf "%p\n"
/var/log/cups/error_log.2.gz 
/var/log/cups/error_log.6.gz
[...]

Now, let’s check our script that leverages this command:

$ cat rename.sh
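#!/usr/bin/env bash
# Copy each matching log into the current directory with a creation-date prefix
# (the unquoted $(find ...) splits on whitespace, so paths must not contain spaces)
for src in $(find /var/log/cups/ -name "*.gz" -printf "%p\n");
do
    dst=$(basename "$src")
    cp "$src" "$(date --date=@$(stat --format "%W" "$src") "+%Y_%m_%d-%H-%M-%S")_$dst"
done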
$ ./rename.sh
$ ls -1
2023_09_28-10-16-40_error_log.2.gz
2023_07_20-05-02-23_error_log.6.gz

To be clear, the script copies each file that find matches and gives the copy a date-based name:

find /var/log/cups/ searches for files in the given directory
-name "*.gz" restricts the results to files that match the pattern *.gz
-printf "%p\n" prints each result on a separate line
basename strips the directory part from the path in src
the nested date and stat command substitution turns the creation time of the file into a date prefix, and _$dst appends the original filename
cp copies the file to the current directory under the new name

In general, we used a pattern to search for files and copied them to the current directory under new names. This process can be helpful for taking backups or building archives.
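A for loop over the output of find splits paths on whitespace, which breaks for names that contain spaces. As a hedged sketch, assuming GNU stat and date, we can instead let find hand the matched paths directly to an inline script via -exec ... {} +. The copy_with_date_prefix name and its three arguments are our own invention:

```shell
# copy_with_date_prefix is a hypothetical helper, not part of the original script.
# Usage: copy_with_date_prefix SRC_DIR NAME_PATTERN DEST_DIR
copy_with_date_prefix() {
    # find passes the matched paths as arguments to the inline script,
    # so names with spaces or newlines arrive intact
    find "$1" -type f -name "$2" -exec sh -c '
        dest_dir=$1
        shift
        for src in "$@"; do
            epoch=$(stat --format "%W" -- "$src")
            # %W is 0 when the birth time is unknown; use %Y instead
            [ "$epoch" -gt 0 ] 2>/dev/null || epoch=$(stat --format "%Y" -- "$src")
            stamp=$(date --date="@$epoch" "+%Y_%m_%d-%H-%M-%S")
            cp -- "$src" "$dest_dir/${stamp}_$(basename -- "$src")"
        done
    ' sh "$3" {} +
}
```

For instance, copy_with_date_prefix /var/log/cups "*.gz" . would copy the same logs as the script above into the current directory.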

6. Conclusion

In this article, we learned a few ways to rename a file based on the attributes of that file.

Firstly, we combined the stat and date commands to get the file dates. Secondly, we generated a new filename based on those attributes. Thirdly, we explored using the pdfinfo command to extract the author of a PDF file. Similarly, we saw the usage of exiftool to retrieve metadata from a media file and apply it to the name of that file. Finally, we saw how a pipeline of shell commands can generate the filename.

We also explored combining all the above ideas into a script that copies a collection of files under names based on their creation dates.
