1. Overview
The rsync command-line utility enables us to copy and synchronize files and directories, both locally and between hosts.
In this tutorial, we’ll see how to provide the command with options to copy only certain types of files.
2. rsync
Generally, we use rsync to synchronize source and destination directories on Linux filesystems. These directories are either local to our host or located on a remote host.
To achieve synchronization, the command transfers only the parts of each file that differ between the source and the destination. For this reason, rsync is often the preferred way to transfer files because it moves a minimal amount of data, which can make copying files to remote hosts very fast.
When matching names against a filter pattern, rsync chooses between simple string matching and wildcard matching by checking whether the pattern contains any of the wildcard characters ?, *, or [.
3. Using the include Option
We can limit the files transferred by rsync with the include option. As its name implies, this option filters the files transferred and includes files based on the provided pattern. However, the include option is only useful together with the exclude option, because rsync includes everything in the source directory by default.
Let’s transfer only text files from the /source/path/here/ directory to the /destination/path/here/ directory. We quote the patterns so that the shell passes them to rsync unchanged:
$ rsync --include='*.txt' --exclude='*' /source/path/here/* /destination/path/here/
Here, the order of the options is important. Both options are filters, and rsync checks each file against them in the order given, acting on the first rule that matches. First, we use a filter representing the files we want, and then we exclude everything else. As a result, all text files are selected for transfer, and every other file is excluded.
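To see why the order matters, the following sketch runs both orderings in a throwaway sandbox. The directory and file names (work, right, wrong, notes.txt, image.png) are hypothetical and only serve the illustration:

```shell
# Build a temporary sandbox so no real data is touched (all names are hypothetical).
work=$(mktemp -d)
mkdir -p "$work/src" "$work/right" "$work/wrong"
touch "$work/src/notes.txt" "$work/src/image.png"

# Include first, then exclude: notes.txt matches the include rule and is copied.
rsync --include='*.txt' --exclude='*' "$work"/src/* "$work/right/"

# Exclude first: every file matches '*' before the include rule is ever consulted.
rsync --exclude='*' --include='*.txt' "$work"/src/* "$work/wrong/"

ls "$work/right"   # lists notes.txt
ls "$work/wrong"   # lists nothing
```

Because the first matching rule wins, the second command copies nothing at all.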
We can use the --include option multiple times with different filters:
$ rsync --include='*.txt' --include='*.log' --exclude='*' /source/path/here/* /destination/path/here/
This command transfers all the text and log files from our source directory to the specified destination.
Finally, to copy matching files from all our sub-directories as well, we tell the command to recurse. The -a (archive) option already implies -r, so combining them as -ar is harmless but redundant. The --include='*/' rule lets rsync descend into every directory:
$ rsync -ar --include='*/' --include='*.txt' --exclude='*' /source/path/here/* /destination/path/here/
However, this copies every sub-directory, including those that contain no matching files, so we can also use the --prune-empty-dirs (-m) option to prevent this from occurring:
$ rsync -ar --prune-empty-dirs --include='*/' --include='*.txt' --exclude='*' /source/path/here/* /destination/path/here/
Thus, we avoid empty directories.
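The effect of pruning can be checked with a small sandbox. This is a minimal sketch with hypothetical directory names (docs, images, with_empty, pruned); it uses a trailing-slash source instead of a shell glob, which behaves the same way for this purpose:

```shell
# Temporary sandbox with one directory that holds no .txt files (names are hypothetical).
work=$(mktemp -d)
mkdir -p "$work/src/docs" "$work/src/images" "$work/with_empty" "$work/pruned"
touch "$work/src/readme.txt" "$work/src/docs/guide.txt" "$work/src/images/logo.png"

# Without -m: images/ is created at the destination although nothing inside it matches.
rsync -a --include='*/' --include='*.txt' --exclude='*' "$work/src/" "$work/with_empty/"

# With -m (--prune-empty-dirs): directories that would end up empty are left out.
rsync -a -m --include='*/' --include='*.txt' --exclude='*' "$work/src/" "$work/pruned/"

# Compare the two destination trees.
find "$work/with_empty" "$work/pruned" | sort
```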
4. Using the filter Option
The include and exclude options are simplified forms of the filter (-f) option. The filter option enables more complex filtering rules, including both inclusion and exclusion patterns.
Each filtering rule specified with the --filter option consists of a prefix and a matching pattern. The prefix determines whether the rule is inclusive (+) or exclusive (-), while the matching pattern defines the criteria for the file or directory selection.
Let’s copy all files from a directory except one, file1.txt:
$ rsync -ar --filter '- file1.txt' /source/path/here/* /destination/path/here/
This operation employs the --filter option with the rule '- file1.txt', where - specifies the exclusion of the pattern that follows, in this case, a file named file1.txt.
When using the --filter option, the default for any unmatched file and directory is to be included in the synchronization.
Similar to the --include option, we can use as many --filter (-f) options as we need to build up a list of files to include or exclude:
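$ rsync -ar -f'+ dir1/***' -f'+ dir2/' -f'+ dir2/file.sh' -f'- *' /source/path/here/* /destination/path/here/
This command copies the dir1 directory with all its contents and the dir2/file.sh file while excluding the rest ('- *') from the source path. The dir1/*** pattern matches both the directory and everything inside it, and the '+ dir2/' rule is needed because rsync never looks inside a directory that is itself excluded.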
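As a hedged illustration, the sketch below builds a rule set that copies all of dir1 plus a single file from dir2. The extra entries (dir3, other.sh, a.txt, b.txt) are hypothetical, and the rules rely on two properties of rsync filters: a pattern ending in /*** matches a directory and everything in it, and a file can only be included if its parent directory is included too:

```shell
# Temporary sandbox (all file and directory names are hypothetical).
work=$(mktemp -d)
mkdir -p "$work/src/dir1" "$work/src/dir2" "$work/src/dir3" "$work/dst"
touch "$work/src/dir1/a.txt" "$work/src/dir2/file.sh" \
      "$work/src/dir2/other.sh" "$work/src/dir3/b.txt"

# '+ dir1/***' keeps dir1 and its contents; '+ dir2/' lets rsync enter dir2
# so that '+ dir2/file.sh' can match; '- *' drops everything else.
rsync -a -f'+ dir1/***' -f'+ dir2/' -f'+ dir2/file.sh' -f'- *' "$work/src/" "$work/dst/"

# Show which regular files arrived at the destination.
(cd "$work/dst" && find . -type f | sort)
```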
We can clear the filter rules defined so far by adding a rule that consists of just the ! character.
Furthermore, filter rules can also target extended attributes (xattrs): when we copy them with the --xattrs (-X) option, a rule carrying the x modifier is matched against attribute names instead of file names. As a normal user, we can only copy attributes from the user.* namespace. To copy attributes from other namespaces, we need super-user privileges.
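A minimal sketch of the clearing rule, assuming a sandbox with two hypothetical files (notes.txt and app.log). The standalone '!' rule discards the exclusion that precedes it, so only the rule that follows remains in effect:

```shell
# Temporary sandbox (file names are hypothetical).
work=$(mktemp -d)
mkdir -p "$work/src" "$work/dst"
touch "$work/src/notes.txt" "$work/src/app.log"

# '!' clears '- *.txt', so afterwards only '- *.log' applies.
rsync -a -f'- *.txt' -f'!' -f'- *.log' "$work/src/" "$work/dst/"

ls "$work/dst"   # lists notes.txt
```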
5. Other Options
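We can use the --files-from option to specify the exact list of files to copy from the source directory.
The option takes the name of a file that lists the paths to copy, one per line and relative to the source directory. It can also read the list from standard input if we pass - instead of a file name. Let’s use a list file: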
$ rsync -a --files-from=/tmp/new /source/directory/ /destination/directory/
This command copies the files listed in /tmp/new from /source/directory/ to /destination/directory/.
Additionally, we can use the --include-from option with an input file containing include patterns (one per line). We can use the same + and - prefixes as the --filter option to specify the inclusion and exclusion of files:
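Both input methods can be tried in a sandbox. In this sketch, the list file name (list) and the sample files are hypothetical. The paths in the list are relative to the source directory, and rsync recreates the sub/ directory at the destination:

```shell
# Temporary sandbox (all names are hypothetical).
work=$(mktemp -d)
mkdir -p "$work/src/sub" "$work/from_file" "$work/from_stdin"
touch "$work/src/a.txt" "$work/src/b.txt" "$work/src/sub/c.txt"

# Variant 1: read the list of paths from a file.
printf '%s\n' 'a.txt' 'sub/c.txt' > "$work/list"
rsync -a --files-from="$work/list" "$work/src/" "$work/from_file/"

# Variant 2: read the same list from standard input by passing '-'.
printf '%s\n' 'a.txt' 'sub/c.txt' | rsync -a --files-from=- "$work/src/" "$work/from_stdin/"

# b.txt is absent from both destinations because it was not listed.
find "$work/from_file" "$work/from_stdin" -type f | sort
```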
$ cat file/with/pattern/matching/rules
+ *.txt
+ *.c
- *
Let’s use this file with the --include-from option to copy only the files with .txt and .c extensions from the source directory:
$ rsync -ar --include-from=file/with/pattern/matching/rules /source/directory/* /destination/directory/
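Because every rule in the file carries an explicit + or - prefix, we can use the same file (file/with/pattern/matching/rules) with the --exclude-from option to obtain the same results. In contrast, the --files-from option expects a list of paths, not patterns.
In the rules file, lines starting with either ; or # are considered comments and are ignored along with the blank lines.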
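The following sketch writes a rules file that contains both comment styles and a blank line, then applies it. The file name (rules) and the sample files are hypothetical:

```shell
# Temporary sandbox (all names are hypothetical).
work=$(mktemp -d)
mkdir -p "$work/src" "$work/dst"
touch "$work/src/notes.txt" "$work/src/main.c" "$work/src/app.log"

# Rules file: comment lines and the blank line are skipped by rsync.
cat > "$work/rules" <<'EOF'
# keep text files and C sources
+ *.txt

; everything else is skipped
+ *.c
- *
EOF

rsync -a --include-from="$work/rules" "$work/src/" "$work/dst/"

ls "$work/dst"   # lists main.c and notes.txt
```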
6. Versions and Variants
This tutorial is based on version 3.2.7. The latest version of rsync is always available at https://rsync.samba.org.
The order of the include and exclude filter options and the way they are applied depend on the version. Earlier versions may not process files as described in this tutorial. However, versions after 3.0.5 should run as we expect.
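Before relying on the behavior described here, we can check which version is installed. This short sketch prints only the first line of the version banner; the variable name rsync_banner is just an illustrative choice:

```shell
# Capture and print the first line of the version banner.
rsync_banner=$(rsync --version | head -n 1)
echo "$rsync_banner"
```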
7. Conclusion
In this article, we learned how to filter files while copying with the rsync command.
First, we used the include and exclude options to specify the inclusion and exclusion criteria for matching the files and directories. Then, we learned to copy files recursively with the --recursive (-r) option and to skip empty directories with the --prune-empty-dirs (-m) option.
We also practiced the --filter option that provides more advanced filtering rules. Finally, we saw how to provide an explicit list of files with the --files-from option and pattern-matching rules from an input file with the --include-from option.
We can select any option depending on our preferences and needs.