1. Introduction
In this tutorial, we’ll see how to clone the entire file system hierarchy, after introducing all the basic concepts and precautions. Disk cloning generates a copy of one or more partitions or an entire disk.
Cloning tools can help upgrade a disk or replace an aging disk with a fresh one. We can also use them to create identical machines in virtual or physical environments. In the context of backup software, disk cloning is very similar to disk imaging, which creates a copy of a disk inside a disk image file.
Disk cloning can also serve for disaster recovery or forensics.
2. An Overview of Cloning
The two main cloning modes are “block-level cloning” and “file-level cloning”.
2.1. Block-Level Cloning
Block-level cloning copies data by bypassing any interpretation of the logical filesystem structure and copying the drive’s internal block-level organization. It creates an exact one-to-one copy. It’s the more straightforward option in most cases, especially if the destination drive’s size is equal to or greater than the size of the source drive.
Block-level clones copy not only recognized files but also those that have been deleted, corrupted, or otherwise lost from the filesystem. That’s why it’s helpful for forensics.
Typical block-level cloning tools on Linux are dd and ddrescue. The latter is also helpful for recovering data from failing or damaged drives, because it keeps copying past read errors.
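As a minimal sketch of a block-level copy, the following uses dd on two ordinary files that stand in for the source and destination drives. The file names and the 1 MiB size are assumptions made for illustration. On a real system, if= and of= would point at block devices, and a wrong of= overwrites that device irrecoverably.

```shell
# Work in a throwaway directory so that no real disk is touched.
workdir=$(mktemp -d)
src="$workdir/source.img"   # hypothetical stand-in for the source drive
dst="$workdir/clone.img"    # hypothetical stand-in for the destination drive

# Fill the "source drive" with 1 MiB of arbitrary data.
head -c 1048576 /dev/urandom > "$src"

# Copy block by block; dd never interprets any filesystem structure.
dd if="$src" of="$dst" bs=64k 2>/dev/null

# cmp exits with status 0 only when both files are byte-for-byte identical.
if cmp -s "$src" "$dst"; then
    clone_status="identical"
else
    clone_status="different"
fi
echo "clone is $clone_status"

rm -rf "$workdir"
```

Comparing source and clone with cmp is a cheap sanity check after any block-level copy.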
2.2. File-Level Cloning
In file-level cloning, the tool copies data from one drive to another file by file, regardless of the logical filesystems and the files’ physical positions on the source and destination drives. This type of cloning is convenient for changing the partitioning or the file system: for example, to clone from a complex LVM-based partitioning to a simpler one without LVM, or from a LUKS-encrypted partition to an unencrypted one, or vice versa.
Furthermore, file-level cloning lets us keep the clone synchronized with the original disk. Because every file is written afresh on the destination, the copies are also typically less fragmented than the originals.
rsync is a fast, versatile, local and remote file-copying and synchronization tool that can also serve as an excellent file-level cloning tool. Many tools rely on rsync. Timeshift uses rsync and hard links to take snapshots in an incremental approach.
3. Disk Partitions and Partitioning Tools
Disk partitioning is the creation of one or more parts on a disk so that we can manage each of them separately. These parts are called partitions. The disk stores the information about the partitions’ locations and sizes in an area known as the partition table. The standard partition table is GPT (GUID Partition Table).
Each partition has a UUID (Universally Unique Identifier), which helps identify it in /etc/fstab when other factors used to locate it, such as the device name, might change. We can see the list of all UUIDs with blkid.
Note that GPT is a newer partitioning standard than MBR and lifts many of its limitations. For example, MBR allows only four primary partitions per drive and, with 512-byte sectors, can’t address drives larger than 2 TiB. GPT, by contrast, allows 128 partitions per drive by default and supports drives larger than one billion terabytes.
The easiest way to manage partitions on Linux is GParted, a graphical partition editor included on many live CDs/DVDs. It works well in simple cases. On the other hand, if we need to manipulate partitions from the command line, we can consult our “Partitioning Disks in Linux” guide.
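These size limits follow from the width of the sector addresses: 32 bits in MBR and 64 bits in GPT. The arithmetic below is a small worked example that assumes 512-byte logical sectors; with larger sectors, both limits grow accordingly.

```shell
sector_size=512   # assumed logical sector size in bytes (2^9)

# MBR: 32-bit sector addresses, so 2^32 addressable sectors.
mbr_limit_bytes=$(( (1 << 32) * sector_size ))
mbr_limit_tib=$(( mbr_limit_bytes >> 40 ))       # 1 TiB = 2^40 bytes
echo "MBR limit: $mbr_limit_bytes bytes = $mbr_limit_tib TiB"

# GPT: 64-bit sector addresses. 2^64 * 512 overflows 64-bit shell arithmetic,
# so we add exponents instead: 2^64 sectors * 2^9 bytes = 2^73 bytes.
gpt_limit_exp=$(( 64 + 9 ))
gpt_limit_zib=$(( 1 << (gpt_limit_exp - 70) ))   # 1 ZiB = 2^70 bytes
echo "GPT limit: 2^$gpt_limit_exp bytes = $gpt_limit_zib ZiB"
```

The GPT result of 8 ZiB is roughly 9.4 billion terabytes, which is where the "larger than one billion terabytes" figure comes from.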
4. Migrating an Operating System From One File System to Another Using rsync
Let’s see what precautions to follow, what rsync parameters to use, and what actions to perform after cloning.
4.1. General Rules in File-Level Cloning
Before we look in detail at the rsync options to use, there are some guidelines we should keep in mind.
Cloning a running system can cause unpredictable failures and side effects, so to clone safely, we need a live Linux system that isn’t installed on either the source or the destination drive. We use that live system to mount both the source and destination partitions. Consequently, if the destination disk is empty, we must first create a GPT partition table on it, partition it, and create the file systems.
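Preparing an empty destination disk from the command line could look like the sketch below. The device name /dev/sdX, the single ext4 partition spanning the whole disk, and the run helper are assumptions for illustration; a UEFI system would also need an EFI System Partition. Because these commands destroy existing data, the sketch only prints them (DRY_RUN=1) instead of executing them.

```shell
# With DRY_RUN=1 each command is only printed; set it to 0 to really execute.
DRY_RUN=1
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

dest_disk=/dev/sdX          # hypothetical destination disk; verify with lsblk first
dest_part="${dest_disk}1"   # first partition on that disk

prepare_destination() {
    run parted -s "$dest_disk" mklabel gpt                     # new, empty GPT
    run parted -s "$dest_disk" mkpart primary ext4 1MiB 100%   # one partition spanning the disk
    run mkfs.ext4 "$dest_part"                                 # create the file system
    run mkdir -p /mnt/destPart
    run mount "$dest_part" /mnt/destPart                       # mount it for the later copy
}

plan=$(prepare_destination)
echo "$plan"
```

Reviewing the printed plan before switching DRY_RUN to 0 helps catch a wrong device name while it’s still harmless.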
If the source file system hierarchy includes mount points on different partitions, we should clone them separately. In many cases, /boot is on a small partition, so after cloning the root directory, we’ll have to run rsync a second time to clone /boot.
We must preserve permissions, keep file types, and avoid copying virtual pseudo file systems (/dev, /proc, and /sys). Cloning the swap partition or the swap file is always superfluous.
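One way to make the exclusion of pseudo file systems explicit is rsync’s --exclude option with patterns anchored at the root of the transfer. The sketch below runs on a throwaway directory tree with made-up file names, and it skips the copy if rsync isn’t installed.

```shell
workdir=$(mktemp -d)
src="$workdir/sourcePart"   # stand-in for the mounted source partition
dst="$workdir/destPart"     # stand-in for the mounted destination partition
mkdir -p "$src/etc" "$src/dev" "$src/proc" "$src/sys" "$dst"
echo "demo-host" > "$src/etc/hostname"
echo "stale" > "$src/dev/leftover"
echo "stale" > "$src/proc/leftover"

if command -v rsync >/dev/null 2>&1; then
    # '/dev/*' skips the directory's content but keeps the empty directory,
    # which the cloned system still needs as a mount point.
    rsync -a --exclude='/dev/*' --exclude='/proc/*' --exclude='/sys/*' "$src/" "$dst"
    copied=$(cd "$dst" && find . -type f | sort)
    dev_dir_kept=$( [ -d "$dst/dev" ] && echo yes || echo no )
else
    copied="rsync not installed"
    dev_dir_kept="rsync not installed"
fi
echo "files copied: $copied"
echo "/dev directory kept: $dev_dir_kept"

rm -rf "$workdir"
```

Only etc/hostname reaches the destination, while dev, proc, and sys arrive as empty directories.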
File-level cloning can convert between file systems, for instance from ext3 to ext4. The destination file system has to support all the functionalities of the source file system; otherwise, we can get unexpected side effects. For instance, cloning from ext4 to FAT32 loses Unix permissions, ownership, and symbolic links.
After the cloning, we have to modify /etc/fstab according to the new partitioning. Since file-level cloning doesn’t copy the boot sector, we also need to reinstall Grub.
If the machines use static IPs instead of DHCP, let’s remember to change the network card configuration of the cloned machine.
4.2. Best rsync Options to Clone
Supposing that /mnt/sourcePart is the mount point of the partition to be cloned, and /mnt/destPart that of the target partition, let’s proceed with the cloning. Note that this operation requires root permissions:
# rsync -axHAWXS --numeric-ids --info=progress2 /mnt/sourcePart/ /mnt/destPart
Let’s be careful to include the final slash in /mnt/sourcePart/. Otherwise, rsync will copy the sourcePart directory itself, creating /mnt/destPart/sourcePart, instead of copying that directory’s content.
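The effect of the trailing slash is easy to check on a throwaway directory. In this sketch, the directory and file names are made up, and the copy is skipped if rsync isn’t installed.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/sourcePart" "$workdir/withSlash" "$workdir/withoutSlash"
echo "data" > "$workdir/sourcePart/file.txt"

if command -v rsync >/dev/null 2>&1; then
    rsync -a "$workdir/sourcePart/" "$workdir/withSlash"      # copies the content
    rsync -a "$workdir/sourcePart"  "$workdir/withoutSlash"   # copies the directory itself
    with_slash=$(cd "$workdir/withSlash" && find . -type f)
    without_slash=$(cd "$workdir/withoutSlash" && find . -type f)
else
    with_slash="rsync not installed"
    without_slash="rsync not installed"
fi
echo "with slash:    $with_slash"
echo "without slash: $without_slash"

rm -rf "$workdir"
```

With the slash, file.txt lands directly in the destination; without it, the file ends up one level deeper, under sourcePart/.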
In detail:
- -a is the archive mode: it recurses into directories, copies symlinks as symlinks, and preserves permissions, owner, group, modification times, device files, and special files.
- -x doesn’t cross filesystem boundaries.
- -H preserves hard links.
- -A preserves ACLs.
- -W disables the delta-transfer algorithm used to reduce network usage. It’s a convenient way to boost speed when both the source and destination are local paths.
- -X updates the destination extended attributes to be the same as the source ones.
- -S tries to handle sparse files efficiently so that they take up less space on the destination.
- --numeric-ids uses numeric IDs instead of trying to map them. It’s notably needed for backups of jailed systems (BSD jails, OpenVZ, VServer, LXC) that appear to have bogus IDs when seen from their host system because they have their own ID maps.
- --info=progress2 outputs statistics based on the whole transfer, rather than individual files.
4.3. Fixing /etc/fstab
The /etc/fstab file typically lists all available disk partitions to be automatically mounted at boot time. After the cloning, we have to manually edit /etc/fstab to update the partitions’ UUIDs, which we can get using GParted or blkid. Generally, this is sufficient unless we need to add or remove partitions from the list, or change the file system types, to match the new configuration.
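The UUID update can also be scripted. The sketch below works on a sample file rather than the real /etc/fstab, and both UUIDs are made up; on a real system, the new value could come from blkid -s UUID -o value followed by the device name of the new partition.

```shell
workdir=$(mktemp -d)
fstab="$workdir/fstab"   # stand-in for /mnt/destPart/etc/fstab

# Sample entry carried over from the source disk (made-up UUID).
cat > "$fstab" <<'EOF'
UUID=11111111-1111-1111-1111-111111111111 / ext4 errors=remount-ro 0 1
EOF

old_uuid=11111111-1111-1111-1111-111111111111
new_uuid=22222222-2222-2222-2222-222222222222   # made-up UUID of the new partition

# Keep a backup, then write a copy in which the old UUID is replaced.
cp "$fstab" "$fstab.bak"
sed "s/UUID=$old_uuid/UUID=$new_uuid/" "$fstab.bak" > "$fstab"

new_entry=$(cat "$fstab")
echo "$new_entry"

rm -rf "$workdir"
```

Keeping the .bak copy makes it easy to revert if the cloned system fails to mount a partition at boot.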
Let’s remember that the boot partition must have the boot flag. If it’s missing, we can add it with GParted.
4.4. Restoring Grub
Whichever operating system we cloned, the easiest way to restore Grub is boot-repair, a graphical tool available on live CDs/DVDs. It’s a one-click approach: we just have to click the “Recommended repair” button. If the repair is unsuccessful, the software provides a detailed log and instructions for asking for help.
After the restore, we may want to change the default options, for example by removing the “quiet splash” options, which aren’t useful in a server environment. To do so, we manually edit the /etc/default/grub file and then run update-grub so that the changes take effect.
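Where no graphical tool is available, a common command-line alternative to boot-repair is to reinstall Grub from a chroot into the cloned system. The sketch below isn’t what boot-repair does internally; it assumes a BIOS-style installation onto the hypothetical disk /dev/sdX, a Debian-style system that provides update-grub, and the clone mounted at /mnt/destPart. It only prints the commands (DRY_RUN=1), since they require root and modify the boot sector.

```shell
# With DRY_RUN=1 each command is only printed; set it to 0 to really execute.
DRY_RUN=1
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

target=/mnt/destPart   # mount point of the cloned root file system
boot_disk=/dev/sdX     # hypothetical disk that should receive the boot loader

reinstall_grub() {
    # Make the live system's pseudo file systems visible inside the chroot.
    for fs in dev proc sys; do
        run mount --bind "/$fs" "$target/$fs"
    done
    run chroot "$target" grub-install "$boot_disk"   # write the boot loader
    run chroot "$target" update-grub                 # regenerate the Grub menu
    # Release the bind mounts in reverse order.
    for fs in sys proc dev; do
        run umount "$target/$fs"
    done
}

grub_plan=$(reinstall_grub)
echo "$grub_plan"
```

On a UEFI system, grub-install needs different arguments and a mounted EFI System Partition, so the printed plan would have to be adapted first.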
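Removing “quiet splash” can be done with a one-line substitution. This sketch edits a sample file standing in for /etc/default/grub, with illustrative contents, so it’s safe to run anywhere.

```shell
workdir=$(mktemp -d)
grub_cfg="$workdir/grub"   # stand-in for /etc/default/grub

cat > "$grub_cfg" <<'EOF'
GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"
EOF

# Keep a backup, then empty the default kernel command line.
cp "$grub_cfg" "$grub_cfg.bak"
sed 's/^GRUB_CMDLINE_LINUX_DEFAULT=.*/GRUB_CMDLINE_LINUX_DEFAULT=""/' "$grub_cfg.bak" > "$grub_cfg"

cmdline=$(grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$grub_cfg")
echo "$cmdline"

rm -rf "$workdir"
```

On the real file, the edit only takes effect once update-grub has regenerated the boot menu.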
5. Conclusion
In this tutorial, we’ve introduced the basic concepts of cloning and all the essential things to keep in mind. We’ve seen the best rsync parameters to perform file-level cloning.
Finally, we took care of editing /etc/fstab and restoring Grub to make the cloned operating system bootable and working.