1. Overview
When we connect a USB drive to a Linux system and mount it, its filesystem becomes part of the filesystem hierarchy. This integration allows seamless data transfer between the device and the computer. However, there are pitfalls. For example, if we physically disconnect or turn off a drive without properly unmounting it first, we may lose some data or even corrupt the entire filesystem.
In this tutorial, we’ll explore the basic steps and precautions necessary to safely remove USB drives using the Linux command line.
2. Checking Device Usage and Unmounting
Let’s take the simple case of copying files. By default, Windows uses write-through with USB devices, writing data directly to the device without significant delay. Linux and macOS, on the other hand, use a write-back policy that relies heavily on caching to improve overall performance. That’s why data can remain in the cache instead of being written to the device immediately. As a result, even when the copy process appears to be complete, the data may not be fully transferred to the USB device. In such a scenario, a power failure, crash, or physical disconnection of the device can result in data loss or corruption.
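To make the write-back cache visible, we can look at the Dirty and Writeback counters that the Linux kernel exposes in /proc/meminfo. The following is only a minimal sketch: pending_writes is a hypothetical helper name, and its optional argument exists solely so the function can be tried on a saved copy of the file. The counters are system-wide, so they cover all disks, not just the USB drive:

```shell
# Hypothetical helper: print the amount of cached data that is still waiting
# to be written (Dirty) or is being written right now (Writeback), in kB.
# $1 is optional and defaults to /proc/meminfo.
pending_writes() {
    awk '$1 == "Dirty:" || $1 == "Writeback:" { print $1, $2, $3 }' "${1:-/proc/meminfo}"
}

# Usage: pending_writes
```

If we run pending_writes right after a large copy appears to finish, a Dirty value that is still large suggests that data hasn’t reached the device yet.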
2.1. Identifying the USB Device Using df and lsblk
The df command reports the space usage of mounted filesystems: size, used space, available space, and mount point. The -h option prints the sizes in human-readable units such as G and M.
Let’s look for the device corresponding to our USB drive:
$ df -h
Filesystem Size Used Avail Use% Mounted on
[...]
/dev/dm-3 917G 716G 192G 79% /media/francesco/106bfc11-23d5-49c1-8c10-953cbb082a14
In this example, /dev/dm-3 is our USB device. The dm in dm-3 stands for device mapper, and the number that follows is a sequential identifier. dm-3 doesn’t refer directly to physical drives, like sda or sdb, but rather to a virtual block device that the system uses to handle complex disk operations. We often see dm-X in systems using LVM, encrypted volumes, or other sophisticated storage solutions.
Using lsblk with the -o option followed by a list of columns such as NAME and KNAME, we can see how these devices are stacked:
$ lsblk -o NAME,KNAME,FSTYPE,TYPE,MOUNTPOINT,SIZE
NAME KNAME FSTYPE TYPE MOUNTPOINT SIZE
[...]
sdc sdc disk 931,5G
└─sdc1 sdc1 crypto_LUKS part 931,5G
└─luks-d99ee6e1-7262-4267-ac15-b93674b9f666 dm-3 ext4 crypt /media/francesco/106bfc11-23d5-49c1-8c10-953cbb082a14 931,5G
In this output, KNAME, which stands for “kernel name”, and NAME refer to the same device. So our USB disk has two equivalent device names:
- /dev/dm-3
- /dev/mapper/luks-d99ee6e1-7262-4267-ac15-b93674b9f666
We can easily verify that the second device is a symbolic link to the first:
$ ls -l /dev/dm-3 /dev/mapper/luks-d99ee6e1-7262-4267-ac15-b93674b9f666
brw-rw---- 1 root disk 253, 3 May 1 22:26 /dev/dm-3
lrwxrwxrwx 1 root root 7 May 1 22:26 /dev/mapper/luks-d99ee6e1-7262-4267-ac15-b93674b9f666 -> ../dm-3
We can use whichever one we prefer.
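If it isn’t obvious which physical disk is the USB one, lsblk can also print the transport type of each disk in its TRAN column. As a rough sketch, the hypothetical usb_disks helper below filters that output. It assumes that “NAME TRAN” pairs arrive on standard input, exactly as printed by lsblk -dn -o NAME,TRAN:

```shell
# Hypothetical helper: keep only the disks whose transport type is "usb".
# Expects "NAME TRAN" pairs on stdin, as printed by: lsblk -dn -o NAME,TRAN
usb_disks() {
    awk '$2 == "usb" { print "/dev/" $1 }'
}

# Usage: lsblk -dn -o NAME,TRAN | usb_disks
```

The -d option limits the listing to whole disks, so partitions and device mapper volumes such as dm-3 don’t appear. We’d still need the full lsblk tree to see which volumes sit on top of the reported disk.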
2.2. Checking for Active Usage of the Device
iostat is a tool for monitoring the load on I/O devices. Let’s print the I/O statistics of /dev/dm-3 five times, at two-second intervals:
$ iostat -d /dev/dm-3 2 5 | { head -3 ; grep dm-3 ; }
[...]
Device tps kB_read/s kB_wrtn/s kB_dscd/s kB_read kB_wrtn kB_dscd
dm-3 181,50 72540,00 32,00 0,00 145080 64 0
dm-3 270,00 75490,00 0,00 0,00 150980 0 0
dm-3 227,00 76104,00 6,00 0,00 152208 12 0
dm-3 131,00 68398,00 20480,00 0,00 136796 40960 0
In this case, the high kB_read/s and kB_wrtn/s values show that the disk is actively reading and writing data.
While iostat helps us understand device activity, lsof allows us to identify specific processes that have files open on the /dev/dm-3 mount point, i.e., /media/francesco/106bfc11-23d5-49c1-8c10-953cbb082a14:
$ lsof | { head -1 ; grep /media/francesco/106bfc11-23d5-49c1-8c10-953cbb082a14 ; }
COMMAND PID [...] USER FD [...] NAME
nemo 393849 [...] francesco 23r [...] /media/francesco/[...]/file1.7z
nemo 393849 [...] francesco 24w [...] /media/francesco/[...]/file2.7z
[...]
This output means that nemo, the default file manager of the Cinnamon desktop, is reading file1.7z and writing file2.7z. In this case, it’s just finishing a file copy.
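When many files are open, grepping the full lsof listing gets noisy. As an alternative sketch, lsof can emit machine-readable output with its -F option, where each line starts with a field letter (p for the PID, c for the command, n for the file name). The hypothetical procs_under helper below condenses that output to one line per process. It assumes file names contain no newlines:

```shell
# Hypothetical helper: read `lsof -Fpcn` output on stdin and print
# "PID COMMAND" once for each process that has a file open under directory $1.
procs_under() {
    awk -v dir="$1" '
        /^p/ { pid = substr($0, 2) }    # a new process set begins
        /^c/ { cmd = substr($0, 2) }    # command name of that process
        /^n/ {
            name = substr($0, 2)        # path of one open file
            if ((name == dir || index(name, dir "/") == 1) && !seen[pid]++)
                print pid, cmd
        }
    '
}

# Usage, with the mount point stored in the variable mnt:
#   lsof -Fpcn 2>/dev/null | procs_under "$mnt"
```

An empty result suggests that no process visible to the current user holds files open on the drive. Running lsof as root shows the processes of all users.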
When it’s done, we can check the device I/O statistics again:
$ iostat -d /dev/dm-3 2 5 | { head -3 ; grep dm-3 ; }
Linux 5.15.0-105-generic (asusrog) 05/01/2024 _x86_64_ (8 CPU)
Device tps kB_read/s kB_wrtn/s kB_dscd/s kB_read kB_wrtn kB_dscd
dm-3 27,29 4606,04 3048,04 0,00 449510259 297463396 0
dm-3 66,50 0,00 264,00 0,00 0 528 0
dm-3 0,00 0,00 0,00 0,00 0 0 0
dm-3 0,00 0,00 0,00 0,00 0 0 0
dm-3 0,00 0,00 0,00 0,00 0 0 0
The first line reports averages since boot, so we can ignore it. The last three readings of kB_read/s and kB_wrtn/s are 0, so nothing is currently reading from or writing to the device and we can safely unmount the USB drive.
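Reading the table by eye works, but the same check can be scripted. The sketch below defines a hypothetical last_sample_idle helper that reads iostat -d output on standard input and succeeds only if the last report for the given device shows zero reads and writes. It assumes the column order of the output above (kB_read/s third, kB_wrtn/s fourth) and accepts both comma and dot as decimal separators:

```shell
# Hypothetical helper: exit status 0 if the LAST report for device $1
# shows no reads and no writes, non-zero otherwise.
# Reads `iostat -d` output on stdin.
last_sample_idle() {
    awk -v dev="$1" '
        $1 == dev {
            gsub(",", ".", $3); gsub(",", ".", $4)   # tolerate comma decimals
            idle = ($3 + 0 == 0 && $4 + 0 == 0)      # remember only the newest sample
            seen = 1
        }
        END { if (seen && idle) exit 0; exit 1 }
    '
}

# Usage:
#   iostat -d /dev/dm-3 2 5 | last_sample_idle dm-3 && echo "no I/O in the last interval"
```

A single idle interval is a weaker signal than the three consecutive zero readings above, so a longer run of iostat is the more cautious choice.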
2.3. Safely Unmounting the USB Drive
Running sync before umount is a recommended practice, although not always strictly necessary. sync forces the system to write all cached data that hasn’t yet been written (the so-called dirty buffers) to the drive, so that no writes are still pending when we unmount it. This is especially important to prevent data loss:
$ sync
sync produces no output, but it doesn’t return until all pending writes to all disks have been completed. Therefore, sync may exit immediately or after a few seconds. If it doesn’t seem to exit at all, we need to investigate as in the previous steps.
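On a busy system, a plain sync also waits for every other disk. As a small sketch, and assuming a reasonably recent GNU coreutils in which sync accepts the -f (--file-system) option, we can flush only the filesystem that contains a given path. The name flush_fs is a hypothetical wrapper:

```shell
# Hypothetical wrapper: flush only the filesystem that contains path $1.
# Assumes a sync implementation that supports -f (--file-system).
flush_fs() {
    sync -f "$1"
}

# Usage, with the mount point stored in the variable mnt:
#   flush_fs "$mnt"
```

If the installed sync doesn’t know -f, the plain sync shown above remains the fallback.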
Once the sync operation is complete, let’s unmount the USB device. We can use the umount command followed by the mount point or by one of the two equivalent device names we identified earlier:
$ umount /dev/dm-3
umount detaches the storage device’s filesystem from the computer’s main filesystem, making it safe to physically remove the device. However, as an extra precaution, let’s wait a few seconds to make sure that the disk LED shows no activity.
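Before pulling the cable, we may also want to confirm that the mount point is really gone. The sketch below checks the kernel’s mount table in /proc/mounts. The name is_mounted is a hypothetical helper, its optional second argument only exists so the check can be tried on a saved copy, and it assumes a mount point without spaces, since /proc/mounts escapes them:

```shell
# Hypothetical helper: exit status 0 if directory $1 is listed as a mount
# point in the mount table, non-zero otherwise.
# $2 is optional and defaults to /proc/mounts.
is_mounted() {
    awk -v mnt="$1" '$2 == mnt { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Usage, with the mount point stored in the variable mnt:
#   is_mounted "$mnt" || echo "not mounted anymore"
```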
Finally, it’s worth noting that depending on the configuration of our Linux machine, some of the commands described so far may require root privileges.
3. Conclusion
In this article, we’ve learned the critical steps and precautions for safely removing USB drives under Linux. We’ve seen how to verify device usage and perform proper unmounting procedures to prevent potential data loss or corruption.
It’s important to remember that the integrity of our data depends heavily on strict adherence to these best practices. In addition, we should take every precaution to avoid physical damage, shock to the disks during use, and power outages.