1. Overview
Redundant Array of Independent Disks (RAID) is a technology that combines multiple physical drives into a single logical unit, called a RAID array, to improve data redundancy, performance, or both. RAID configurations are widely used on Linux systems.
However, we may sometimes find that our RAID array undergoes an automatic resynchronization, commonly known as a resync.
In this tutorial, we’ll explore the reasons behind this phenomenon and discuss various methods to manage and disable automatic resyncs.
2. Automatic RAID Resynchronization Reasons
RAID resync is a process that restores data integrity and redundancy after events such as disk failures or system crashes. It rebuilds and synchronizes data across the members of the RAID array so that the redundant copies or parity agree again.
Several events can trigger a resync operation:
- disk replacement
- system reboot
- scheduled checks
In this section, we’ll explore these reasons in detail.
2.1. Disk Replacement
A common trigger for an automatic resync is the addition of a new disk to the RAID array. This typically happens when we replace a failed disk with a new one.
Let’s check the status of a RAID array:
$ sudo mdadm --detail /dev/md0
We use sudo to run the command with superuser privileges. The mdadm command manages and monitors software RAID arrays in Linux, and its --detail option prints detailed information about the given array.
Let’s have a closer look at the output:
...
Number Major Minor RaidDevice State
0 8 2 0 active sync /dev/sda2
1 8 18 1 active sync /dev/sdb2
2 8 34 2 active resync /dev/sdc2
In this example, three partitions, sda2, sdb2, and sdc2, are active members of the RAID array. The State column shows the current state of each member: unlike sda2 and sdb2, which are in sync, sdc2 is resyncing.
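On arrays with many members, it can help to filter this table down to the devices that aren’t fully in sync. The following is a minimal sketch, assuming the device rows look like the ones above; non_synced_members is a hypothetical helper name, not part of mdadm:

```shell
# List member devices whose state does not begin with "active sync".
# Reads `mdadm --detail` output on stdin; non_synced_members is a hypothetical helper.
non_synced_members() {
    awk '
        # Device rows start with a numeric slot number and end with a /dev path.
        $1 ~ /^[0-9]+$/ && $NF ~ /^\/dev\// {
            state = ""
            # Fields 5 .. NF-1 hold the state words, e.g. "active resync".
            for (i = 5; i < NF; i++) state = state (state ? " " : "") $i
            if (state !~ /^active sync/) print $NF ": " state
        }
    '
}
```

We could then pipe the real output into it, for example with sudo mdadm --detail /dev/md0 | non_synced_members.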
2.2. System Reboot
Following a system reboot, the RAID array may undergo an automatic resync to ensure data consistency.
Let’s monitor the resync status:
$ cat /proc/mdstat
We use the cat command to display the content of the /proc/mdstat file from the /proc pseudo-filesystem. In essence, this file holds valuable information regarding active software RAIDs.
Next, we move on to the output of the above command:
md0 : active raid1 sdb2[1] sda2[0]
1953511936 blocks super 1.2 [2/2] [UU]
[==>..................] resync = 12.5% (244618752/1953511936) finish=149.0min speed=76524K/sec
bitmap: 15/15 pages [60KB], 65536KB chunk
Here, the output provides detailed information about the RAID array called md0. In particular, it indicates that the resync is 12.5% complete, with a projected finish time of 149.0 minutes, and it’s currently operating at a speed of 76,524 KB/sec.
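If we only care about the progress figure, we can extract it from the same file. This is a rough sketch that assumes the progress line has the layout shown above; sync_progress is a hypothetical helper name:

```shell
# Print "<array> <action> <percent>" for every array with an operation in progress.
# Reads /proc/mdstat-style text on stdin; sync_progress is a hypothetical helper.
sync_progress() {
    awk '
        # An array header looks like: "md0 : active raid1 sdb2[1] sda2[0]".
        $2 == ":" && $1 ~ /^md/ { array = $1 }
        # A progress line looks like: "[==>....]  resync = 12.5% (...) finish=...".
        /(resync|recovery|check|repair|reshape) *=/ {
            for (i = 1; i < NF; i++)
                if ($(i + 1) == "=") print array, $i, $(i + 2)
        }
    '
}
```

Running sync_progress < /proc/mdstat on the array above would be expected to report md0 together with the resync action and its percentage.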
2.3. Scheduled Checks
Aside from hardware changes, scheduled tasks like cron jobs and systemd timers can trigger resyncs.
For example, the mdcheck tool performs periodic checks to ensure data integrity:
$ systemctl start mdcheck_start
Here, systemctl starts the mdcheck_start service, which launches a consistency check of the array. Like a resync, the check reads through the whole array, so it might take an extended period, especially with large amounts of data.
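To tell whether an array is currently running a scheduled check or a resync, we can read its sync_action attribute in sysfs. Below is a small sketch; current_sync_action is a hypothetical helper, and its second argument is an assumption added only so that the function can be pointed at a test directory:

```shell
# Report what an md array is doing right now by reading its sync_action attribute.
# current_sync_action is a hypothetical helper; the optional second argument
# overrides the sysfs root and exists only to make the function easy to try out.
current_sync_action() {
    array=${1:-md0}
    sysfs_root=${2:-/sys/block}
    action_file="$sysfs_root/$array/md/sync_action"
    if [ -r "$action_file" ]; then
        cat "$action_file"
    else
        echo "unknown"
        return 1
    fi
}
```

For instance, current_sync_action md0 typically prints values such as idle, check, or resync, depending on what the array is doing.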
3. Managing RAID resync
In this section, we’ll understand how to manage RAID resync operations by using mdadm.
3.1. Forcing resync
Sometimes, we need to force the resync process manually.
To do so, we first stop the array with the mdadm tool:
$ sudo mdadm --stop /dev/md0
mdadm: stopped /dev/md0
Here, --stop deactivates the running RAID array, so it’s no longer accessible or actively used by the system. As a result, the underlying member devices are released and can be accessed individually again.
Now, let’s force a manual resync:
$ sudo mdadm --assemble --run --force --update=resync /dev/md0 /dev/sda2 /dev/sdb2 /dev/sdc2
mdadm: /dev/md0 has been started with 3 drives.
In this example, we reassembled and started the RAID array with three drives and forced a resync. Let’s break down the options:
- --assemble assembles a previously created array from the listed member devices
- --run starts the array even if fewer devices are available than expected
- --force assembles the array even if the metadata on some devices appears out of date
- --update=resync marks the array as dirty during assembly, which causes md to run a resync pass to ensure data consistency and redundancy
Finally, let’s validate the above operations:
$ cat /proc/mdstat
[===>.................] resync = 19.8% (414656/2095040) finish=0.2min speed=138218K/sec
The above message indicates that the RAID array is resynchronizing data to ensure integrity and redundancy.
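Rather than re-running cat by hand, a script can wait until the operation finishes. The loop below is only a sketch under the assumption that a running operation always shows up as a progress line in /proc/mdstat; wait_for_sync is a hypothetical helper name:

```shell
# Block until the mdstat file no longer reports a running operation.
# wait_for_sync is a hypothetical helper; the file and interval arguments are
# optional and exist mainly to make the loop easy to try out.
wait_for_sync() {
    mdstat_file=${1:-/proc/mdstat}
    interval=${2:-30}
    # Keep polling while any resync, recovery, check, repair, or reshape is listed.
    while grep -Eq '(resync|recovery|check|repair|reshape) *=' "$mdstat_file"; do
        sleep "$interval"
    done
    echo "no sync operation in progress"
}
```

For a single array, mdadm itself offers a similar facility: sudo mdadm --wait /dev/md0 returns once any resync, recovery, or reshape activity on that array has finished.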
3.2. Speeding Up resync
If we want to expedite the resync process, we can raise the resync speed limits:
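$ echo 100000 | sudo tee /proc/sys/dev/raid/speed_limit_min
$ echo 200000 | sudo tee /proc/sys/dev/raid/speed_limit_max
Here, we set the minimum speed limit to 100000 KB/s and the maximum to 200000 KB/s. We pipe the values through sudo tee because a plain shell redirection would run with our own privileges rather than root’s. Suitable values vary from one environment to another, depending on the capabilities of the disks and the system.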
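Before and after changing the limits, it’s useful to see the values currently in effect. Here is a minimal sketch; show_raid_speed_limits is a hypothetical helper, and its optional directory argument is an assumption added for testing:

```shell
# Print the current md speed limits in KB/s.
# show_raid_speed_limits is a hypothetical helper; the directory argument
# defaults to the real procfs location.
show_raid_speed_limits() {
    raid_dir=${1:-/proc/sys/dev/raid}
    for name in speed_limit_min speed_limit_max; do
        printf '%s=%s\n' "$name" "$(cat "$raid_dir/$name")"
    done
}
```

The same values are exposed as the sysctl keys dev.raid.speed_limit_min and dev.raid.speed_limit_max. Values written to /proc don’t survive a reboot, so a persistent change would normally go into the system’s sysctl configuration.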
3.3. Disabling Through mdadm
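We can also use mdadm to turn off the write-intent bitmap by setting it to none:
$ sudo mdadm --grow --bitmap=none /dev/md0
This command removes the write-intent bitmap, which tracks the regions of the array that may need resyncing. However, without a bitmap, md can’t limit a resync to the regions that changed, so any resync that does occur has to cover the whole array.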
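To confirm whether an array currently has a bitmap, we can look for the bitmap line that /proc/mdstat prints under the array, as in the earlier output. This is a sketch that assumes that layout; has_bitmap is a hypothetical helper name:

```shell
# Succeed if the named array shows a "bitmap:" line in /proc/mdstat-style input.
# has_bitmap is a hypothetical helper.
has_bitmap() {
    awk -v target="$1" '
        # Remember which array the following lines belong to.
        $2 == ":" && $1 ~ /^md/ { current = $1 }
        current == target && $1 == "bitmap:" { found = 1 }
        END { exit !found }
    '
}
```

For example, has_bitmap md0 < /proc/mdstat && echo "bitmap present" reports whether md0 still has one. If we later want the bitmap back, sudo mdadm --grow --bitmap=internal /dev/md0 adds an internal bitmap again.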
3.4. Stopping mdcheck
To stop the timer responsible for automatic resyncs, we’ll need to identify the specific timer unit associated with RAID checks:
$ systemctl list-timers
NEXT                        LEFT                LAST                        PASSED        UNIT                         ACTIVATES
Mon 2023-10-31 02:22:14 UTC 1h 3min left        Sun 2023-10-30 02:22:14 UTC 22h ago       systemd-tmpfiles-clean.timer systemd-tmpfiles-clean.service
Sun 2023-11-05 22:17:28 PDT 2 weeks 5 days left Sun 2023-06-05 21:31:43 PDT 1 day 10h ago mdcheck_start.timer          mdcheck_start.service
Stopping the timer halts the scheduled RAID checks, including any resync operations they might lead to.
We should do this with caution and only if we have a specific reason:
$ sudo systemctl stop mdcheck_start.timer
$ systemctl list-timers
NEXT                        LEFT         LAST                        PASSED  UNIT                         ACTIVATES
Mon 2023-10-31 02:22:14 UTC 1h 3min left Sun 2023-10-30 02:22:14 UTC 22h ago systemd-tmpfiles-clean.timer systemd-tmpfiles-clean.service
In the above snippet, we stopped the mdcheck_start timer and then verified that it no longer appears in the list of active timers. Notably, systemctl stop only lasts until the next boot if the timer unit is enabled, whereas systemctl disable prevents it from being started at boot.
With the timer stopped, the scheduled consistency checks that verify data integrity no longer run. As a result, data inconsistencies or errors could go undetected and potentially lead to data loss.
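On a system with many timers, we can narrow the listing down to the md-related units. The filter below is a sketch that assumes such units have names starting with md and ending in .timer, as mdcheck_start.timer does; md_timers is a hypothetical helper name:

```shell
# Print the names of md-related timer units found in `systemctl list-timers` output.
# Reads the listing on stdin; md_timers is a hypothetical helper, and the
# "md*.timer" naming pattern is an assumption based on mdcheck_start.timer.
md_timers() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^md.*\.timer$/) print $i
    }'
}
```

For example, systemctl list-timers --all | md_timers also covers timers that are currently inactive. To resume the scheduled checks later, we can start the unit again with sudo systemctl start mdcheck_start.timer.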
4. resync Theory
To catch problems early, mdcheck is usually scheduled to run at least once a month.
While consistency checks are essential for data safety, they’re different from resync operations. A check reads the array and reports inconsistencies, whereas a resync reconstructs and synchronizes data across the RAID array. Resyncs are triggered by hardware changes or system events, and they act as a safety measure against potential data loss.
Finally, regularly checking and monitoring the status of our RAID array is good practice for keeping our data safe.
5. Conclusion
In this article, we got a comprehensive overview of automatic RAID resyncs.
We learned that while it’s possible to manage and even disable resync processes, it’s good to have proper backups in place before making significant changes to the RAID configuration.
Finally, we saw that each method serves a specific purpose, and it’s important to select the most appropriate one for our particular scenario.