
I have a Debian host configured as a NAS using 6 disks in a RAID 5 setup. The current configuration is as follows:

# mdadm -D /dev/md0
/dev/md0:
           Version : 1.2
     Creation Time : Fri Mar 12 11:42:23 2021
        Raid Level : raid5
        Array Size : 19534424640 (18.19 TiB 20.00 TB)
     Used Dev Size : 3906884928 (3.64 TiB 4.00 TB)
      Raid Devices : 6
     Total Devices : 6
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Sat Jan 18 17:44:06 2025
             State : clean
    Active Devices : 6
   Working Devices : 6
    Failed Devices : 0
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 64K

Consistency Policy : bitmap

              Name : data:0
              UUID : 2265a382:cb20817f:de0f543b:a830605c
            Events : 547472

    Number   Major   Minor   RaidDevice State
       9       8       33        0      active sync   /dev/sdc1
       8       8       17        1      active sync   /dev/sdb1
      10       8       81        2      active sync   /dev/sdf1
      11       8       97        3      active sync   /dev/sdg1
       6       8       65        4      active sync   /dev/sde1
       7       8       49        5      active sync   /dev/sdd1

sdb and sdd are 8 TB disks, all other RAID members are 4 TB. I now want to replace the four 4TB disks with new 16TB disks, convert the current RAID5 setup to RAID6 and grow the used device size to 8TB (the new maximum, until I can replace the remaining two 8 TB disks with 16TB ones).

I am now looking for a procedure to do this safely, without data loss and with as little hassle as possible. For the duration of the procedure downtime is acceptable, data loss is not. Since all SATA slots are currently in use, I cannot add the new disks while the old ones are still online; I will have to replace them one after another. As such, I think it would be sensible to convert the existing RAID5 to RAID6 first and then swap out the disks one by one, which would add another layer of redundancy during the rebuild process.

After looking around online, the following is the procedure I came up with. Can somebody confirm that this is the most sensible way to go about this? Are there steps I am missing, or easier ways to achieve this, given my constraint of an in-place upgrade?

My current plan:

  1. backup all data from /mnt/md0
  2. verify backup integrity
  3. unmount /mnt/md0
  4. shrink filesystem on /dev/md0 to the minimum possible size, see https://access.redhat.com/articles/1196333 for procedure
    1. e2fsck -f /dev/md0 check filesystem, -f forces check even if clean
    2. resize2fs -P /dev/md0 estimate minimum size
    3. resize2fs -p -M /dev/md0 shrink to minimum size (-M) and print progress (-p)
    4. e2fsck -f /dev/md0 check filesystem again to make sure it's clean
  5. check actual new size of filesystem: dumpe2fs -h /dev/md0 |& awk -F: '/Block count/{count=$2} /Block size/{size=$2} END{print count*size}'
  6. fail one of the 8TB disks in the RAID5 array:
    mdadm /dev/md0 --fail /dev/sdd 
    we fail an 8TB disk because this guarantees that mdadm won't decide that the drive is too small (for some reason) when we re-add it later
  7. Estimate the new size of the RAID5 array, by attempting to run this command and checking the error message:
mdadm --grow /dev/md0 --raid-devices=5 
  8. Verify the filesystem is small enough to fit. Then shrink the block device:
mdadm --grow /dev/md0 --array-size [new_size] 
  9. shrink RAID5 array from 6 to 5 disks:
mdadm --grow /dev/md0 --raid-devices=5 --backup-file=/root/md0_raid5_shrink.bak 
  10. wait for RAID5 to finish rebuilding
  11. add the removed disk as a hot spare:
mdadm --add /dev/md0 /dev/sdd 
  12. grow RAID5 array to RAID6 with 6 disks:
mdadm --grow /dev/md0 --raid-devices 6 --level 6 --backup-file=/root/md0_raid5_to_raid6.bak 
  13. wait for RAID6 to finish rebuilding
  14. replace every 4TB disk one by one with a 16TB disk, waiting for the RAID6 to finish rebuilding each time; this should allow us to keep redundancy during the migration
    1. mdadm --fail /dev/md0 /dev/sdX
    2. mdadm --remove /dev/md0 /dev/sdX
    3. shutdown and replace disk
    4. mdadm --add /dev/md0 /dev/sdX
    5. wait for RAID6 to finish rebuilding
  15. grow RAID6 array to max size (capped by the two 8TB disks):
mdadm --grow /dev/md0 --size=max 
  16. grow filesystem on /dev/md0 to max size:
resize2fs /dev/md0 
  17. remount /mnt/md0
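
Before step 1, it might also be worth saving the current array and partition metadata somewhere off the array, so the exact layout is documented in case anything needs to be double-checked later. A minimal sketch (the output paths are just examples):

# Record the current layout somewhere that is NOT on /dev/md0
mdadm --detail /dev/md0 > /root/md0-detail.txt
mdadm --examine /dev/sd[bcdefg]1 > /root/md0-examine.txt
cat /proc/mdstat > /root/md0-mdstat.txt
# Dump the partition tables of all member disks
for d in b c d e f g; do sfdisk -d /dev/sd$d > /root/sd$d-partitions.dump; done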

My questions are as follows:

  1. Is this the recommended way of upgrading a RAID5 array to RAID6?
  2. Since I want to avoid asking a yes/no question: If my procedure makes sense, are there ways I can improve it to avoid the risk of data loss / having to restore from backup? Is there a faster way of doing this?
  3. I have an autogenerated config file in /etc/mdadm/mdadm.conf. Will I have to change that in any way, will that happen automatically, or is that unrelated to the whole process?

Some more context / other information:

  • The filesystem on /dev/md0 is ext4.
  • The system root / is on /dev/sda, which is not impacted by the migration.
  • Most guides talk about adding a new spare first and then migrating from RAID5 to RAID6. This is not (easily) possible in this case because all SATA slots are already in use.
  • The best practice is to create a backup of the data, which you should already have anyway, and then delete the RAID 5 array and create a RAID 6 array. Commented Jan 18 at 17:32

1 Answer


The answer "according to the book": Make and verify a backup, destroy your current raid, create the new raid exact as needed, restore, call it a day. Most likely this is the faster approach too, when considering the resync times.

The answer to your question: No, you do not have to manually adapt mdadm.conf for an in-place reshape. mdadm identifies the array by its UUID, which a --grow/reshape does not change, so the existing ARRAY line in the autogenerated config stays valid.
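
If you instead go the destroy-and-recreate route described above, the newly created array gets a new UUID, so the autogenerated ARRAY line does need to be refreshed. A minimal sketch for Debian:

# Print the current ARRAY line(s) as mdadm sees them
mdadm --detail --scan
# Replace the old ARRAY line in /etc/mdadm/mdadm.conf with the printed one,
# then rebuild the initramfs so the array is assembled correctly at boot:
update-initramfs -u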

"Since all SATA slots are currently in use, I cannot add the new disks while the old ones are still online."

I do not question the fact, but the conclusion: mdadm is flexible and can manage arrays whose members are connected via different interfaces. This allows you to use temporary external storage (such as USB drives) to assist with the disk replacement, even when all internal SATA slots are currently in use. So you have options.
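
For example, if one of the new 16 TB disks can temporarily be attached in a USB enclosure, mdadm's --replace mechanism copies onto it while the old member stays active, so the array never runs degraded; afterwards the new disk can be moved into the freed SATA bay. A rough sketch, assuming a reasonably recent kernel/mdadm and that the USB disk shows up as /dev/sdh (hypothetical name):

# Partition the temporarily attached disk and add it to the array as a spare
parted /dev/sdh mklabel gpt
parted /dev/sdh mkpart primary 0% 100%
mdadm --manage /dev/md0 --add /dev/sdh1
# Copy the old member onto the spare; the old disk stays active until the copy
# completes, so redundancy is never reduced
mdadm /dev/md0 --replace /dev/sdc1 --with /dev/sdh1
watch cat /proc/mdstat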

Regarding your current plan: Overall, nahhh. As said in the beginning, I would not do it this way in general. But setting my concerns aside for a minute and assuming the approach as you described it, then...

  • Points 1 and 2 (first backup, then verify): Very good, always the correct start. Approved.

  • Points 3, 4 and 5 (unmount and shrink the filesystem): I think these steps are not necessary, as mdadm operates independently of the filesystem. The RAID rebuild process deals with blocks of data and parity, irrespective of how the filesystem is organized on top of those blocks. If you have a proper backup and your RAID array is in a stable state, you can proceed with replacing the disks one by one without shrinking or otherwise touching the filesystem. So you can skip these steps.

  • Point 6 (failing the 8 TB disk): Never a fail without a remove. Why? Because the remove is what tells mdadm that the disk is no longer part of the array and should no longer be used for data storage or redundancy. Corrected:

mdadm --manage /dev/md0 --fail /dev/sdd1
mdadm --manage /dev/md0 --remove /dev/sdd1
  • Point 7: No. Reducing the existing raid will fail. The --grow option can be used to increase the number of devices, but not to shrink it. As far as I know, this is not supported by mdadm. (Please correct me if I am wrong on this point.)

In case you really really really want to do an in-place replace, then:

Before you start, please google the typical rebuild times of RAID 5 and RAID 6, then do a quick calculation to decide if you really want to invest this much time. (You have been warned.)
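
If you do go ahead, the md resync throttles are worth checking, since the defaults are conservative and can stretch an already long rebuild even further. A small sketch (the values are only examples and reset on reboot):

# Show the current per-device resync speed limits (KiB/s)
sysctl dev.raid.speed_limit_min dev.raid.speed_limit_max
# Temporarily raise them so the rebuild is not throttled unnecessarily
sysctl -w dev.raid.speed_limit_min=100000
sysctl -w dev.raid.speed_limit_max=500000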

In case you still want to do it this way, and you accept the rebuild times:

  • Make a backup
  • Verify your backup (a checksum sketch follows after this list)
  • Store the backup in a place which ensures that you will find it later, but far enough away that the backup disk cannot be mixed up by accident with one of the other disks in the following procedure. (I warn you out of my own experience here.)
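
For the verification step, one simple approach is to checksum the live data and then check the backup against the same list. A sketch, assuming the backup copy is mounted at /mnt/backup (placeholder path):

# Build a checksum list from the source...
cd /mnt/md0 && find . -type f -print0 | xargs -0 sha256sum > /root/md0.sha256
# ...and verify the backup copy against it (prints only mismatches and errors)
cd /mnt/backup && sha256sum --quiet -c /root/md0.sha256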

Then repeat the following procedure for the disks sd[c|e|f|g], i.e. replace every 4TB disk one by one with a 16TB disk:

(Replace sdX1 with sdc1 in the first run, with sde1 in the second run, and so on...)

  • Fail and remove the disk.
mdadm --manage /dev/md0 --fail /dev/sdX1
mdadm --manage /dev/md0 --remove /dev/sdX1
  • Power off, replace the old 4 TB disk sdX with the new 16 TB variant, power on.
  • Create a partition on the new disk:
parted /dev/sdX mklabel gpt
parted /dev/sdX mkpart primary 0% 100%
  • Add the new disk to the array:
mdadm --manage /dev/md0 --add /dev/sdX1 
  • Monitor the rebuild process:
watch cat /proc/mdstat 

Wait until the rebuild is complete before proceeding to the next disk. Continue with the next steps once all 4TB disks have been replaced with 16TB disks and the array has rebuilt.
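
If you prefer scripting this instead of watching mdstat, mdadm can block until the activity is done; a small sketch:

# Return once any resync/recovery/reshape on md0 has finished
# (exits non-zero if there was nothing to wait for)
mdadm --wait /dev/md0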

  • Convert the RAID 5 to RAID 6 and grow to the new maximum capacity (still capped by the two remaining 8 TB disks). As you do not have a spare to avoid a degraded array during the conversion, you have to force this step:
mdadm --grow /dev/md0 --level=6 --force
mdadm --grow /dev/md0 --size=max

To be very clear and precise on this RAID conversion point, in order to set the right expectations: the conversion from RAID 5 to RAID 6 will work technically just fine, and you will end up with a working RAID 6. But it will stay in a degraded state until you add a spare. I understood that you cannot add a disk as a spare, as all slots are already occupied in your system. So this is maybe (most likely) not what you want to end up with.
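
Should a port free up later (or a disk be attached temporarily over USB as mentioned above), adding a further member lets md rebuild the missing redundancy and leave the degraded state; a sketch with a placeholder device name:

# Partition the extra disk as above, add it, and let the RAID 6 rebuild
mdadm --manage /dev/md0 --add /dev/sdX1
watch cat /proc/mdstat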

  • Resize the Filesystem. You said you are using ext4:
resize2fs /dev/md0 

HINT HINT HINT: Test out ideas in a safe space before proceeding with real data. mdadm is flexible and supports any type of block device, including virtual ones created from files.

This flexibility allows you to test any procedure in a controlled environment before running it on real data. Here is how to replicate your current setup for testing:

# Create a test directory
mkdir -p /root/my_raid_tests

# Create files to represent block devices, associate loop devices with these files
for i in {1..6}; do
  dd if=/dev/zero of=/root/my_raid_tests/file$i.img bs=1M count=100
  losetup /dev/loop$i /root/my_raid_tests/file$i.img
done

# Create the RAID 5
mdadm --create /dev/md/my_raid_tests --level=5 --raid-devices=6 --layout=left-symmetric --chunk=64K /dev/loop{1..6}

# Verify the array
mdadm --detail /dev/md/my_raid_tests

# Create a filesystem on the array
mkfs.ext4 /dev/md/my_raid_tests

# Create a mount point and mount the array
mkdir -p /mnt/my_raid_tests
mount /dev/md/my_raid_tests /mnt/my_raid_tests
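
To rehearse one of the disk swaps in this sandbox, a "bigger" replacement disk can be faked with an extra loop device; a sketch, assuming /dev/loop7 is unused:

# Create a larger image to stand in for a new, bigger disk
dd if=/dev/zero of=/root/my_raid_tests/file7.img bs=1M count=400
losetup /dev/loop7 /root/my_raid_tests/file7.img
# Fail and remove one member, then add the replacement and watch the recovery
mdadm --manage /dev/md/my_raid_tests --fail /dev/loop1
mdadm --manage /dev/md/my_raid_tests --remove /dev/loop1
mdadm --manage /dev/md/my_raid_tests --add /dev/loop7
cat /proc/mdstat

If you try this, remember to also detach /dev/loop7 and delete file7.img during the cleanup below.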

And here is how to get rid of it:

umount /mnt/my_raid_tests
mdadm --stop /dev/md/my_raid_tests
mdadm --zero-superblock /dev/loop{1..6}
for i in {1..6}; do
  losetup -d /dev/loop$i
done
rm -i /root/my_raid_tests/file{1..6}.img
rmdir /root/my_raid_tests /mnt/my_raid_tests
  • Thank you for the detailed response! The idea with the virtual devices worked great to test my procedure, and it seems to have worked (after properly removing the device as you specified). I still reduced the file system first to circumvent the problem described here: unix.stackexchange.com/questions/391168. "The --grow option can be used to increase the number of devices, but not shrinking": this is wrong. From the mdadm man page: "--grow: Grow (or shrink) an array, or otherwise reshape it in some way." I can confirm that everything worked without data loss in testing. Commented Jan 25 at 21:06
  • That being said, you were right in that it would have taken forever to rebuild, so for the real system I did end up just throwing the previous raid away and re-creating it with the new disks and raid level :D But it's good to know for the future that an in-place upgrade would have worked as well -- at least with the virtual devices it did. Commented Jan 25 at 21:09
