I have a Debian host configured as a NAS using six disks in a RAID 5 setup. The current configuration is as follows:
    # mdadm -D /dev/md0
    /dev/md0:
               Version : 1.2
         Creation Time : Fri Mar 12 11:42:23 2021
            Raid Level : raid5
            Array Size : 19534424640 (18.19 TiB 20.00 TB)
         Used Dev Size : 3906884928 (3.64 TiB 4.00 TB)
          Raid Devices : 6
         Total Devices : 6
           Persistence : Superblock is persistent

         Intent Bitmap : Internal

           Update Time : Sat Jan 18 17:44:06 2025
                 State : clean
        Active Devices : 6
       Working Devices : 6
        Failed Devices : 0
         Spare Devices : 0

                Layout : left-symmetric
            Chunk Size : 64K

    Consistency Policy : bitmap

                  Name : data:0
                  UUID : 2265a382:cb20817f:de0f543b:a830605c
                Events : 547472

        Number   Major   Minor   RaidDevice State
           9       8       33        0      active sync   /dev/sdc1
           8       8       17        1      active sync   /dev/sdb1
          10       8       81        2      active sync   /dev/sdf1
          11       8       97        3      active sync   /dev/sdg1
           6       8       65        4      active sync   /dev/sde1
           7       8       49        5      active sync   /dev/sdd1

sdb and sdd are 8 TB disks; all other RAID members are 4 TB. I now want to replace the four 4 TB disks with new 16 TB disks, convert the current RAID5 setup to RAID6, and grow the used device size to 8 TB (the new maximum, until I can replace the remaining two 8 TB disks with 16 TB ones).
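Since the sdX names can change between boots, I plan to note down which name currently belongs to which physical disk (size, model, serial number) before starting; something along these lines should be enough:

    # list the array member disks with size, model and serial number
    lsblk -o NAME,SIZE,MODEL,SERIAL /dev/sd[b-g]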
I am now looking for a procedure to do this safely, without data loss and with as little hassle as possible. Downtime during the procedure is acceptable; data loss is not. Since all SATA slots are currently in use, I cannot add the new disks while the old ones are still online; I will have to replace them one after another. Because of that, I think it would be sensible to convert the existing RAID5 to RAID6 first and then swap out the disks one by one, which adds another layer of redundancy during the rebuild process.
After looking around online, the following is the procedure I came up with. Can somebody confirm that this is the most sensible way to go about it, or are there steps I am missing or easier ways to achieve this, given my constraint of an in-place upgrade?
My current plan:
- backup all data from /mnt/md0
- verify backup integrity
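For the backup verification step, my rough idea is to compare checksums between the array and the backup copy (the backup mount point below is just a placeholder), along these lines:

    # record checksums of everything on the array ...
    cd /mnt/md0 && find . -type f -exec sha256sum {} + > /root/md0.sha256
    # ... and verify them against the backup copy
    cd /mnt/backup && sha256sum -c /root/md0.sha256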
- unmount /mnt/md0
- shrink filesystem on /dev/md0 to the minimum possible size, see https://access.redhat.com/articles/1196333 for procedure
    e2fsck -f /dev/md0        # check filesystem, -f forces check even if clean
    resize2fs -P /dev/md0     # estimate minimum size
    resize2fs -p -M /dev/md0  # shrink to minimum size (-M) and print progress (-p)
    e2fsck -f /dev/md0        # check filesystem again to make sure it's clean
- check actual new size of filesystem:
    dumpe2fs -h /dev/md0 |& awk -F: '/Block count/{count=$2} /Block size/{size=$2} END{print count*size}'

- fail one of the 8TB disks in the RAID5 array:
we fail an 8TB disk because this guarantees that mdadm won't decide that the drive is too small (for some reason) when we re-add it later

    mdadm /dev/md0 --fail /dev/sdd1

- Estimate the new size of the RAID5 array by attempting to run this command and checking the error message:
    mdadm --grow /dev/md0 --raid-devices=5

- Verify the filesystem is small enough to fit. Then shrink the block device:
    mdadm --grow /dev/md0 --array-size [new_size]

- shrink RAID5 array from 6 to 5 disks
    mdadm --grow /dev/md0 --raid-devices=5 --backup-file=/root/md0_raid5_shrink.bak

- wait for RAID5 to finish rebuilding
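For all the "wait for the array to finish" steps, I intend to either watch /proc/mdstat or simply block until mdadm reports the operation as done, e.g.:

    # watch rebuild/reshape progress
    watch -n 60 cat /proc/mdstat
    # or block until the current resync/recovery/reshape has finished
    mdadm --wait /dev/md0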
- add the removed disk as a hotspare
    mdadm --add /dev/md0 /dev/sdd1

- grow RAID5 array to RAID6 with 6 disks
    mdadm --grow /dev/md0 --raid-devices=6 --level=6 --backup-file=/root/md0_raid5_to_raid6.bak

- wait for RAID6 to finish rebuilding
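Before touching any disks, I would double-check that the level change actually completed, e.g. by looking at the relevant fields of mdadm -D again:

    # should now report raid6, 6 raid devices and a clean state
    mdadm -D /dev/md0 | grep -E 'Raid Level|Raid Devices|State :'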
- replace every 4TB disk one by one with a 16TB disk, waiting for the RAID6 to finish rebuilding each time; this should allow us to keep redundancy during the migration
    mdadm --fail /dev/md0 /dev/sdX
    mdadm --remove /dev/md0 /dev/sdX

- shut down and replace disk
    mdadm --add /dev/md0 /dev/sdX

- wait for RAID6 to finish rebuilding
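One detail I am not sure about: the current members are partitions (/dev/sdb1, /dev/sdc1, ...), so I assume each new 16 TB disk needs a RAID partition created before the --add above. My guess (assuming GPT and sgdisk) would be:

    # create one partition spanning the whole disk, type "Linux RAID" (fd00)
    sgdisk -n 1:0:0 -t 1:fd00 /dev/sdX
    # then add the partition instead of the whole disk
    mdadm --add /dev/md0 /dev/sdX1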
- grow RAID6 array to max size (capped by the two 8TB disks)
    mdadm --grow /dev/md0 --size=max

- grow filesystem on /dev/md0 to max size
    resize2fs /dev/md0

- remount /mnt/md0
My questions are as follows:
- Is this the recommended way of upgrading a RAID5 array to RAID6?
- Since I want to avoid asking a yes/no question: If my procedure makes sense, are there ways I can improve it to avoid the risk of data loss / having to restore from backup? Is there a faster way of doing this?
- I have an autogenerated config file in /etc/mdadm/mdadm.conf. Will I have to change it in any way, will that happen automatically, or is it unrelated to the whole process?
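In case it is relevant: my current guess is that, if anything needs updating, I would regenerate the ARRAY line and the initramfs roughly like this after the migration, but I don't know whether that is necessary at all:

    # print the current ARRAY line; compare with what is in /etc/mdadm/mdadm.conf
    mdadm --detail --scan
    # after updating mdadm.conf, rebuild the initramfs so it picks up the change
    update-initramfs -u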
Some more context / other information:
- The filesystem on /dev/md0 is ext4.
- The system root / is on /dev/sda, which is not impacted by the migration.
- Most guides talk about adding a new spare first and then migrating from RAID5 to RAID6. This is not (easily) possible in this case because all SATA slots are already in use.