rebuilding mdadm raid5

Question

I've dusted off a machine because I want to repair an mdadm raid5 that I messed up. Originally the raid5 had 3 disks. A spare was added just before one of the three started to fail. The spare got used and the failed disk was removed. Now, months later, I cannot mount it correctly. The array is broken.

original build:

root# mdadm --create --metadata=1.0 --verbose /dev/md127 --chunk=512 --level=5 --raid-devices=3 /dev/sda1 /dev/sdb1 /dev/sdc1

current situation:

$ cat /proc/mdstat
Personalities :
md127 : inactive sdb1[1](S) sda1[0](S)
      3677730784 blocks super 1.0
unused devices: <none>

mdadm -D /dev/md127

sudo mdadm -D /dev/md127
/dev/md127:
Version : 1.0
Raid Level : __raid0__
Total Devices : 1
Persistence : Superblock is persistent
State : inactive
Name : nas:127 (local to host nas)
UUID : 71da073c:d1928293:6947fa19:92d8a7bd
Events : 1
Number   Major   Minor   RaidDevice
   -       8      17        -        /dev/sdb1

output of examine for each drive

$ sudo mdadm -E /dev/sd{b,c,e}1

**/dev/sdb1**:
Magic : a92b4efc
Version : 1.0
Feature Map : 0x1
Array UUID : 71da073c:d1928293:6947fa19:92d8a7bd
Name : nas:127 (local to host nas)
Creation Time : Sun Dec 10 23:26:56 2017
Raid Level : raid5
Raid Devices : 4
Avail Dev Size : 3677730784 (1753.68 GiB 1883.00 GB)
Array Size : 5516594688 (5261.03 GiB 5648.99 GB)
Used Dev Size : 3677729792 (1753.68 GiB 1883.00 GB)
Super Offset : 3677730800 sectors
Unused Space : before=0 sectors, after=992 sectors
State : clean
Device UUID : e1fdc3d2:b0f117a5:11856184:17db9522
Internal Bitmap : -16 sectors from superblock
Update Time : Mon Dec 18 11:48:12 2017
Bad Block Log : 512 entries available at offset -8 sectors
Checksum : 54a1b1a7 - correct
Events : **1**
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 1
Array State : AA.. ('A' == active, '.' == missing, 'R' == replacing)

**/dev/sdc1**:
Magic : a92b4efc
Version : 1.0
Feature Map : 0x1
Array UUID : a1498410:d13b2b4a:63379f8d:c821173f
Name : fileserver:127
Creation Time : Mon Jan 19 15:35:41 2015
Raid Level : raid5
Raid Devices : 4
Avail Dev Size : 3677730536 (1753.68 GiB 1883.00 GB)
Array Size : 5516594688 (5261.03 GiB 5648.99 GB)
Used Dev Size : 3677729792 (1753.68 GiB 1883.00 GB)
Super Offset : 3677730800 sectors
Unused Space : before=0 sectors, after=992 sectors
State : clean
Device UUID : 472f7a29:679e1f18:87ee0d4c:88b2a62b
Internal Bitmap : -16 sectors from superblock
Update Time : Sun Dec 10 21:09:34 2017
Bad Block Log : 512 entries available at offset -8 sectors
Checksum : 68dd142f - correct
Events : **1934728**
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 1
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)

**/dev/sde1**:
Magic : a92b4efc
Version : 1.0
Feature Map : 0x1
Array UUID : 71da073c:d1928293:6947fa19:92d8a7bd
Name : taknas:127 (local to host taknas)
Creation Time : Sun Dec 10 23:26:56 2017
Raid Level : raid5
Raid Devices : 4
Avail Dev Size : 3677730784 (1753.68 GiB 1883.00 GB)
Array Size : 5516594688 (5261.03 GiB 5648.99 GB)
Used Dev Size : 3677729792 (1753.68 GiB 1883.00 GB)
Super Offset : 3677730800 sectors
Unused Space : before=0 sectors, after=992 sectors
State : clean
Device UUID : ebd3b12c:975c1a0b:4653f1ed:e9788e37
Internal Bitmap : -16 sectors from superblock
Update Time : Mon Dec 18 11:48:12 2017
Bad Block Log : 512 entries available at offset -8 sectors
Checksum : 931a5e9d - correct
Events : **1**
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 0
Array State : AA.. ('A' == active, '.' == missing, 'R' == replacing)

thinking out loud:

It looks like the raid5 failed and went into raid0 with 2 disks, in spare mode, while sda1 isn't there.
Also, the events counter was reset for sdb1 and sde1. sdb1 seems to be in another array. I may have forgotten to remove the failing disk from the array correctly, since it thinks it consists of 4 devices.

Not sure what to do here to repair the raid array and keep data intact.

Josip Rodin · Accepted Answer · 2018-06-10 19:36:33Z

There's probably something in sudo dmesg about sdb1 being assembled as part of a raid0 md127. That's curious, you should probably examine it.

Either way, that dysfunctional array needs to be stopped because it's hogging sdb1 now:

sudo mdadm --stop /dev/md127
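
Before reassembling, it may help to line up the `mdadm -E` fields that decide which members belong together: Array UUID, Events and Device Role. The following is only a sketch, assuming the usual one-field-per-line layout that `mdadm -E` prints; `summarize_examine` is a hypothetical helper name, not part of mdadm.

```shell
# Hypothetical helper: condense `mdadm -E` output (read from stdin) into
# one line per device: device, array UUID, event count, device role.
# Assumes each field sits on its own line, as mdadm normally prints it.
summarize_examine() {
    awk '
        /^\/dev\//       { dev = $1; sub(/:$/, "", dev) }
        /Array UUID :/   { uuid = $NF }
        /Events :/       { events = $NF }
        /Device Role :/  { sub(/^.*Device Role : /, ""); print dev, uuid, events, $0 }
    '
}

# Usage (needs root): sudo mdadm -E /dev/sd{b,c,e}1 | summarize_examine
```

In the question's output, sdb1 and sde1 share the Array UUID 71da073c:d1928293:6947fa19:92d8a7bd and both show Events 1, while sdc1 reports a different Array UUID and 1934728 events, so sdc1 is the odd one out.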

Then try assembling it with what seem to be the right two out of three:

sudo mdadm --assemble /dev/md127 /dev/sdb1 /dev/sde1 --verbose

If that works out, then add the odd one:

sudo mdadm /dev/md127 --add /dev/sdc1
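
After each step, `/proc/mdstat` shows whether the array came up as active or stayed inactive, and while a rebuild is running it also carries a recovery progress line. A small sketch for pulling out just the state word, assuming the `name : state ...` line format seen in the question's `/proc/mdstat` output; `md_state` is a hypothetical helper name.

```shell
# Hypothetical helper: print the state word ("active" or "inactive") that
# /proc/mdstat reports for one md device. Reads mdstat text from stdin.
md_state() {
    awk -v dev="$1" '$1 == dev && $2 == ":" { print $3 }'
}

# Usage: md_state md127 < /proc/mdstat
```

With the `/proc/mdstat` content from the question this would print `inactive`; after a successful assemble it should print `active`.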

I wasn't able to find out why it became part of a raid0. I'm guessing because one drive went missing? I was able to reassemble first with just the 2, and then add the 3rd. But I'm unable to mount it, as it gives the following error:
$ sudo mount /dev/md127 /mnt/raid5/
mount: /mnt/raid5: mount(2) system call failed: File too large.
(can't seem to get a code block in a comment)
– nieweling, Commented Jun 13, 2018 at 23:47

@nieweling weird, is there more detail in dmesg about that failure?
– Josip Rodin, Commented Jun 15, 2018 at 5:43
