
I have a serious issue with one of my Oracle Linux 8 servers in a 4-node rack setup. Each node has an identical LVM + RAID1 layout, but one node is now unbootable after I accidentally ran parted and wipefs on its main RAID device.

Here’s the background:

The system had two RAID1 arrays:

  • /dev/md25 (main array, 3.5T, contains all LVM partitions)

  • /dev/md26 (backup mirror, also 3.5T)

Unfortunately, some commands like parted and wipefs were mistakenly executed on /dev/md25.

After that, the system failed to boot, and I had to boot into emergency mode using the Oracle Linux 8 ISO.

Now I see the following output in emergency mode:

# lsblk /dev/md25
NAME MAJ:MIN RM SIZE RO TYPE  MOUNTPOINT
md25   9:25  0 3.5T  0 raid1

# lsblk /dev/md26
NAME MAJ:MIN RM SIZE RO TYPE  MOUNTPOINT
md26   9:26  0 3.5T  0 raid1

# pvs
  WARNING: Couldn't find device with uuid 3N...KHSegf.
  WARNING: VG VGExaDb is missing PV 3N...KHSegf
  PV         VG      Fmt  Attr PSize PFree
  /dev/md26  VGExaDb lvm2 a--  3.49t 3.49t
  [unknown]  VGExaDb lvm2 a-m  3.48t 3.21t

# vgs
  VG      #PV #LV #SN Attr   VSize VFree
  VGExaDb   2  11   0 wz--n- 6.98t 6.70t

# lvs
  WARNING: VG VGExaDb is missing PV 3N...KHSegf
  LV                 VG      Attr       LSize
  LVDbHome           VGExaDb -wi-------   4.00g
  LVDbOra1           VGExaDb -wi------- 200.00g
  LVDbSwap1          VGExaDb -wi-------  16.00g
  LVDbSys1           VGExaDb -wi-------  15.00g
  LVDbSys2           VGExaDb -wi-------  30.00g
  LVDbTmp            VGExaDb -wi-------   3.00g
  LVDbVar1           VGExaDb -wi-------   2.00g
  LVDbVarLog         VGExaDb -wi-------   2.00g
  LVDbVarLogAudit    VGExaDb -wi-------   1.00g
  LVDbNotRemoveOrUse VGExaDb -wi-------   2.00g

At this point, /dev/md25 shows no partitions (most likely because its metadata was damaged by the wipefs or parted commands).
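Before changing anything else, I assume I can check what is actually left on the device with read-only commands (wipefs with no options only lists signatures, it does not erase anything):

# wipefs /dev/md25
# hexdump -C -n 4096 /dev/md25

If wipefs prints nothing and the first sectors read as zeros, the LVM label on this device is presumably gone.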

The final command I ran:

mdadm --detail --scan >> /etc/mdadm.conf

/etc/mdadm.conf now contains:

HOMEHOST <ignore>
ARRAY /dev/md/26 metadata=1.2 name=exadb04.srv:26 UUID=88b..
ARRAY /dev/md/24 metadata=1.2 name=localhost:24 UUID=b282b...
ARRAY /dev/md/25 metadata=1.2 name=localhost:25 UUID=9671bc...

Note that it lists the arrays as /dev/md/NN, not /dev/mdX!
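As far as I understand, the /dev/md/NN entries are just udev-created symlinks to the real /dev/mdNN device nodes (created for arrays that carry a name= in their metadata), so I can verify they point to the same devices with:

# ls -l /dev/md/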

Now I have two questions:

  1. Is there any way to recreate or recover the missing PV without losing data?
  2. Since this server is part of a 4-node rack and the other nodes have an identical LVM layout (same VG/LV structure), can I use the LVM archive/backup metadata from another healthy node to restore the damaged VG on this one? (The sketch below shows the flow I have in mind.)
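For reference, the generic recovery flow I found in the LVM man pages looks roughly like this. I have not run it yet; the restore file shown is the default local backup under /etc/lvm/backup, and the full PV UUID is the one pvs warns about (truncated here as 3N...KHSegf):

# vgcfgrestore --list VGExaDb
# pvcreate --uuid "3N...KHSegf" --restorefile /etc/lvm/backup/VGExaDb /dev/md25
# vgcfgrestore VGExaDb
# vgchange -ay VGExaDb

As far as I can tell this only rewrites the LVM metadata; whether the data inside the LVs survived parted/wipefs is a separate question.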

Any guidance or recovery steps would be greatly appreciated.

  • Regarding "/dev/md25 shows no partitions (most likely because its metadata was damaged)": it doesn't seem like it should have any partitions anyway, as apparently you have, for whatever reason, been using the RAID arrays themselves (unpartitioned) as PVs for a single VG. And regarding "backup mirror": I'm not sure that's what it actually is. Do you actually know this setup, or are you taking over the sysadmin job from someone else? Commented Nov 7 at 8:12
  • If there are clone(s) of the VG on some other machines, you might want to try vgcfgbackup and vgcfgrestore (with pvcreate). Commented Nov 7 at 8:14
  • Ahh, it seems that every LV in the VG is a RAID1 as well...(so two layers of RAID1...) Commented Nov 7 at 8:20
  • I checked another identical server in the same rack for comparison. On that system, /dev/md25 shows partitions as expected, but /dev/md26 does not. So it seems this might depend on how each RAID device was created — possibly /dev/md26 was used directly (without partitions) or it’s related to a replacement/rebuild process. Commented Nov 7 at 8:23
  • docs.redhat.com/zh-cn/documentation/red_hat_enterprise_linux/9/… (See if you can switch to the English version yourself. Awful website.) In short, make md25 a PV of the VG and repair the LVs, then remove the missing PV record at the end; roughly the sequence sketched below. (Probably takes quite a long time.) Commented Nov 7 at 8:31
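For completeness, a rough sketch of the sequence that document describes, assuming the LVs really are LVM raid1 volumes as suggested above (the LV name is taken from the lvs output; every LV in the VG would need the same treatment):

# pvcreate /dev/md25
# vgextend VGExaDb /dev/md25
# lvconvert --repair VGExaDb/LVDbSys1
# vgreduce --removemissing VGExaDb

The lvconvert --repair step rebuilds the missing mirror leg onto the new PV, and the final vgreduce drops the stale record of the lost PV once no LV references it anymore.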
