barrymac

Can ZFS resilvering destroy data with misreported device failures?

I am moving data into a new ZFS raidz pool with four devices, some of them virtual to facilitate the migration. The system hung completely in the middle of replacing a file-based device with a physical device.

The system did not even respond to SysRq and had to be reset physically. When it came back online, ZFS had decided that only 2 of the 4 devices were online, started resilvering, and began reporting loads of errors. I didn't know how to stop it; it keeps going in the background even when the pool is unmounted.

By the time I managed to bring the missing device, which was perfectly fine, back online, ZFS had reported a great many errors.

Does that mean ZFS destroyed data while resilvering without the missing device? Or can it resilver correctly now that its original devices are back in place?

When it was resilvering with only 2 devices, it was resilvering onto sda3, shown below:

    NAME                             STATE     READ WRITE CKSUM
    zfs_raid                         DEGRADED     0     0 38.5K
      raidz1-0                       DEGRADED     0     0  129K
        sda3                         ONLINE       0     0     0
        sdc2                         ONLINE       0     0     0
        replacing-2                  DEGRADED     0     0     3
          /zfs_jbod2/zfs_raid/zfs.1  OFFLINE      0     0     0
          sdb1                       ONLINE       0     0     0  (resilvering)
        /zfs_jbod/zfs_raid/zfs.2     ONLINE       0     0     0  (resilvering)

    errors: 25852 data errors, use '-v' for a list
