
I have an SSD that has had a hardware failure of some kind. I can RMA it as it is still under warranty, but their service does not include data recovery. I have collected my work files from it, but my personal files and plugins are still there, and I would like to recover them.

The catch... Bandwidth and total copy size seem to be an issue. Read too much and the drive crashes: it reports bad sectors everywhere (falsely) or shuts down the whole OS. I have since put the drive in an external case so I can hot-swap it, as I suspect the memory buffer is the issue. I thought of using rsync with a bandwidth limit, as per another question here, but I believe I would need to stagger the copy process to either let the buffer clear itself or let the drive cool down.

I am in need of some script or tool to recover my lost data.

  • Just use ddrescue; it can resume and has plenty of other options that might aid you on your quest (including a max read rate, so you can test that theory)... not much else you can do, good luck. Commented Jun 12, 2022 at 12:02

1 Answer


Reports bad sectors everywhere (false)

"Sectors" are not a thing that exists on SSDs; "blocks" are. If your drive reports them as bad, that means (since there's nothing that can fail mechanically) the drive has:

  1. Asserted the address lines to get these bits of the block
  2. Read out the memory cells, which means "gotten a vector of voltage readouts"
  3. Tried to convert these to bits, by applying a rather complicated soft-input error correcting code on them
  4. Returned (success and data) or (error):
    1. Success when trying to decode them (iteratively) yielded an error term ("syndrome" in decoder speak) that was zero at some point
    2. Error when the soft values can never be massaged and error-corrected into a word that has no error

So, you get an error. That means the thing cannot be read out. There's no "the SSD is wrong about the data being unrecoverable": It always reads some voltages, no matter how broken everything is, and checks whether these pass a check, and corrects them, if necessary, and possible.
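To make the "zero syndrome = success" idea from step 4 concrete, here's a toy Python sketch using the classic Hamming(7,4) code. Real SSD controllers use far stronger soft-decision codes (e.g. LDPC), so this is only an illustration of the principle: the decoder reports success exactly when the parity check comes out all-zero.

```python
# Toy illustration of syndrome decoding with Hamming(7,4).
# NOT what an SSD actually runs; same principle only.

# Parity-check matrix H; column j is the binary representation of j+1.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

def syndrome(word):
    """Compute H * word (mod 2); all-zero means 'no error detected'."""
    return [sum(h * b for h, b in zip(row, word)) % 2 for row in H]

codeword = [1, 0, 1, 1, 0, 1, 0]   # a valid Hamming(7,4) codeword
assert syndrome(codeword) == [0, 0, 0]   # clean read: success

flipped = codeword.copy()
flipped[4] ^= 1                    # one flipped "cell" (position 5)
s = syndrome(flipped)
assert s != [0, 0, 0]              # nonzero syndrome: error detected
# For Hamming(7,4), the syndrome even names the bad position (1-based):
bad_pos = s[0] + 2 * s[1] + 4 * s[2]
assert bad_pos == 5                # so this one error can be corrected
```

When too many cells are off, no amount of massaging yields a zero syndrome, and the controller has no choice but to report the block as bad.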

Therefore:

Reports bad sectors everywhere

You'll have to trust your SSD on that: it literally cannot read the data. The only thing that could "globally" make reading fail even though the memory cells (which, by the way, are small capacitors charged to some voltage) are intact would be a "shift" in the reference voltage of the ADC that converts the analog voltages to digital values. Then you'd get incorrect soft input for your decoder even if the actual memory was OK.

But that voltage is generated within the very same die as the ADC is (so, within your flash memory chip) and should be quite resilient to changes in e.g. supply voltage.

So, maybe it's a thermal thing, really, or some silicon failure mode.

Either way, it sounds to me like you would not want to use rsync, or any file-system-level tool, to get data off the drive. That requires the operating system to access the same spots very frequently, just to understand which data belongs to which file.

What you'd need to do is make a block-device-level copy, and do it in small steps. Read (for example) 16 MB into an image file. Wait a while. Read 16 MB… and so on.

This could be done in a zsh/bash shell script with a loop that reads these blocks sequentially with dd, then calls sleep to wait, then reads the next, and so on; or in a few lines of Python. Don't forget to check for read errors along the way, and abort the procedure when they happen, so you can restart it at the same point later.

Because an actual prepared solution is asked for:

#!/usr/bin/zsh
# Copyright 2022 Marcus Müller
# SPDX-License-Identifier: BSD-3-Clause
# Find the license text under https://spdx.org/licenses/BSD-3-Clause.html
# THIS SCRIPT IS UNTESTED AND COMES WITH NO WARRANTIES, FOLKS.

IN_DEVICE=/dev/yoursource_ssd
BACKUP_IMG=myimage
LOGFILE=broken_mbs.txt

# get size in bytes, round up to full MB
size_in_bytes=$(blockdev --getsize64 "${IN_DEVICE}")
size_in_MB=$(( ( ${size_in_bytes} + 2**20 - 1 ) / 2**20 ))

# check whether size > 0
if [[ ! ${size_in_MB} -gt 0 ]]; then
  logger -p user.crit "Nope, can't determine size of ${IN_DEVICE}. I'm outta here."
  echo "Failure on input" >&2
  exit 1
else
  logger -p user.info "Trying to back up ${IN_DEVICE}, size ${size_in_MB} MB"
fi

if fallocate -l "${size_in_MB}MiB" "${BACKUP_IMG}"; then
  logger -p user.info "preallocated ${BACKUP_IMG}"
else
  logger -p user.crit "failed to preallocate ${BACKUP_IMG}"
  echo "failure on output" >&2
  exit 2
fi

failcounter=0
MB=$((2**20))

for i in {0..$(( ${size_in_MB} - 1 ))}; do
  # copy exactly one MB (count=1), at the same offset in and out;
  # conv=notrunc keeps dd from truncating the preallocated image
  if \
    dd \
      "if=${IN_DEVICE}" \
      "of=${BACKUP_IMG}" \
      "ibs=${MB}" "obs=${MB}" \
      "count=1" \
      "skip=${i}" "seek=${i}" \
      conv=notrunc ; \
  then
    echo "backed up MB nr. ${i}"
  else
    failcounter=$(( ${failcounter} + 1 ))
    echo "${failcounter}. error: couldn't backup MB nr. $i" >&2
    echo "${i}" >> "${LOGFILE}"
    logger -p user.err "couldn't backup MB nr. $i"
  fi
  sleep 0.5
done

echo "Got ${failcounter} failures"
exit ${failcounter}
  • Thank you for the insight; I did not have my computer with me, so I had to type this from memory. I can explore the drive on Linux, as the original OS was Windows. But any major operation does kill the device; even a live Clonezilla with a rescue flag didn't help, it actually broke Clonezilla. Commented Jun 7, 2022 at 8:06
  • That being said, I am not a Python coder and do not know where to get started on this type of operation. I have bought a backup SSD to copy the files onto, but for coding I'm more of a Windows/C#/JS programmer, so this level of operation and its technical side I only know in theory. Commented Jun 7, 2022 at 8:09
  • I can't accept this answer as it doesn't resolve my issue for data recovery, only offering insight as to the cause of the hardware fault. Commented Jun 11, 2022 at 3:40
  • hey, I kind of tried to address that in the last three paragraphs of my answer. Commented Jun 11, 2022 at 10:39
  • I understand the concept, and I know what Python is, but not every Linux user knows how to program in Python or how to do this kind of thing. A solution is needed, not simply a re-iteration of my initial assumption of staggered copying. As mentioned, I did attempt a block copy with Clonezilla's rescue flag, as an indication of what solutions I am able to run. Commented Jun 12, 2022 at 2:28
