Friday, March 29, 2024

RAID - Data Recovery with Open-Source Tools (part 7)

This is part 7 of a multi-part series. See part 1 for the beginning of the series.

Software RAID

It's becoming increasingly common (in 2009) for desktop PCs to use some form of BIOS-based software RAID. In most cases, dealing with a single-drive failure in a software RAID isn't terribly difficult. For example, with NVIDIA's software RAID, even when one drive in a stripe (RAID 0) set fails, if the drive is recoverable you can simply clone it to a new, identically sized drive and the RAID will just work. Unfortunately, it isn't so simple with Intel's software RAID, which appears to store the serial numbers of the member drives in the RAID metadata, meaning an exact clone won't work. While it would most likely be possible to edit the RAID metadata with hexedit to update the drive information, a somewhat simpler solution is to make a backup clone of every drive in the array, re-create the RAID in the RAID BIOS exactly as it was before, and then boot into Linux and run testdisk on the RAID device. More on that in part 8.

Most often, the RAID metadata for drives in a software RAID volume is stored toward the end of the drive. In some cases, if you are forced to clone a failing RAID drive to a larger drive, you can make Linux (and possibly the BIOS and Windows) see the drive as a RAID device by copying the last few blocks of the failing drive to the last few blocks of the replacement drive:

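# Drive sizes in KiB (blockdev --getsz reports 512-byte sectors); sda is the failing drive, sdb the replacement
old_end=$(( $( blockdev --getsz /dev/sda ) / 2 ))
end=$(( $( blockdev --getsz /dev/sdb ) / 2 ))
# Copy the last 1 MiB of sda onto the last 1 MiB of sdb
dd_rescue -d -D -b 4k -B 4k -s $(( $old_end - 1024 ))k -S $(( $end - 1024 ))k /dev/sda /dev/sdb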
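
After a copy like this, it may be worth confirming that the tail of the replacement drive really matches the tail of the failing drive before handing it back to the RAID BIOS. What follows is only a minimal sketch and not part of the original procedure: tails_match is a hypothetical helper name, it assumes tail, cmp and mktemp are available, and it assumes tail -c can read from the end of the device. That holds for regular image files; on a block device, test it on a known-good drive first.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the last BYTES bytes of two files or devices are identical.
# Usage: tails_match SOURCE DEST [BYTES]   (BYTES defaults to 1 MiB, the amount copied above)
tails_match() {
    src=$1
    dst=$2
    bytes=${3:-1048576}
    tmp=$(mktemp) || return 2
    # Save the tail of the source, then compare the tail of the destination against it.
    tail -c "$bytes" "$src" > "$tmp" || { rm -f "$tmp"; return 2; }
    tail -c "$bytes" "$dst" | cmp -s - "$tmp"
    status=$?
    rm -f "$tmp"
    return $status
}
```

Something like tails_match /dev/sda /dev/sdb && echo "tail regions match" would then report whether the copied region is identical on both drives. A mismatch doesn't tell you what went wrong, only that the metadata region should be copied again or inspected by hand.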

Hardware RAID

Unfortunately, the ways that hardware RAID controllers store metadata tend to be less predictable than those of software RAID. If you attach a hardware RAID member drive to a non-RAID controller, some of the tricks mentioned above might work, but there are no guarantees.

Also be aware that hardware RAID controllers are very likely to take a drive offline at the first sign of an error, rather than report the error and continue as most non-RAID controllers would. While this makes hardware RAID controllers largely unusable for data recovery, it does mean that a failing RAID member drive is quite likely to be recoverable once it is attached to a non-RAID controller.

To be continued in part 8.
