Last night, I had the unpleasant experience of having a hard drive “crash” on me. The computer turned on but just sat there, stuck indefinitely on the Ubuntu boot screen. The cause wasn’t too hard to figure out: one of my hard drives had died suddenly. Unfortunately, it was the hard drive to which I’d been diligently moving all my data while preparing the computer to be given away. (Anyone want a giant tower, two monitors and a four-year-old computer that supports up to 8 hard drives?)
In any case, since I was able to determine that the failed drive was SATA and not IDE, the fix turned out to be reasonably scary but simple. (By the way, determining that the drive was broken consisted of unplugging it and confirming that the computer could boot without it, since it was a storage volume and did not have any system files on it.)
First, I allowed the computer to boot up to the point where it stalled. Then I unplugged the SATA cable (NOT the power cable) connecting the drive to the motherboard. The computer was then able to recognize that the boot process had stalled because it could not mount the drive. A prompt came up on the boot screen asking whether I wanted to manually repair (R) the drive or skip mounting (S) it. When I picked manual repair by pressing R, Ubuntu dropped me to a shell prompt.
Once at the shell prompt, you can do something strange – plug the SATA cable back in. For me, the drive was recognized, but error messages flashed up. (Alas, I didn’t write them down.)
At this point, I tried a manual mount on the drive – and it stalled. 20 minutes later, the drive was not mounted, which seemed unusual. My suspicion was that the drive was being read but that the file system was misbehaving. If the drive was truly dead, it should have failed much faster than that.
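Rather than watching a stalled mount for 20 minutes, the attempt can be given a time limit. The following is only a sketch, assuming GNU coreutils' `timeout` is available at the recovery shell. The helper name `run_with_limit`, the device `/dev/sdX1`, and the mount point `/mnt/recovery` are placeholders for illustration, not anything taken from the original session.

```shell
# Run a command, but give up after a time limit instead of waiting indefinitely.
# Usage: run_with_limit SECONDS COMMAND [ARGS...]
# (run_with_limit is a hypothetical helper name, not a standard tool.)
run_with_limit() {
    limit="$1"
    shift
    status=0
    # timeout exits with status 124 when it had to stop the command.
    timeout "$limit" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        echo "gave up after ${limit}s: $*" >&2
    fi
    return "$status"
}

# Example (placeholder device and mount point; run as root):
#   run_with_limit 120 mount /dev/sdX1 /mnt/recovery
```

One caveat: a mount that is blocked deep inside disk I/O may not react to the signal `timeout` sends, so the limit is a convenience, not a guarantee.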
The solution then seemed somewhat intuitive: running “fsck -v /dev/<drivename>”. Surprisingly, it was able to see the drive and began recovering the journal. After about 35 minutes, the drive was restored to a readable state, and all of my data appeared to be intact.
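When fsck finishes, its exit status says whether it actually fixed anything. The sketch below decodes that status using the bit values listed in the fsck(8) manual page. The function name `describe_fsck_status` and the device `/dev/sdX1` are made up for illustration, and the file system should be unmounted while fsck runs.

```shell
# Translate an fsck exit status into readable text.
# The status is a bit mask, so several conditions can be set at once.
# (describe_fsck_status is a hypothetical helper name.)
describe_fsck_status() {
    status="$1"
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    for pair in \
        "1:file system errors corrected" \
        "2:system should be rebooted" \
        "4:file system errors left uncorrected" \
        "8:operational error" \
        "16:usage or syntax error" \
        "32:checking canceled by user request" \
        "128:shared-library error"
    do
        bit=${pair%%:*}    # number before the first colon
        text=${pair#*:}    # description after the first colon
        if [ $((status & bit)) -ne 0 ]; then
            echo "$text"
        fi
    done
    return 0
}

# Example (placeholder device; run as root on an unmounted file system):
#   fsck -v /dev/sdX1
#   describe_fsck_status $?
```

A status of 1 is the good outcome in a case like this one: errors were found and corrected. A status with the 4 bit set means some damage remains.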
Not that there’s a moral to this story or anything, but I’m glad I didn’t give up on this disk as it turned out to be a file system problem, not a hardware problem.
However, there were two important things I did learn: 1) SATA hot swapping actually works on desktop computers. 2) If your EXT3/EXT4 file system fails, it may be indistinguishable from a hardware failure at first glance. When in doubt, try a file system recovery – and SATA is pretty nifty!