Downtime due to disk failure
News, Technical · Saturday, November 10, 2012 @ 03:12 EST
The server's primary disk, a 500GB Seagate SATA drive, crashed hard around October 24. The kernel image loaded, but the main partition would not mount, and attempts to copy the disk to an image to work on kept timing out. I searched around and found a local data recovery company, DTI Data Recovery, that seemed to know what they were doing (I shopped around a little, and another one that I talked to for a while gave me no confidence at all).
DTI Data occupies offices squeezed into a side tunnel in a strip mall in South Pasadena, about 20 minutes' drive from home/work. It's convenient that they're that close; turnaround time by mail, which seems to be the norm for many of their customers across the country, would be much worse. We brought the troubled drive in on October 26, and were quoted two prices: $495 if data could be retrieved without opening up the disk, and $895 if it had to be opened. The second one turned out to be necessary.
The first data I got back, on a 1TB Seagate drive I provided, was a real mess, and it only had /home on it. Directories and files were scattered all over the place; for example, photos were in my mail directory at random locations. At first I thought the drive was just that scrambled, and figured on reorganizing the files based on timestamps and EXIF data. It turned out that much more data (they said all of it) was available, and that there had apparently been an internal miscommunication. The paperwork asked for a list of vital directories, so I listed several, with /home first (then /var, /etc, and others, and finally "everything you can find"), and someone apparently read the list as "he only wants /home". DTI Data was also not very good about communicating with me; I generally had to call them to get any updates. Once I was specifically promised a call later that day, but the recovery ended up taking until Monday; another time I was promised a call back in half an hour, and heard nothing for three hours until I called them.
They did recover some more data the next time, but a lot was still scrambled or missing. Was the disk really that bad? I eventually asked for an image (a block-by-block copy of the disk, rather than recovered files), which they gave me, as with the recovered files, on a Toshiba 1TB USB drive. They warned me it wouldn't mount, but I had figured on that.
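For readers unfamiliar with imaging, here is a minimal Python sketch of what a block-by-block copy that tolerates read errors might look like. It is an illustration only, not what DTI or I actually ran (a dedicated tool such as GNU ddrescue is the usual choice); the names copy_image and read_or_none, the 4K block size, and the chunk size are all assumptions made for the example. Unreadable blocks are zero-filled so that every other block keeps its original offset.

```python
import os


def read_or_none(fd, length, offset):
    """Read exactly `length` bytes at `offset`; return None on error or short read."""
    try:
        data = os.pread(fd, length, offset)
    except OSError:
        return None
    return data if len(data) == length else None


def copy_image(src_path, dst_path, block_size=4096, blocks_per_read=256,
               read=read_or_none):
    """Copy src to dst in large chunks, zero-filling any unreadable block.

    Returns the list of block numbers that could not be read.
    `read` is injectable so failures can be simulated in tests.
    """
    bad_blocks = []
    chunk = block_size * blocks_per_read
    src_fd = os.open(src_path, os.O_RDONLY)
    try:
        size = os.lseek(src_fd, 0, os.SEEK_END)
        with open(dst_path, "wb") as dst:
            offset = 0
            while offset < size:
                want = min(chunk, size - offset)
                data = read(src_fd, want, offset)
                if data is None:
                    # The big read failed: retry one block at a time so that
                    # only the truly bad blocks are lost.
                    data = bytearray()
                    for block_off in range(offset, offset + want, block_size):
                        n = min(block_size, offset + want - block_off)
                        block = read(src_fd, n, block_off)
                        if block is None:
                            bad_blocks.append(block_off // block_size)
                            block = bytes(n)  # keep offsets aligned with zeros
                        data += block
                dst.write(data)
                offset += want
    finally:
        os.close(src_fd)
    return bad_blocks
```

Keeping the list of bad block numbers is the useful part: it tells you later which regions of the image are holes rather than real data.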
It turned out that it did mount with the most basic data recovery due diligence: use the second superblock. (The superblock holds short but key filesystem information, and is fortunately backed up in other block groups, of which there are about 4000.) First I played around with the image a little using libext2fs (from e2fsprogs) (old but relevant PDF documentation), skipping to the second superblock in the ext2fs_open call, reading the root inode, etc., and concluded that the disk wasn't in completely terrible shape. I ran dumpe2fs to dump the superblock and group descriptors for reference, and played around a little with debugfs. I couldn't mount the image as ext3, which it was, because the journal was corrupt (even with noload), but mounting it as ext2 and skipping to the second superblock (read-only, of course) did the trick, and I was able to access most of the files. (At this point I was working on a copy of the image on another 1TB Seagate SATA drive I had bought; copying the 500GB image file from the USB drive with rsync took 4h21m.)
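To make "the second superblock" concrete, here is a small Python sketch that computes where backup superblocks would sit. It rests on assumptions, not on values read from this disk: the default of 8 × block size blocks per group, the sparse_super feature (backups only in group 1 and in groups that are powers of 3, 5, and 7), and a group count of 3726 estimated from a 500GB disk with 4K blocks. The function names are made up for the example; dumpe2fs reports the real values for a given filesystem.

```python
def is_power_of(n, base):
    """True if n is base**k for some k >= 0."""
    while n > 1 and n % base == 0:
        n //= base
    return n == 1


def backup_groups(group_count, sparse_super=True):
    """Block groups carrying a backup superblock (group 0 holds the primary)."""
    if not sparse_super:
        return list(range(1, group_count))
    return [g for g in range(1, group_count)
            if any(is_power_of(g, b) for b in (3, 5, 7))]  # g == 1 is base**0


def backup_superblocks(group_count, block_size=4096, blocks_per_group=None,
                       sparse_super=True):
    """Return (group, filesystem_block, mount_sb_value) for each backup."""
    if blocks_per_group is None:
        # Assumed default: one bitmap block's worth of bits per group.
        blocks_per_group = 8 * block_size
    # With 1K blocks the boot block occupies block 0, shifting groups by one.
    first_data_block = 1 if block_size == 1024 else 0
    result = []
    for g in backup_groups(group_count, sparse_super):
        fs_block = first_data_block + g * blocks_per_group
        # mount's sb= option counts in 1 KiB units, not filesystem blocks.
        result.append((g, fs_block, fs_block * (block_size // 1024)))
    return result


if __name__ == "__main__":
    for group, fs_block, sb in backup_superblocks(3726)[:5]:
        print(f"group {group}: block {fs_block}, mount -o sb={sb}")
```

Under those assumptions the first backup is at filesystem block 32768, which is sb=131072 for mount, since that option is given in 1 KiB units. Mounting read-only as ext2 with that option is the kind of invocation described above.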
In particular, files that DTI had missed completely or scrambled into the strangest locations were plainly accessible; I worry about the recovery software they're using. I ran a find . -type d from / and got a basic idea of which inodes were bad. (At this point, after considerable investigation, I believe the damage is concentrated in a couple of blocks' worth of inodes, at 32 per 4K block.) This is painful for the files that are lost, but then, recovering orphans is just mark-and-sweep (figuring out which goes where is trickier).
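The arithmetic behind "32 per 4K block" and the mark-and-sweep idea can be sketched in a few lines of Python. This is an illustration under assumptions rather than the code I used: 32 inodes per 4K block implies 128-byte inodes, but the inodes-per-group figure of 8192 in the usage note is hypothetical (dumpe2fs reports the real one), and locate_inode, inodes_in_table_block, and find_orphans are names invented here.

```python
from typing import NamedTuple


class InodeLocation(NamedTuple):
    group: int        # block group holding the inode
    table_block: int  # block index within that group's inode table
    slot: int         # position of the inode inside that block


def locate_inode(ino, inodes_per_group, inode_size=128, block_size=4096):
    """Map an inode number (numbering starts at 1) to its place on disk."""
    if ino < 1:
        raise ValueError("inode numbers start at 1")
    per_block = block_size // inode_size  # 32 for 128-byte inodes in 4K blocks
    group, index = divmod(ino - 1, inodes_per_group)
    table_block, slot = divmod(index, per_block)
    return InodeLocation(group, table_block, slot)


def inodes_in_table_block(group, table_block, inodes_per_group,
                          inode_size=128, block_size=4096):
    """All inode numbers lost if one inode-table block is unreadable."""
    per_block = block_size // inode_size
    first = group * inodes_per_group + table_block * per_block + 1
    return range(first, first + per_block)


def find_orphans(root, children, in_use):
    """Mark everything reachable from `root`, then sweep `in_use` for the rest.

    `children` maps a directory inode to the inodes its entries point at.
    On a real filesystem the reserved low-numbered inodes would also need
    to be excluded from the result.
    """
    marked = set()
    stack = [root]
    while stack:
        ino = stack.pop()
        if ino in marked:
            continue
        marked.add(ino)
        stack.extend(children.get(ino, ()))
    return sorted(set(in_use) - marked)
```

For example, with a hypothetical 8192 inodes per group, locate_inode(8193, 8192) lands in group 1, table block 0, slot 0, and one bad block at that position takes out inodes 8193 through 8224. find_orphans(2, {2: [11, 12], 12: [13]}, {2, 11, 12, 13, 40, 41}) reports 40 and 41 as in use but unreachable from the root.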
By this time I had set up a new Gentoo install on yet another 1TB drive and had it running well enough to provide NAT (local networking) services (and to be reachable from my laptop wirelessly). But there was still a long road ahead to restoring web and mail, even if most of the files were there.
Most of the damaged directories were either caches, which don't matter, or mail directories, which do. Since the . entry at the start of a directory's first block points back to the directory's own inode, tracking most of them down was easy (although a scan of the entire disk, even just reading the first 24 bytes of each block, takes 43 minutes); I also logged children of the bad inodes (by matching the .. inode). I ran another scan using a heuristic and then boost::regex (g++'s regex support is terrible) to find blocks that looked like Maildir directories (only a few of the missing directories were more than one block long, and not all of their block entries were bad). From there it remains to determine the known good blocks, subtract them from those found, and figure out which directory each found block belongs to (order doesn't really matter; filenames do not cross block boundaries. Good design there, Rémy/Theodore).
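The two scans can be sketched as follows. This is a Python stand-in for illustration, not the C++/boost::regex code I actually ran. It assumes the standard ext2 directory entry layout (4-byte inode, 2-byte record length, 1-byte name length, 1-byte file type, then the name, all little-endian), under which the . and .. entries take 12 bytes each; that is why the first 24 bytes of a block are enough for the first check. The Maildir filename pattern and the 0.8 threshold are loose guesses made up for the example.

```python
import re
import struct

# inode, rec_len, name_len, file_type (file_type is 0 on very old layouts)
DIRENT_HEADER = struct.Struct("<IHBB")

# Assumed, deliberately loose pattern: time.unique.host with optional ":2,FLAGS"
MAILDIR_NAME = re.compile(rb"\d{9,10}\.[^/:]+\.[^/:]+(?::2,[A-Za-z]*)?")


def parse_dir_head(first24):
    """Return (own_inode, parent_inode) if the bytes start with . and .. entries."""
    if len(first24) < 24:
        return None
    ino, rec_len, name_len, _ = DIRENT_HEADER.unpack_from(first24, 0)
    if ino == 0 or rec_len != 12 or name_len != 1 or first24[8:9] != b".":
        return None
    # rec_len of ".." is not checked: it spans the rest of the block when the
    # directory is otherwise empty.
    parent, _, pname_len, _ = DIRENT_HEADER.unpack_from(first24, 12)
    if parent == 0 or pname_len != 2 or first24[20:22] != b"..":
        return None
    return ino, parent


def parse_dir_block(block):
    """Return [(inode, name), ...] if the whole block parses as entries, else None."""
    entries = []
    offset = 0
    while offset < len(block):
        if offset + DIRENT_HEADER.size > len(block):
            return None
        ino, rec_len, name_len, _ = DIRENT_HEADER.unpack_from(block, offset)
        if (rec_len < 8 or rec_len % 4 or offset + rec_len > len(block)
                or 8 + name_len > rec_len):
            return None
        if ino != 0:  # inode 0 marks an unused entry
            entries.append((ino, bytes(block[offset + 8:offset + 8 + name_len])))
        offset += rec_len
    return entries


def looks_like_maildir_block(block, min_fraction=0.8):
    """Heuristic: most names in the block look like Maildir message files."""
    entries = parse_dir_block(block)
    if not entries:
        return False
    names = [name for _, name in entries if name not in (b".", b"..")]
    if not names:
        return False
    hits = sum(1 for name in names if MAILDIR_NAME.fullmatch(name))
    return hits / len(names) >= min_fraction
```

Because entries never cross block boundaries, parse_dir_block can validate each candidate block on its own, which is what makes it possible to reassemble a directory from loose blocks in any order.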
Services are gradually coming back up: the Apache servers (dynamic and static) for davidrobins.net were first, with i4031.net, my resume, internal address list (needed some recovery), SSL (likewise), my Voluntaryist Wiki, and Code Visions. The next big one to tackle is mail, which involves several programs and their config files (which all need to be copied over or merged appropriately from the old mounted image): Qmail (and Fastforward), Courier-IMAP, Maildrop, Mutt, and Horde/IMP (for Honey). The work continues.
Books finished: Smart and Gets Things Done: Joel Spolsky's Concise Guide to Finding the Best Technical Talent; Peopleware.