Thursday, October 29, 2009

A heroic way to die... for a computer

If you have a long enough history with computers, you have probably seen a hard drive die at least once. On a Windows computer, you'll most likely see a BSOD; then, upon restarting, the computer tells you that it doesn't think your hard drive exists.

So what would a Linux computer do when a similar disaster strikes? Well, you probably wouldn't know for a while unless the computer is actively monitored. That's what happened to me. I had a laptop running Ubuntu 6.06 Server (yes, the old trusted Dapper). The first sign of trouble was that DHCP clients were no longer getting IP addresses from the machine, which was acting as a DHCP3 server. It took me a few moments to figure out what was going on. The tell-tale sign was a console message like this: "Journal writing failed... Abort... root remounted read-only". That's the machine's way of telling me "well, I'm not seeing the disk anymore, so I'm just going to treat it as read-only and move on". The machine was still running, and it still handled some of its other duties fine (like routing between two subnets). But many commands, including shutdown, failed with "I/O error". Once I restarted it with the power button, I saw the familiar "primary disk not found" message.
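Since the failure only shows up if someone is watching, here is a minimal sketch of the kind of check that might have caught it earlier. It is not what I ran at the time. It assumes a Linux box that exposes /proc/mounts, and the function names (mount_options, is_read_only, report) are made up for illustration. The idea is simply to look for the "ro" option on the root mount.

```python
def mount_options(mounts_text, mount_point="/"):
    """Return the option list for mount_point, or None if it is not listed.

    mounts_text is the content of /proc/mounts, where each line looks like:
    device mount_point fs_type options dump pass
    """
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            # A mount point can be listed more than once; keep the last entry.
            options = fields[3].split(",")
    return options


def is_read_only(mounts_text, mount_point="/"):
    """True if mount_point is listed and carries the 'ro' option."""
    options = mount_options(mounts_text, mount_point)
    return options is not None and "ro" in options


def report(path="/proc/mounts", mount_point="/"):
    """Read the mounts file and return a one-line status message."""
    with open(path) as f:
        text = f.read()
    if is_read_only(text, mount_point):
        return "WARNING: %s is mounted read-only" % mount_point
    return "OK: %s is not mounted read-only" % mount_point
```

Something like print(report()) run from cron on a schedule would be one way to use it. The catch is that a machine with a dead disk may not be able to start new programs at all, so a check from another machine is probably the more reliable way to monitor.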

I was thoroughly impressed. Imagine a warrior who has his legs cut off and even the ground yanked out from under him, yet keeps on fighting. That's what the O/S did. Maybe I'm naive - hey, I was Windows-only at one time. :)

BTW, that Dapper install was running on LVM.

