Reply to post: Re: At DEC's headquarters in Maynard

Vibrating walls shafted servers at a time the SUN couldn't shine

Anonymous Coward

Re: At DEC's headquarters in Maynard

As another poster said, RAID is an excellent, but not 100% perfect, way to deal with drive failures. Good RAID implementations that I have worked on do the following to try to address this issue:

1. Choose drives in different chassis (e.g. JBODs) for each RAID set. A mechanically failing drive's vibrations are far more likely to affect other drives in the same chassis than drives elsewhere in the cabinet. That way, even if every drive in one chassis fails, each RAID set loses at most one member and remains recoverable.

2. Power down failed drives.

3. Drives rarely fail mechanically without some forewarning (never say never when it comes to the ways in which drives can fail :-). Mechanical failures often start with drive access time increasing (the heads take longer to settle), and software can detect this and take corrective action before things get really bad.
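
To make points 1 and 3 concrete, here is a rough Python sketch. It is not taken from any particular RAID product: the function names, the (drive, chassis) data layout, and the 2x latency threshold are all my own assumptions, picked only for illustration.

```python
from collections import defaultdict
from statistics import median


def pick_raid_set(drives, width):
    """Pick `width` drives for one RAID set, at most one per chassis.

    `drives` is a list of (drive_id, chassis_id) pairs (hypothetical layout).
    """
    by_chassis = defaultdict(list)
    for drive_id, chassis_id in drives:
        by_chassis[chassis_id].append(drive_id)
    if len(by_chassis) < width:
        raise ValueError("not enough chassis for one drive per chassis")
    # Prefer the chassis with the most free drives, breaking ties by name.
    fullest = sorted(by_chassis, key=lambda c: (-len(by_chassis[c]), c))[:width]
    return [by_chassis[c][0] for c in sorted(fullest)]


def is_slowing(latencies_ms, baseline_n=20, recent_n=5, factor=2.0):
    """Return True if recent access times have drifted well above baseline.

    Compares the median of the last `recent_n` samples with the median of
    the first `baseline_n`. The window sizes and `factor` are assumed
    values, not tuned ones.
    """
    if len(latencies_ms) < baseline_n + recent_n:
        return False  # not enough history to judge
    baseline = median(latencies_ms[:baseline_n])
    recent = median(latencies_ms[-recent_n:])
    return recent > factor * baseline
```

A monitor could call is_slowing() on each drive's recent access times and, when it returns True, start copying data to a spare before the drive gets bad enough to need powering down as in point 2. Medians are used so that a single slow I/O does not trigger a false alarm.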

In the main article, I'm surprised the memory failures weren't more obvious. I would have thought the system would be warning of memory ECC errors (assuming by "RAM" they meant DRAM in main memory): first some correctable errors, and then an uncorrectable one that brings down the system.
