Their systems involved shipping a server (effectively an appliance) to the customer with all of the working components on it. However, there was no build or deployment process for these components - so the only way to create a new server was to take an existing one and create a copy.
This was done by opening up a working server running RAID 1, removing one of its disks, and installing it in a new server alongside a blank disk. Let the RAID rebuild the data onto the blank, then put the original disk back in the first server with another blank and let that array rebuild too... result: a copied server!
It is amazing how even fairly technically-savvy people get sucked into the "RAID=backup" mentality. This story (in the above link) ended up costing the business owner tens of thousands of dollars.
Duds are hardware that goes bad: a disk drive, network adapter, NAS, or server. There is an effectively infinite number of ways and combinations in which things can break in a moderate-sized IT shop. How much money and effort are you willing to spend to make sure your weekend isn't ruined by a failed drive?
Floods are catastrophic events, not limited to acts of God. Your datacenter goes bankrupt and drops offline, not letting you access your servers. Fire sprinklers go off in your server room. Do you have a recent copy of your data somewhere else?
Bud is an accident-prone user. He accidentally deleted some files... the accounting files... three weeks ago. Or he downloaded a virus which has slowly been corrupting files on the fileserver. Or Bud's a sysadmin who ran a script meant for the dev server on the production database. How can we get that data back in place quickly before the yelling and firing begins?
There are more possible scenarios (hackers, thieves, auditors, the FBI), but if you're thinking about Dud, Flood, & Bud, you're in better shape than most people are.
Backup and disaster-recovery strategies seem really easy until you think through all the failure modes and realize the old axiom "you don't know what you don't know" exists to make your life full of pain and suffering.
Years ago my customers would literally restore their entire environments onto new metal to verify they had a working disaster recovery plan. Today most clients think having a "cloud backup" is awesome... until they realize, in the moment of disaster, that they are missing little things like software license keys, network settings, local admin passwords on Windows boxes, etc.
This is a feature of Oracle: the redo logs are replicated to the standbys as normal, so you have an up-to-date copy of them on the standby, but they are only applied after an x-hour delay. You can roll the standby forward to any intervening point in time and open it read-only to copy data out.
Less need of it these days with Flashback, of course, but it saved a lot of bacon.
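The delayed-apply idea described above can be sketched in a toy model: redo is shipped to the standby as soon as the primary generates it, but applied only once it is older than the configured delay, which is what lets you roll forward to a point just before a mistake. This is a conceptual sketch only, with hypothetical names; it is not Oracle's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class RedoRecord:
    timestamp: float    # when the change was made on the primary (seconds)
    key: str
    value: str

@dataclass
class DelayedStandby:
    """Toy standby: receives redo immediately, applies it after a delay."""
    delay_seconds: float
    received: list = field(default_factory=list)  # shipped, not yet applied
    data: dict = field(default_factory=dict)      # applied state

    def ship(self, record: RedoRecord) -> None:
        # Redo arrives as soon as the primary generates it.
        self.received.append(record)

    def apply_pending(self, now: float) -> None:
        # Normal operation: apply only redo older than the delay window.
        self.roll_forward_to(now - self.delay_seconds)

    def roll_forward_to(self, point_in_time: float) -> None:
        # Recovery: apply everything up to an arbitrary point in time,
        # e.g. just before someone ran the wrong script.
        remaining = []
        for rec in self.received:
            if rec.timestamp <= point_in_time:
                self.data[rec.key] = rec.value
            else:
                remaining.append(rec)
        self.received = remaining
```

With a 4-hour delay, a corruption written at hour 5 is still sitting unapplied at hour 6, so the standby can be opened with the data as it was before the mistake.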
In those same 15+ years, mostly working for startups, there have been numerous drive failures. Unfortunately, failing (a) to verify backups before there's a failure, and (b) to practice restoring from them, has often meant that a dead drive costs several days' worth of work. In one instance, the VCS admin corrupted the entire repo, there were no backups, that admin was shown the door, and we had to restart from "commit 0" with code pieced together from engineers' individual workstations. That was when I got religious about making and testing backups for my work and the systems I was responsible for...
Not to say that it's the best solution for everyone, but simply that it leaves people no excuse for doing nothing.
http://en.wikipedia.org/wiki/Experience_good
Meaning that even while you're using it, you have no idea if it works.
My contention is that it's not a RAID array if it can silently stop being redundant without telling you.
At best it's a Possibly Redundant Array of Inexpensive Disks.
(The below is how my comment first read.)
(sarcastic) Yeah, it's only prudent to grab a drive out from time to time and make a surprise inspection of whether it's actually filled up a full 4/5th of the way (or whatever) with the actual data the volume is supposed to contain! And the remaining fifth had better look a damn sight like parity information!
Seriously though, a controller that fails like this isn't a RAID controller, because what separates it from a paper plate and a cardboard box? On the paper plate you write "RAID controller" and tape it to an already attached hard drive, and you put the remaining members of the redundant array into the cardboard box. No setup or even connection required!
Seriously seriously though, what you're suggesting is unacceptable. That's not a RAID controller, that's a scam.
My current setup goes as follows:
Servers in colocation get backed up daily to a server in the office. That office server then gets backed up daily to an iosafe.com fireproof and waterproof hard drive in the office, which, when I get a chance, will be bolted to the desk for further security. Clones of that server (which are bootable) are then made biweekly; one is kept in the office and one is taken offsite.
So the office server is the offsite for the colo server and the clone of that is the backup for the office.
The clones allow you to test the backup (hook it up and it boots basically).
Added: Geographically the office is about 3 miles from where the backup of the office is kept. But the office is about 40 miles from where the colo servers are kept.
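A chain like the one above can be written down as data and sanity-checked, for instance to confirm that every dataset has at least one offsite copy (the gap Flood exploits). All names here are hypothetical, a sketch of the setup as I read it:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BackupCopy:
    location: str        # where the copy lives
    interval_days: int   # how often it is refreshed
    offsite: bool        # geographically separate from the primary?

# Hypothetical model of the chain described above:
chain = {
    "colo-servers": [
        BackupCopy("office-server", 1, offsite=True),   # ~40 miles away
    ],
    "office-server": [
        BackupCopy("iosafe-drive", 1, offsite=False),   # same room
        BackupCopy("offsite-clone", 14, offsite=True),  # ~3 miles away
    ],
}

def unprotected(chain: dict) -> list:
    """Datasets with no offsite copy at all."""
    return [name for name, copies in chain.items()
            if not any(c.offsite for c in copies)]
```

In this model both tiers pass the offsite check, which matches the point of the setup: each server's offsite copy is somebody else's local one.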
So: back up your data.
If I ever heard an SA working for me advocate that position, I would probably get them off of my team ASAP.
You still want off-site backups as well of course, in case of something more extreme, but they're usually going to be slower to recover from than nearby backups.
Even if they don't fail simultaneously, the mirror drive may fail or (even more likely) have read errors or flipped bits that will corrupt the restore or render it impossible.
Personally, I don't place much trust in any RAID configuration other than RAIDZ2 (ZFS; you can lose two drives and still recover all your data; every block is checksummed to avoid reading or restoring corrupted data).
But even ZFS can't protect you against accidental deletion, fire, theft, or earthquake.
You just have to structure your redundancy to survive multiple threat models.
In which case, the redundancy offered by RAID alone is grossly insufficient.