I've had my share of disk failures over the years. And I've learned my lessons from them.
Those lessons are:
a) back up
b) back up
c) back up again and take it offsite
Or in short: if your data (substitute the RAW file, the business spreadsheet, the bitcoin wallet, ...) doesn't exist three times, with one of those three copies far enough away that it doesn't get fried in the same fire, then that data is not safe. Trust me, I've been there.
Here's a look at my backup environment (sorry, this gets a bit techie for a moment):
To prevent the worst, I do a Time Machine backup, which by default runs once an hour, to a 4-slot Drobo (2nd gen, connected via FW800). I then run a nightly mirror of the system drive to an external USB drive, just to be able to quickly recover from a crash on that drive. Then I have two 2TB USB drives that I use to back up my most critical data (pictures, business stuff), and I'm using CrashPlan for that. Both CrashPlan drives contain the same backup. One of those drives is always in my car, so CrashPlan complains a bit about a missing drive, but it backs up to the other one regardless. I'm willing to live with the complaining for the convenience and safety. Every couple of weeks I take the drive from the car, hook it up to the computer and take the other drive to the car. Overnight, the drive at the computer is brought up to date.
This way I'm protected from pretty much everything, including a fire at the studio.
So far so good. (knock on wood)
The only weak link is that the Drobo doesn't just hold backups, it also holds projects from the last 10 years. Multitrack hard disk recordings, mixes, masters, etc. - too much data to put on the external 2TB backup drives.
So these projects are the data that I would lose in case of a disaster at the studio.
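For the techies, here's a rough Python sketch of how the "three copies, one offsite" rule plays out for my two kinds of data. The `Copy` class, the `is_safe` function and the inventory are made up purely for illustration, and the list of copies is a simplified assumption based on the setup described above, not an exact inventory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One copy of a data set (hypothetical model, for illustration only)."""
    device: str
    offsite: bool = False


def is_safe(copies, min_copies=3):
    """Rule of thumb from this post: at least three copies, at least one offsite."""
    return len(copies) >= min_copies and any(c.offsite for c in copies)


# Simplified assumption of where each data set lives.
inventory = {
    "critical data": [
        Copy("internal drive"),
        Copy("CrashPlan drive at the computer"),
        Copy("CrashPlan drive in the car", offsite=True),
    ],
    "studio projects": [
        Copy("Drobo"),
    ],
}


def report(inventory):
    """Map each data set name to True (safe) or False (not safe)."""
    return {name: is_safe(copies) for name, copies in inventory.items()}


print(report(inventory))
# → {'critical data': True, 'studio projects': False}
```

The critical data passes, the studio projects don't. That single `Copy("Drobo")` entry is the weak link.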
Up until last week, my definition of disaster was something along the lines of fire, flood, airplane crashing into the building... What was not within that definition was that Drobo wouldn't be able to notice bad blocks on one of the four drives.
Which is what I believe happened to me. At some point the Drobo turned very, very slow. It usually does that when the disks get close to being full. This time it's far from full, but it's still very, very slow. And the 2nd gen Drobo isn't really fast to begin with.
So when it started to misbehave, I ran Disk Utility. Disk Utility found an error, tried to fix it and gave up after a while.
At that point I tried Disk Warrior, the last-resort secret weapon for disk issues on the Mac. It has saved my butt several times, especially with HFS+, which seems to corrode after a while. Disk Warrior also gave up, telling me the disk had bad blocks. As Drobo presents its disk pack as one logical volume, Disk Warrior was unable to tell me which of the four disks was the bad one.
Imagine my panic. I didn't want to lose my project data, so the first thing I did was file a support request with Drobo. They asked me to send them a diagnostic file from Drobo Dashboard, which I did. They examined it and then said it didn't show any hardware issues with any of the drives. No bad blocks.
Hmmmmm... who to believe? Drobo, saying the drives are fine? Or Disk Warrior, which says there are bad blocks on a drive? The symptoms of slowness are real.
Long story short, Disk Warrior support went the extra mile. They even logged into my system to do some support magic and managed to run the software on the Drobo without it giving up. That gave me the opportunity to copy the files off the Drobo to yet another external drive, which at that point I had gone out and acquired.
So here I am, at the end of a week of copying the most important files off my Drobo (4 x 2TB) onto an external USB drive (3TB).
Next step: take the drives out of the Drobo, and identify the drive that has the bad blocks, which should be easy to find, as it's noted in the S.M.A.R.T. info.
And *if* the bad blocks are in the S.M.A.R.T. data, then that's the point where Drobo will instantly turn useless to me, because it should've been able to identify the bad blocks and it failed. But we're not quite there yet, I still have to run the tests.
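For the curious, here's a small Python sketch of the kind of check I have in mind. It assumes each drive is hooked up directly to the computer (not through the Drobo) and that the S.M.A.R.T. attribute table has been dumped as text with `smartctl -A` from smartmontools. The function names are my own, the sample output is invented for illustration, and which attributes count as "bad block" indicators is a common rule of thumb rather than anything official.

```python
# Attribute names as printed by smartctl that commonly point to bad blocks.
BAD_BLOCK_ATTRS = (
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
)


def bad_block_counts(smartctl_output):
    """Extract raw values of bad-block-related attributes from `smartctl -A` text."""
    counts = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have 10 columns: ID, name, flag, value, worst,
        # thresh, type, updated, when_failed, raw value.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in BAD_BLOCK_ATTRS:
            raw = fields[9]
            if raw.isdigit():
                counts[fields[1]] = int(raw)
    return counts


def looks_bad(counts):
    """True if any bad-block-related counter is above zero."""
    return any(value > 0 for value in counts.values())


# Invented sample output, NOT taken from one of my drives.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       12
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       3
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
"""

counts = bad_block_counts(SAMPLE)
print(counts, looks_bad(counts))
```

If one of the four drives reports non-zero counters like these while the other three stay at zero, that would be the suspect.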
Let's find out if I'll have to find a better hardware solution. I'll keep you updated.