Earlier in the week, I read a story online about some poor photographer soul who lost his entire photo library when his hard drive failed. Following that, I was chatting with a friend in the /Pro division at Apple Computer — he was shaking his head about how few photographers (pro and otherwise) seem to “get” the principles of backing up and storing their image libraries.
So with that in mind, I figured I’d share our studio’s backup protocol. This post is long and is directed at photographers, videographers, and creative studios, and is NOT for the weak of heart. It’s not like most of the snippets I like to post, but I think it’s important and there’s not much info out there for those of us who need powerful backup solutions. While no backup solution is perfect, we’ve worked for many years to develop a system that stands up to the rigors of our busy office, so here’s to sharing for anyone who cares about this stuff… BTW, I owe a big thanks to CreativeTechs here in Seattle for helping us outline this program and for supporting it as well.
Now, before we get our hands dirty, a word on price and robustness: yes, the solution I outline is expensive, but it is robust and is meant to handle a huge amount of data. (e.g., I shot 35,000 pictures for one job last month.) Do you have to go this big? No. If you’re a busy commercial pro, you probably should. But if you’re growing your business, or you’re just interested in backing up your hobby snapshots, you probably don’t need all the bells and whistles and don’t have to spend yourself silly. What’s important here is that the fundamental protocol I’m outlining is solid and should be mimicked. And most importantly, it is scalable. If you don’t need 7TB to keep your images, use a 1TB solution, etc. If you’re low on ducket$, consider cheaper (or smaller) hardware, but the basic premises are the same. Now then…
ON SITE PORTION OF THE SOLUTION: Our studio runs a network of many computers linked together at a hub, which speaks directly to a central file-serving computer. This “server” can be any computer, really; in our case it is a Mac G5 tower, but it could be an Xserve or a Mac mini. The point is that its sole job is to serve files to the rest of the computers on the network. This server’s external hard drive is the focus of this post. In our case, we upload all our raw photography data onto Apple’s Xserve RAID (photo atop this post). This is a giant hard drive (7TB) that writes (and retrieves) data seamlessly across 14 different drives in an array. That’s fancy terminology that basically means the drives all sync together to act like one drive, but in reality they’re separate drives arranged in such a way that if one drive fails, the server can identify it and, once you replace the defunct drive, re-create the data that was on the dead one. It circumvents the horror of all your data living on a single bulk hard drive: if that kind of drive fails, you’re toast. By spreading the data over several drives, you’re minimizing your risks. If one drive fails, you’re covered; and theoretically, multiple separate drives are far less likely to die at the same time. Redundancy is the key. NOTE: If you can afford the Xserve RAID, get it — wait, scratch that: get it; it’s sweet and it’s scalable. If you need to step down, look into buying or building your own RAID array (some sources here). If that’s still too spendy, have your computer mirror (write an exact copy) to two separate external drives (LaCie or comparable). Off-the-shelf software to help with this lower-end solution is a simple Google search away.
OFF SITE PORTION OF THE SOLUTION: Now, the RAID takes care of any on-site single-drive failures. You’re backed up at the office. But what about a fire? What if the entire building gets crushed in an earthquake, RAID and all? You need to have at least one copy of your data at a secure location off site. In our case, we purchase a unique hard drive for every job (for commercial clients, we bill them for this and they don’t mind; they think it’s wicked-smart how good-n-backed-up we, and they, are). We do NOT recommend DVDs or CDs. They are more volatile than hard disks. The data for the job gets put directly onto these individual drives and gets archived off-site. Thus, we’re backed up in case of drive failure AND in case of a dramatic catastrophe. A few smarty-pants folks out there might now be saying: “what if your array gets burned in a fire and your off-site hard disk fails?” In that case, we cry. We’re betting, as all backup systems do, that our redundancy measures will outperform most disastrous situations that occur.
OTHER DATA? Note: The above is our solution for the RAW photo data, which is created in intense bursts of large piles of data (shoots), not usually a small daily trickle. All our images live in their original RAW form (plus sidecar files) on the Xserve RAID (redundant) AND off site. But what about client work, adjusted image drafts, delivered images, post production in progress, invoices, and all the other data that gets changed or updated on a day-to-day, “trickle” basis? We call this our LIVE (rhymes with hive) WORK and it’s handled in a slightly different manner. It still lives (in a separate partition) on our Xserve RAID, and thus has built-in, on-site redundancy. To remedy the off-site portion of the equation we use automated backup software called Retrospect and three (3) separate 1TB drives we call A(1-10), B(11-20), and C(21-31). Two drives live on-site at any given time, and our IT support group has configured Retrospect to write all of our live work to one of those drives each night, alternating between the two from night to night (it writes everything to drive A(1-10) on Monday and to drive B(11-20) on Tuesday; overwrites A(1-10) with Wednesday’s updates, overwrites B(11-20) with Thursday’s updates, etc.). Then, after each 10-day period (in our case the names of the drives correspond to the calendar dates during each month that a particular drive lives off-site), we swap the third drive C(21-31), previously living off site, with the next drive in sequence, the A(1-10) drive. The system continues, now writing to B(11-20) and C(21-31) on-site, while A(1-10) spends the next 10 days off-site.
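The drive-naming scheme above can be sketched as a tiny lookup: the name of each drive tells you the calendar dates during which it lives off-site, and the other two are on-site and in the nightly rotation. This is just an illustration of the schedule, not anything Retrospect itself computes:

```python
DRIVES = ["A(1-10)", "B(11-20)", "C(21-31)"]

def offsite_drive(day_of_month: int) -> str:
    """Which drive lives off-site on a given calendar day of the month."""
    if 1 <= day_of_month <= 10:
        return "A(1-10)"
    if 11 <= day_of_month <= 20:
        return "B(11-20)"
    return "C(21-31)"

def onsite_drives(day_of_month: int) -> list[str]:
    """The two drives Retrospect alternates between each night."""
    off = offsite_drive(day_of_month)
    return [d for d in DRIVES if d != off]
```

So on, say, the 15th of the month, B(11-20) is in the vault and the nightly backups alternate between A(1-10) and C(21-31).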
Live Work Summary: What the above accomplishes is important and is based on the same basic principle of redundancy we use for storing our raw photo data. If a single RAID drive fails, we replace that drive in the array and continue working. If all 14 drives in the RAID fail, we still have at least yesterday’s live work on one of the 1TB drives A, B, or C, whichever was written to most recently. If the office gets firebombed, the most we’re out is 10 days’ worth of live work, and we can rebuild using whichever drive was off-site.
Whew! Is all that overkill? Maybe, but the principles behind it are not. Is it smart to be paranoid and extra safe? Probably. I hope that helps frame how important I believe it is to back up your photographs and your work. Take some, all, or none of this info and put it to use in your studio.
Again, depending on where you are in the world of photography, videography, or similar, this is scalable. Regardless of how big or important you view your collection to be, I recommend getting hold of a tech outfit like CreativeTechs, even if it’s just for an hour, to analyze your needs and make some recommendations. Treat yourself to a New Year’s present and invest in protecting your collection. Good luck!