Backup Tedium Part III: Planning the Strategy

Here we are then, back by popular demand for the final installment of Backup Tedium. It’s time to nail our colours to the mast and work out a plan that will protect us, come what may.

So, last time, we took an overview of the possible risks. This time, we’re going to categorise those risks in more depth and put appropriate protections in place for them.

Does it matter?

We’re going to look at all eventualities and all the respective protection measures here. However, for each different situation, we need to consider not just the risks but the consequence of the loss. So first, we need to look at the value of our data and the consequences to us of losing it.

Let’s take an example - sample libraries. Commonly, you might expect to have several terabytes of sample libraries that you call on from time to time. They’re quite possibly on one or two drives that are as susceptible to failure as any other. What happens if those drives die?

Have you lost all your libraries forever? Pretty unlikely. You can probably download the majority again. There may be a few outliers that are harder to track down, but it’s unlikely anything super-critical is lost forever.

Would that reconstruction process be a pleasurable experience? Of course not.

Even if you do have a backup, are you just going to be able to plug it in and carry on where you left off? No. You’ll doubtless have to go through the whole re-linking dance in its various guises before you are back where you were.

So, in this instance, it’s sensible to have a backup - it will save you a few download hours - but it’s not the end of the world if you don’t have it, or if it’s not quite up-to-date. Perhaps, then, you might choose simply to have a spare copy of your library somewhere, maybe on a cheaper backup drive, that you can move to a faster drive should you need it.

We all tend to blindfold ourselves when it comes to our data and assume that our worlds will implode if we lose any of it. Before you start on your backup strategy, it’s a really good idea to think about what would actually happen if you lost your data. In most situations it isn’t as bad as you thought it would be.

Back to the 3-2-1?

Well, not quite, but we are going to look at a three-tier strategy: primary, secondary and tertiary data storage. It’s not as scary as it sounds…

Primary Store

This is the storage that you use day-to-day. The data you’re trying to protect. You’re probably constantly accessing it and putting it at risk - silly you. Typically, it is the drives in your computer or the external SSDs that you use for project and sample storage. It might also include your archive, if you have such a thing.

Secondary Store

This is the level of backup that many of you probably already have in place, though, if you’re like the rest of the world, you’re probably always thinking that you ought to update it one day… This applies to any secondary local copy of your data - i.e. in the same building, or the same site.

Tertiary Store

The tertiary store is the new kid on the block for most of us. In times past, this would have been the tape that you took home with you in the evening, but now it’s more commonly some kind of cloud storage or cloud-accessed facility.

Do you need all three? Not necessarily. Again, it depends on what you’re doing.

If we were talking about word-processing documents, you might reasonably feel that, if you are synchronising your local documents with Dropbox or Google Drive or suchlike, you don’t really need a secondary store. How likely is it that Dropbox will lose your data? And even if that did happen, what are the chances of you losing your primary storage (your local copies) at the same time? Are you willing to take that risk for the sake of your documents?

One of the key things to remember about your tertiary store, in current times at least, is its accessibility. Many of you will be working with multi-gigabyte projects that add up to terabytes. Firstly, how long will it take you to upload all of that in the first place? What happens if you suffer a loss before that process is complete? Then, what happens if you lose your entire primary store? Do you have time to wait for it all to download?
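To put rough numbers on those questions, here is a minimal back-of-envelope sketch. The 4 TB project size, the 20 and 100 Mbit/s link speeds and the 80% efficiency allowance are all assumptions chosen for illustration; substitute your own figures.

```python
def transfer_hours(size_tb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate the hours needed to move size_tb terabytes over a link_mbps connection.

    efficiency is an assumed allowance for protocol overhead and contention.
    """
    size_bits = size_tb * 1e12 * 8              # decimal terabytes -> bits
    usable_bps = link_mbps * 1e6 * efficiency   # megabits per second -> usable bits per second
    return size_bits / usable_bps / 3600        # seconds -> hours


if __name__ == "__main__":
    # Hypothetical figures: 4 TB of projects, 20 Mbit/s upload, 100 Mbit/s download.
    print(f"Initial upload: {transfer_hours(4, 20) / 24:.1f} days")
    print(f"Full restore:   {transfer_hours(4, 100) / 24:.1f} days")
```

With these made-up figures, the initial upload works out at a little over three weeks of continuous transfer and a full restore at between four and five days, which is exactly the kind of gap you need to plan around.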

Beyond Duplication

A proper data strategy relies on more than just duplicates of your data. These days, technology has some less sledgehammer-like solutions for us:-

RAID

This probably deserves an article all of its own but, briefly, RAID allows us to use an array of drives together in order to limit the consequences of a single drive failure. It stands for Redundant Array of Inexpensive Disks (these days often given as Independent Disks) and, although there are multiple different flavours now, the principle is broadly the same. Typically, the equivalent of one or two drives’ worth of capacity is given over to redundancy. That is, the array stores extra information calculated from the data on the other drives so that, if one (or sometimes two) of them fails, it can rebuild that data onto a new drive and you can carry on working. Typically, RAID systems are not suitable for primary storage (especially for audio), so you might expect to use a RAID system as part of your secondary or tertiary store.
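To see how an array can rebuild a failed drive from the others, here is a minimal sketch of single-parity redundancy using XOR. It is a simplification: the three "drives" and their contents are invented, and real arrays spread parity across all the drives in stripes rather than keeping it in one place.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data drives, each holding one 4-byte block.
drives = [b"KICK", b"SNAR", b"HATS"]

# The redundancy information: the XOR of every data block.
parity = xor_blocks(drives)

# Drive 1 dies. XOR-ing the surviving blocks with the parity recovers its contents.
survivors = [drives[0], drives[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == drives[1])
```

Note that the parity block is the same size as one data block, however many drives contribute to it, which is why only one drive's worth of capacity is given up to survive a single failure.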

Snapshots

With many of the new-generation intelligent RAID systems (such as Synology, which we use a lot), there is a system of snapshotting, whereby the system records the state of your data on a regular basis. A snapshot is not a duplicate of your data, but a recipe to put the data back to the way it was at a particular point in time, so it doesn’t take up a lot of space. This is probably one of the most common quick fixes to the “I really wish I hadn’t done that” scenarios.
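The "recipe, not duplicate" idea can be illustrated with a toy model. This is only a sketch under stated assumptions: the SnapshotStore class and the file names are hypothetical, and real systems generally work on filesystem blocks rather than whole files, but the principle of recording references instead of copying data is the same.

```python
import hashlib


class SnapshotStore:
    """Toy store: each distinct content is kept once, snapshots only record references."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}                # content hash -> data
        self.live: dict[str, str] = {}                   # file name -> content hash
        self.snapshots: dict[str, dict[str, str]] = {}   # label -> saved name-to-hash map

    def write(self, name: str, data: bytes) -> None:
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)   # identical content is never stored twice
        self.live[name] = digest

    def snapshot(self, label: str) -> None:
        self.snapshots[label] = dict(self.live)   # copies the references, not the data

    def restore(self, label: str) -> None:
        self.live = dict(self.snapshots[label])

    def read(self, name: str) -> bytes:
        return self.blobs[self.live[name]]


store = SnapshotStore()
store.write("mix.session", b"version 1")
store.snapshot("monday")
store.write("mix.session", b"version 2 - regrettable")
store.restore("monday")
print(store.read("mix.session"))
```

Taking the snapshot costs only a small table of names and hashes; extra space is used only when content actually changes after the snapshot was taken.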

Synchronised stores

One of our most popular solutions today is synchronised volumes. These allow you to work with your primary storage, while it is being synchronised with your secondary and tertiary storage without any user intervention. Many of our clients have multiple studios and can set up primary stores in each (which effectively double as tertiary stores for one another) to allow them to work from current projects in either location. Very shortly we will be launching a new solution in this area, which will allow you to have a synchronised tertiary store using a cheap drive and a Raspberry Pi!

The Strategy

So, with all of that in mind, here’s our template for planning your strategy:-

Risk | Potential Loss | Strategy

External Forces
Local Disaster - Fire / Flood | All local data stores | Tertiary data store
Widespread Disaster - Accident / Terrorism | All data stores | Multiple data stores - possibly no effective strategy
Theft / Malicious Damage | All local data stores | Secondary / tertiary data stores

Human Error
Accidental Deletion | Specific, identifiable local data | Secondary / tertiary data stores / snapshots
Regrettable Changes | Specific, identifiable local data | Secondary / tertiary snapshots
Accidental Hardware Damage | Primary data store | Secondary / tertiary data stores

Hardware Failure
Drive Failure | Primary or secondary data store | Primary / secondary / tertiary data stores / RAID
Computer Failure | Primary or secondary data store | Primary / secondary / tertiary data stores
Power Failure / Anomaly | Primary or secondary data store | Primary / secondary / tertiary data stores / UPS

Software / Service Failure
Filesystem / Metadata Corruption | All data stores | Error-checking filesystem
File Corruption | All synchronised data | Secondary / tertiary snapshots
Mis-Synchronisation | All synchronised data | Secondary / tertiary snapshots
Cloud Storage Failure | Tertiary data store | Primary / secondary data stores
Backup Software Failure / Demise | Secondary / Tertiary data stores | Multiple backup / synchronisation models

…and a couple of examples of how we might put the principles into action:-

Fig 1: An example of a high-specification remote-replicated data management system

Fig 2: An example of a lower-specification remote-replicated data management system
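The template table above can also be turned into a quick self-check. The sketch below is one possible reading of it, not a definitive encoding: it covers only a handful of the rows, the protection labels are shorthand invented for this example, and each risk is simply mapped to the protections the Strategy column names for it.

```python
# A few rows of the template, reduced to the protections each risk can fall back on.
# The keys and labels are shorthand assumed for this example.
TEMPLATE: dict[str, set[str]] = {
    "Local Disaster": {"tertiary"},
    "Theft / Malicious Damage": {"secondary", "tertiary"},
    "Accidental Deletion": {"secondary", "tertiary", "snapshots"},
    "Regrettable Changes": {"snapshots"},
    "Drive Failure": {"secondary", "tertiary", "raid"},
    "File Corruption": {"snapshots"},
}


def unprotected(in_place: set[str]) -> list[str]:
    """Return the risks for which none of the listed protections is in place."""
    return sorted(risk for risk, options in TEMPLATE.items() if not options & in_place)


# Someone with only a local backup drive and nothing else:
print(unprotected({"secondary"}))
```

Run against your own set-up, a list like this makes it easier to decide which gaps matter for the data you actually care about and which you are content to live with.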

Don’t Have Nightmares

Whatever happened to Nick Ross?

Well, we’ve come to the end of this series now, which you may be very pleased to hear. As I said when we started, it’s not the most exciting of topics, at least not until the worst has happened and, let’s face it, that’s probably not the kind of excitement any of us are looking for.

A proper backup strategy takes a lot of planning and a reasonable amount of investment, which will put many people off, but it’s better to put something in place rather than nothing. So decide what’s really critical now and do something to protect it from the more plausible risks.

Of course, if you just can’t face the tedium, we’re here to help!