1. Introduction
In this tutorial, we’ll briefly introduce backups, along with their types, strategies, and policies. For everyone who relies on digital information systems, one of the first questions that comes to mind is: “is our data safe?” or better yet: “how safe is our data?”. In fact, those are very good questions. Many threats endanger information systems nowadays: hardware failures, malicious attacks (such as ransomware, logic bombs, and viruses), software bugs, and usage mistakes, to name a few.
2. Backup: What Is It and Why Is It Important?
By backup, we mean a copy of data that can be used to recover the system in case of data loss. Indeed, there are so many ways for data to be lost that it is impossible to anticipate them all. So, even though there are great solutions to safeguard data, such as RAID, the best option is to keep as many copies as necessary. Also, backups let us set objectives such as recovering the data as it was on a previous date.
3. Backup Objectives
Let us say that we have a payroll system whose availability we must ensure. The system receives daily changes as people are hired, retire, or are laid off. Also, it calculates the paychecks, taxes, and so on every month. Therefore, its data is constantly changing. So, in the event of a data loss, we would ideally want to recover as soon as possible and still be able to manage all employees, even the ones we hired just before the event. If that is not possible, we’ll try to minimize the need for manual data entry. That exemplifies the main concepts behind backup objectives: how long it’ll take to resume normal operations and what data will be recovered.
- Recovery Point Objective (RPO): The point in time, regarding the data state, to which we want the system to return after recovery. It is an important metric in backup system design, as we’ll see
- Recovery Time Objective (RTO): Establishes the maximum downtime we can afford, measured between the failure and its recovery
- Backup Window: The longest time we’ll have to create the backup. We must note that the backup process is quite demanding on I/O, so in most cases, we should avoid running it during work hours
- Data Retention: Defines how far in the past we may need to get data back
As we can see, the above metrics are closely tied to the actual business needs. Also, they directly affect the backup system’s sizing, cost, and viability. Furthermore, they can differ a lot between domains. For instance, tax data needs to be available for as long as the revenue service may require it for auditing purposes, usually a few years. On the other hand, employee information may be needed for decades after the employment contract has ended. To make things worse, for multinational companies, each country’s legal requirements will lead to different objectives for similar data. If the data is not partitioned by country, we should adapt the policy to the most demanding goals: supporting every Recovery Point Objective required in any situation, with the shortest Recovery Time Objective and the longest Data Retention.
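The rule of adopting the most demanding goals can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `BackupObjectives` class, the `most_demanding` function, and the sample figures for the two countries are hypothetical and do not come from any real regulation or library.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class BackupObjectives:
    """Hypothetical container for the objectives of one jurisdiction."""
    recovery_points: frozenset  # labels such as "daily" or "after-payroll"
    rto: timedelta              # maximum downtime we can afford
    retention: timedelta        # how far in the past data must be recoverable


def most_demanding(objectives):
    """Combine several sets of objectives into the strictest common policy."""
    objectives = list(objectives)
    if not objectives:
        raise ValueError("at least one set of objectives is required")
    return BackupObjectives(
        # Every recovery point required anywhere must be supported.
        recovery_points=frozenset().union(*(o.recovery_points for o in objectives)),
        # The shortest allowed downtime wins.
        rto=min(o.rto for o in objectives),
        # The longest retention period wins.
        retention=max(o.retention for o in objectives),
    )


# Made-up example values for two countries sharing the same data set.
country_a = BackupObjectives(
    frozenset({"daily"}), timedelta(hours=8), timedelta(days=5 * 365)
)
country_b = BackupObjectives(
    frozenset({"daily", "after-payroll"}), timedelta(hours=4), timedelta(days=30 * 365)
)
combined = most_demanding([country_a, country_b])
```

Here, `combined` keeps both recovery points, the 4-hour Recovery Time Objective, and the 30-year retention, which is why unpartitioned data tends to make the whole backup system more expensive.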
4. Backup Types
Now, let’s review some of the major classifications for backups.
4.1. Backup Types
First, regarding the recovery point objectives we may have:
- Point-in-time backup: the Recovery Point Objective is at specific dates or occasions, for instance, daily, weekly, monthly, or after each payroll processing. Some data loss is acceptable. In this case, the cost of recreating the lost transactions should be lower than the cost of having a continuous backup. Moreover, the backup data can be easily moved to offline media and off-site storage
- Continuous backup: the data is continuously backed up. Minimal or no data loss is acceptable. The main issue here is that the backup storage must always be online, which increases costs and the risk of incidents that may corrupt the backup data. Ransomware attacks, for instance, might affect both copies of the data
- Hybrid: this is the usual and recommended case. We combine a continuous backup, to minimize data loss, with an auxiliary point-in-time backup system so that, if everything else fails, we’ll always be able to recover most or all of the data. For instance, if ransomware destroys critical data, we can “go back in time” and recover most of it
4.2. Backup Strategies
Now, regarding backup strategies, we can also choose from:
- Full backup: contains all data, takes longer to create, and is the easiest to recover
- Differential backup: contains the data that changed since the last full backup. As time passes after the full backup, it grows larger. It also takes longer to restore than a full backup alone, since we must restore the full backup first and then a single differential backup
- Incremental backup: contains the data that changed since the last backup, whether full, differential, or incremental. It is the quickest to create and the slowest to recover. Recovery needs the last full backup, the last differential (if one exists), and all the incremental backups taken after it, up to the recovery point objective
The figure shows how they differ from each other:
In practice, we will most likely create a policy that mixes strategies to ensure that the needed Recovery Point Objectives are achievable within the Backup Window and that recovery can be done within the required Recovery Time Objectives.
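To make the recovery rules for mixed strategies concrete, here is a minimal Python sketch that selects which backups must be restored, and in what order, to reach a given recovery point. The `Backup` class, the `restore_chain` function, and the sample schedule are hypothetical. The sketch also assumes that each incremental backup is taken relative to the most recent backup of any kind.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Backup:
    kind: str           # "full", "differential", or "incremental"
    taken_at: datetime  # when the backup was created


def restore_chain(backups, recovery_point):
    """Return the backups to restore, in order, to reach recovery_point."""
    # Only backups taken at or before the recovery point are usable.
    candidates = sorted(
        (b for b in backups if b.taken_at <= recovery_point),
        key=lambda b: b.taken_at,
    )
    fulls = [b for b in candidates if b.kind == "full"]
    if not fulls:
        raise ValueError("no full backup at or before the recovery point")

    base = fulls[-1]  # the most recent full backup
    chain = [base]
    after_base = [b for b in candidates if b.taken_at > base.taken_at]

    # A differential already holds every change since the full backup,
    # so only the latest one is needed.
    differentials = [b for b in after_base if b.kind == "differential"]
    start = differentials[-1] if differentials else base
    if differentials:
        chain.append(start)

    # Finally, apply every incremental taken after that starting point.
    chain.extend(
        b for b in after_base
        if b.kind == "incremental" and b.taken_at > start.taken_at
    )
    return chain


# Made-up schedule: a full backup on day 1, incrementals on days 2, 3, 5,
# and 6, and a differential on day 4.
schedule = [
    Backup("full", datetime(2024, 1, 1)),
    Backup("incremental", datetime(2024, 1, 2)),
    Backup("incremental", datetime(2024, 1, 3)),
    Backup("differential", datetime(2024, 1, 4)),
    Backup("incremental", datetime(2024, 1, 5)),
    Backup("incremental", datetime(2024, 1, 6)),
]
chain_day_6 = restore_chain(schedule, datetime(2024, 1, 6))
chain_day_3 = restore_chain(schedule, datetime(2024, 1, 3))
```

For day 6, the chain is the full backup, the day 4 differential, and the two incrementals that follow it. For day 3, it is the full backup plus the first two incrementals. The longer the chain, the longer the recovery takes, which is what ties the choice of strategy to the Recovery Time Objective.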
5. Designing a Backup Policy
A backup policy is a document that defines, for a set of systems, how their data will be protected, taking into account their specific objectives. It should cover:
- Backup schedule
- Backup strategies
- Data sources and backup scope
- Procedures for backup and for full and partial recovery
- Roles and responsibilities
- Legal requirements applicable
- Backup objectives
- Media storage, including off-site
6. Designing a Backup Solution
The key considerations for correctly sizing a backup solution are the total amount of data it must store, how much the data changes over time, the backup window it must operate within, and the recovery time objectives. Those metrics will help establish the media types, the throughput, and the system’s total capacity. Regarding the total storage, most solutions can apply data compression and data deduplication to keep costs a little lower. For external backup solutions, cloud-based ones for instance, we must also consider encrypting the data.
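As a rough illustration of how these metrics interact, the following Python sketch estimates the average throughput needed to fit a backup into its window and the capacity needed for a simple retention scheme. The function names, the scheme (a number of retained full backups plus daily incrementals), and all figures are assumptions made for this example. Real compression and deduplication ratios depend heavily on the data.

```python
def required_throughput_mb_s(data_gb, window_hours):
    """Average throughput (MB/s) needed to copy data_gb within the window.

    Assumes 1 GB = 1024 MB and a constant transfer rate.
    """
    if window_hours <= 0:
        raise ValueError("the backup window must be positive")
    return data_gb * 1024 / (window_hours * 3600)


def storage_needed_gb(full_gb, daily_change_ratio, fulls_retained,
                      incrementals_retained, reduction_ratio=1.0):
    """Rough capacity estimate for retained full and incremental backups.

    reduction_ratio is the assumed combined effect of compression and
    deduplication (2.0 means the stored data is half the original size).
    """
    if reduction_ratio <= 0:
        raise ValueError("the reduction ratio must be positive")
    incremental_gb = full_gb * daily_change_ratio
    raw_gb = fulls_retained * full_gb + incrementals_retained * incremental_gb
    return raw_gb / reduction_ratio


# Made-up example: 2000 GB of data, an 8-hour backup window, 5% daily change,
# 4 full backups and 24 incrementals retained, and a 2:1 reduction ratio.
throughput = required_throughput_mb_s(2000, 8)
capacity = storage_needed_gb(2000, 0.05, 4, 24, reduction_ratio=2.0)
```

With these numbers, the full backup needs an average of about 71 MB/s, and the retained backups occupy roughly 5200 GB. The same throughput calculation can be applied to restores to check whether a Recovery Time Objective is realistic.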
7. Conclusion
The need for a reliable backup system nowadays is quite pressing. In fact, with the multiple risks and threats we face, any single measure we take may not be enough. However, with proper backup policies, we can anticipate incidents and be prepared to respond quickly and mitigate them by recovering the affected systems. In this tutorial, we discussed one of the most valuable tools for improving the overall resilience of our systems. Backup is the last line of defense against major damage and catastrophic data loss.