Why can’t I use a snapshot instead of a backup?

In recent years, snapshots (also known as checkpoints) of virtual machines have gained popularity as an alternative to backups. Snapshots allow a system to be rolled back to a previous state without handling storage media or waiting for a restore to complete. Despite these advantages, checkpoints are not a true alternative to backups. In fact, snapshots have several major drawbacks that IT professionals should seriously consider. But first, let’s quickly clarify the difference between checkpoints and snapshots.

What’s the difference between a checkpoint and a snapshot?

The short answer is that, technically speaking, there is no difference: they are the same thing. The term snapshot became popular with the VMware vSphere platform, which is the virtualization platform that many IT professionals started with. Hyper-V also used the term for some time, but eventually adopted the alternative term checkpoint. In other words, the two terms refer to the same thing, but each applies to a different platform:

VMware = Snapshot
Hyper-V = Checkpoint

When should we use snapshots?

Checkpoints allow a system to be rolled back to a previous state almost instantly. The reason why a snapshot can do this is that, unlike a backup, a snapshot does not make a copy of your data.

That doesn’t mean checkpoints should be avoided altogether. They are still useful, and they are very effective at protecting a virtual machine prior to a configuration change. If a configuration change were to cause problems for a virtual machine, a checkpoint would offer an effective means of undoing the change.

Checkpoints can also be useful when performing software updates or upgrades. If, for example, an operating system upgrade were to leave a virtual machine in an unbootable state, a checkpoint would allow you to revert the virtual machine’s operating system to its pre-upgrade state. The same basic concept applies to application upgrades and to patch installations.

The anatomy of a snapshot

In order to understand why snapshots are not an alternative to backups, it is necessary to understand how they work. There are different types of checkpoints, but for our purposes, let’s discuss how they work in Microsoft’s Hyper-V.

The vast majority of Hyper-V virtual machines use one or more virtual hard disks. A virtual hard disk is simply a VHD or VHDX file that acts like a hard disk for a virtual machine. Like a physical hard disk, a virtual hard disk file can contain volumes, file systems, and of course, files. Under normal circumstances, a virtual hard disk file is readable and writable, meaning the virtual machine can write data to and read data from the virtual hard disk. Although this may seem obvious, let’s see why it is important.

When an administrator creates a checkpoint for a Hyper-V virtual machine, Hyper-V does not make a backup copy of the virtual hard disk file. Instead, it puts the virtual hard disk into a read-only state. Because the virtual hard disk is now read-only, Hyper-V creates a differencing disk that becomes a part of the virtual machine. This differencing disk is essentially a virtual hard disk file that has a parent/child relationship with the virtual machine’s original virtual hard disk file. A diagram of this configuration is shown below.

Since the virtual machine’s original virtual hard disk is now read-only, all write operations are directed to the differencing disk. This ensures the integrity of the contents of the original virtual hard disk file.
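The write redirection described above can be sketched with a toy model. This is a minimal illustration that follows this article’s description, not how VHDX files are actually laid out on disk; the names ToyDisk and create_checkpoint are hypothetical.

```python
class ToyDisk:
    """Toy block store (block number -> bytes). Not a real VHD/VHDX parser."""

    def __init__(self, parent=None):
        self.blocks = {}          # only the blocks written to THIS disk
        self.parent = parent      # parent disk in the parent/child chain
        self.read_only = False

    def write(self, block_no, data):
        if self.read_only:
            raise PermissionError("disk is read-only")
        self.blocks[block_no] = data

    def read(self, block_no):
        # Look in this disk first, then walk up to the parent(s).
        disk = self
        while disk is not None:
            if block_no in disk.blocks:
                return disk.blocks[block_no]
            disk = disk.parent
        return None


def create_checkpoint(active):
    """Freeze the active disk and return a new child that receives all writes."""
    active.read_only = True
    return ToyDisk(parent=active)


base = ToyDisk()
base.write(0, b"config v1")

active = create_checkpoint(base)   # no data is copied here
active.write(0, b"config v2")      # the write lands on the differencing disk

print(active.read(0))  # b'config v2'
print(base.read(0))    # b'config v1' (the original disk is untouched)
```

Note that create_checkpoint copies nothing, which is why creating a checkpoint is nearly instant and also why it is not a backup.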

Now, suppose that an administrator created a checkpoint for a Hyper-V virtual machine and then attempted to upgrade an application that was running on the VM. Let’s also assume that the application upgrade process has failed and that the virtual machine has been left in an undesirable state. The administrator can easily restore the previous state by applying the checkpoint.

When the administrator applies the checkpoint, Hyper-V deletes the differencing disk and resumes read/write operations on the original virtual hard disk (there are actually several checkpoint options, but this is the simplest use case). At this point, the virtual machine returns to normal operation.
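As a rough sketch of this simplest rollback case, Python’s collections.ChainMap happens to behave much like a parent/child disk chain: writes go to the newest layer and lookups fall through to older layers. The dictionary keys and variable names here are made up for illustration, and the model ignores everything Hyper-V does beyond the steps described above.

```python
from collections import ChainMap

# The original virtual hard disk, modeled as a plain dictionary.
original_vhd = {"app_version": "1.0", "settings": "stable"}
vm_disk = ChainMap(original_vhd)

# Create a checkpoint: a new, empty layer now receives all writes.
vm_disk = vm_disk.new_child()

# The application upgrade goes wrong; the damage lands in the new layer only.
vm_disk["app_version"] = "2.0-broken"
print(vm_disk["app_version"])       # 2.0-broken
print(original_vhd["app_version"])  # 1.0

# Apply the checkpoint: discard the differencing layer, resume on the original.
vm_disk = vm_disk.parents
print(vm_disk["app_version"])       # 1.0

# Writes reach the original disk again.
vm_disk["settings"] = "tuned"
print(original_vhd["settings"])     # tuned
```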

Why are checkpoints not a substitute for backup copies?

The main reason why checkpoints are not an effective replacement for backups is that the checkpointing process does not create a copy of the virtual hard disk. Therefore, checkpoints do not offer protection against physical disk failure or against damage to the virtual hard disk file. If a virtual hard disk were to be damaged or destroyed, the virtual machine’s checkpoints would be useless because they depend on the virtual hard disk file.

Another thing to bear in mind is that the differencing disks used by the checkpointing process are usually stored on the same physical volume as the virtual hard disk. Therefore, if the volume were to be damaged, there is a good chance that both the virtual hard disk and the differencing disks would be lost.
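To make the dependency concrete, here is a small sketch under the same simplified view of a differencing disk as "only the blocks that changed". The function materialize and the block contents are hypothetical.

```python
def materialize(diff_disk, parent_disk):
    """Rebuild the full disk contents from a differencing disk and its parent."""
    if parent_disk is None:
        raise RuntimeError("differencing disk is unusable without its parent")
    full = dict(parent_disk)   # start from the parent contents
    full.update(diff_disk)     # overlay the changed blocks
    return full


base_disk = {0: "boot", 1: "os", 2: "data v1"}
backup = dict(base_disk)       # a backup is an independent, full copy
diff_disk = {2: "data v2"}     # a checkpoint child holds changed blocks only

# Simulate losing the original virtual hard disk.
base_disk = None

try:
    materialize(diff_disk, base_disk)
    checkpoint_usable = True
except RuntimeError:
    checkpoint_usable = False

print(checkpoint_usable)  # False
print(backup)             # {0: 'boot', 1: 'os', 2: 'data v1'}
```

The backup restores an older but complete disk, while the differencing disk on its own restores nothing.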

Limited recoverability

Another important reason why checkpoints are a poor substitute for backups is that they do not allow for single item recovery. A checkpoint can be used to roll back an entire virtual machine to an earlier state, but it cannot be used to roll back an individual file or application on that virtual machine.

This brings up another important point. Checkpoints can cause significant problems for application servers. Early versions of Hyper-V were known for causing data corruption issues when checkpoints were applied to application servers. Microsoft eventually solved the issue when it introduced production checkpoints. Still, production checkpoints do not resolve all of the potential consistency issues that checkpoints can cause.

Many application servers have dependencies on other servers. An application might, for example, be linked to a SQL Server, a Web interface or an LDAP server. If a checkpoint were to be used to roll back an application server, there is a chance that a consistency problem would occur because the servers it depends on were not rolled back with it. Of course, the chances of this happening vary depending on the application and the role that the virtual machine is performing, but application consistency should always be a consideration when using checkpoints.
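A deliberately simplified sketch of this kind of mismatch follows. The two "servers" are plain dictionaries and the schema-version fields are invented for illustration; real applications fail in more varied ways.

```python
# Two dependent machines, modeled as dictionaries (hypothetical fields).
app_server = {"expects_schema": 1}
database = {"schema": 1}

# Checkpoint the application server only.
app_checkpoint = dict(app_server)

# An upgrade changes both machines together.
app_server["expects_schema"] = 2
database["schema"] = 2

# Roll back the application server alone; the database stays upgraded.
app_server = dict(app_checkpoint)

consistent = app_server["expects_schema"] == database["schema"]
print(consistent)  # False
```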

Checkpoints can hurt virtual machine performance

One of the main reasons why checkpoints should be used with caution is that they can degrade the performance of a virtual machine. The impact may be barely noticeable at first, but as additional checkpoints are created, the virtual machine’s performance tends to drop off significantly.

The reason why checkpoints can affect performance has to do with the way they work. As previously noted, checkpoints redirect write operations to a differencing disk. So with that in mind, consider what happens when a virtual machine performs a read operation.

The VM attempts to read data from the differencing disk (remember, the differencing disk contains the most recent data). If the VM does not find what it is looking for on the differencing disk, then it attempts to read the data from the original virtual hard disk.

Now, suppose that an administrator creates an additional checkpoint for a virtual machine. Hyper-V will treat the existing differencing disk as a read-only disk and create a new differencing disk. From then on, all write operations will be directed to this new differencing disk. The following diagram shows this action:

Now, when the virtual machine attempts a read operation, it will first look for the data on the most recently created differencing disk. If that differencing disk does not contain the requested data, then the virtual machine will look to the previously created differencing disk. If it doesn’t find the data there, then Hyper-V will attempt to read the data from the original virtual hard disk. Therefore, each checkpoint that is created has the potential to further degrade the virtual machine’s read performance: the longer the chain of checkpoints, the more disks a single read may have to consult.
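The lookup order described above can be sketched as a walk along the chain, newest disk first. This is a toy model of the article’s description rather than Hyper-V’s actual implementation; read_block and the block contents are hypothetical.

```python
def read_block(chain, block_no):
    """Read a block from a disk chain ordered newest first.

    Returns (data, disks_checked) so the cost of the lookup is visible.
    """
    for checked, disk in enumerate(chain, start=1):
        if block_no in disk:
            return disk[block_no], checked
    return None, len(chain)


base = {0: "old block", 1: "untouched block"}
diff1 = {0: "changed after checkpoint 1"}   # created by the first checkpoint
diff2 = {}                                  # created by the second checkpoint

chain = [diff2, diff1, base]

print(read_block(chain, 0))  # ('changed after checkpoint 1', 2)
print(read_block(chain, 1))  # ('untouched block', 3)
```

Data that has not changed since before the first checkpoint is the most expensive to read in this model, because every differencing disk in the chain is consulted before the original disk is reached.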

All said, checkpoints (or snapshots) are excellent at the job they were designed for, which is not backup. Even so, they should be managed and maintained as carefully as backups are.
