Tuesday, July 29, 2008

Full backups of virtual machines and Windows VSS

Introduction
One of the new features appearing in backup products that back up an entire virtual machine, as opposed to using an agent inside the guest operating system, is the ability to cooperate with Windows VSS (Volume Shadow Copy Service) inside the guest. For example, the recently released VMware Consolidated Backup 1.5 now supports VSS quiescing for Windows 2003, Windows Vista and Windows 2008; Vizioncore's vRanger Pro backup utility has supported VSS for Windows 2003 for several versions already.

Opinions differ on whether this is in fact a useful feature; for example, not so long ago the developers of esXpress explained that they were not including VSS quiescing in their product at that time because, in their opinion, it adds complexity without offering any significant benefit (see here). This discussion is still alive, as you can see for example here, and the big question is indeed: can you rely on live backups of database virtual machines?

The early days of VSS
The root of the discussion lies in the intended use of VSS: on a physical machine that is running a database application such as SQL Server, Exchange or even Active Directory or a DHCP server for that matter, you cannot directly read the database files since they are exclusively locked by the database application. This used to be particularly troublesome because the only way to get a backup of the data inside such a database was to use some sort of export function that had to be programmed into the database application (think of the T-SQL BACKUP command or a brick-level backup of an Exchange server).

Microsoft tackled this problem by introducing VSS, which presents a fully readable point-in-time snapshot of a filesystem to the (backup) application that initiates the snapshot. That way, a backup application can read the contents of the database files and put them away safely in case they are ever needed.

However, there are two problems when reading files from a filesystem that is "frozen" in time:
  • a file can be in the process of being written (e.g. only 400 bytes of a 512-byte block are filled with actual data).
  • data can still be in a filesystem cache or buffer in memory and not yet written to the disk (or to the filesystem journal).
On top of the filesystem issues, there are two problems when reading a database that is still in use but "frozen" purely at a filesystem level:
  • at the time of the snapshot, a transaction could still be in progress. This can be an issue when the transaction is not supposed to be committed to the database at the end: as you know, a database query can initiate thousands of changes and perform a ROLLBACK at the end to reset any changes made since the start of the transaction.

    A good (fictitious) example here is when you try to withdraw 1000 euros in cash from an ATM: if you change your mind right before pressing the "confirm transaction" button on the ATM screen, then you don't want your 1000 euros to be really gone because a database snapshot was taken at that very moment and your final "ROLLBACK" command is not included in it!

  • some data could still be in memory and not written to a logfile or a database file (so-called "dirty pages").
Crash consistency versus transactional consistency
If you don't take these four problems into account, then restoring a snapshot of such a filesystem is in fact the same as bringing the server back up after you suddenly pulled the power plug. Such a snapshot is said to be in a crash-consistent state, i.e. the same state as after a sudden power loss.

Modern filesystems have built-in mechanisms (so-called "journalling") to tackle these problems and to ensure that when such a "frozen" filesystem is restored from a backup, the open files are put back in as consistent a state as possible. Obviously, any data that only existed in memory and was never written to a filesystem journal or to disk is lost. Databases rely on transaction logging to recover from a crash-consistent state back to a consistent database; this is typically done by rolling back all unfinished transactions, i.e. all transactions for which neither a commit nor a rollback was recorded at the time of the crash.
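
To make the recovery idea concrete, here is a minimal sketch in Python of how a transaction log that was cut off at an arbitrary point can be turned back into a consistent state. It is a toy model, not how SQL Server or Exchange implement recovery: the `LogRecord` type, the `recover` function and the "balance" key are all made up for illustration, and the log is simply a list in memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    txn: int         # transaction id (hypothetical, for illustration only)
    op: str          # "begin", "set", "commit" or "rollback"
    key: str = ""    # only used by "set" records
    value: int = 0   # only used by "set" records


def recover(log):
    """Rebuild the state from a log that may end in the middle of a transaction.

    Only changes belonging to transactions with a recorded "commit" are applied;
    everything else is treated as rolled back.
    """
    committed = {record.txn for record in log if record.op == "commit"}
    state = {}
    for record in log:
        if record.op == "set" and record.txn in committed:
            state[record.key] = record.value
    return state


# The ATM example: transaction 2 lowers the balance, but the snapshot is taken
# before the customer either confirms (commit) or cancels (rollback).
log = [
    LogRecord(1, "begin"),
    LogRecord(1, "set", "balance", 5000),
    LogRecord(1, "commit"),
    LogRecord(2, "begin"),
    LogRecord(2, "set", "balance", 4000),
    # <-- crash-consistent snapshot taken here
]

print(recover(log))  # {'balance': 5000}
```

The unfinished withdrawal is ignored after the restore, which is exactly why a crash-consistent copy of a database is usually recoverable, at the cost of losing whatever was still in flight.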

Windows VSS wants to go beyond a crash-consistent snapshot and solves both the filesystem and the database problem by not only freezing all I/O to the filesystem but also asking both the filesystem and all applications to flush their dirty data to disk. This allows the creation of a backup that is both filesystem-consistent and application-consistent. VSS has built-in support for several Windows-native technologies such as NTFS filesystems, Active Directory databases, DNS databases, ... to flush their data to disk before the snapshot is presented to the backup application requesting the snapshot. Other programs, such as SQL/Oracle databases or Exchange mail servers, use "VSS Writer" plugins to get notified when a VSS snapshot is taken and when they have to flush their dirty database pages to disk to bring the database into a transactionally consistent state.
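
The difference a writer makes can be sketched with a small toy model in Python. This is not the real VSS API: `ToyWriter`, `take_snapshot` and the `on_freeze`/`on_thaw` method names are hypothetical and only loosely mirror the idea that a writer is notified before and after the snapshot.

```python
class ToyWriter:
    """A pretend application with data on disk and dirty data still in memory."""

    def __init__(self, name):
        self.name = name
        self.disk = {}     # what a snapshot of the volume would see
        self.dirty = {}    # changes that only exist in memory so far
        self.frozen = False

    def write(self, key, value):
        if self.frozen:
            raise RuntimeError("writes are held while the snapshot is created")
        self.dirty[key] = value

    def on_freeze(self):
        # Flush dirty data to disk, then stop accepting writes.
        self.disk.update(self.dirty)
        self.dirty.clear()
        self.frozen = True

    def on_thaw(self):
        self.frozen = False


def take_snapshot(writers, notify_writers):
    """Copy the on-disk state, optionally asking the writers to flush first."""
    if notify_writers:
        for writer in writers:
            writer.on_freeze()
    try:
        return {writer.name: dict(writer.disk) for writer in writers}
    finally:
        if notify_writers:
            for writer in writers:
                writer.on_thaw()


db = ToyWriter("sql")
db.disk["order-1"] = "shipped"
db.write("order-2", "paid")   # still a dirty page in memory

crash_consistent = take_snapshot([db], notify_writers=False)
app_consistent = take_snapshot([db], notify_writers=True)

print(crash_consistent)  # {'sql': {'order-1': 'shipped'}}
print(app_consistent)    # {'sql': {'order-1': 'shipped', 'order-2': 'paid'}}
```

Without the notification, the snapshot simply contains whatever happened to be on disk; with it, the application gets the chance to write out its in-memory state first.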

From TechNet:

[...] If an application has no writer, the shadow copy will still occur and all of the data, in whatever form it is in at the time of the copy, will be included in the shadow copy. This means that there might be inconsistent data that is now contained in the shadow copy. This data inconsistency is caused by incomplete writes, data buffered in the application that is not written, or open files that are in the middle of a write operation. Even though the file system flushes all buffers prior to creating a shadow copy, the data on the disk can only be guaranteed to be crash-consistent if the application has completed all transactions and has written all of the data to the disk. (Data on disk is “crash-consistent” if it is the same as it would be after a system failure or power outage.). [...] All files that were open will still exist, but are not guaranteed to be free of incomplete I/O operations or data corruption.

Under this design, the responsibility for data consistency has been shifted from the requestor application to the production application. The advantage of this approach is that application developers — those most knowledgeable about their applications — can ensure, through development of their own writers, the maximum effectiveness of the shadow copy creation process.

Conclusions for the physical world: the above makes clear that there is a huge benefit in using VSS when working on physical machines: VSS is a requirement to be able to back up the entire database files and to ensure that the database is not in an inconsistent state when you want to restore the database files and log files and attempt to mount them. The main advantage here is that a restored database does not have to go through a series of consistency checks that typically take many, many hours.

Going to the virtual world
In the virtual world, there are several different types of backups that can be performed:
  • Performing the backup inside the guest OS.
  • Performing a backup of the hard disk files (VHD/VMDK) when using a virtualization product that is hosted on another operating system, such as Microsoft Virtual Server or VMware Workstation/Server.
  • Performing a backup of the hard disk files (VHD/VMDK) when using a bare-metal hypervisor based product such as Microsoft Hyper-V or VMware's ESX/ESXi Server.
Obviously, when you perform the backup inside the guest OS, you still encounter the same problems as when attempting to back up a physical host: open files and database files are locked and thus cannot be backed up directly, so you have to resort to using VSS for the reasons discussed above.

But what about the other two ways of performing a virtual machine backup, when attempting to back up the entire hard disk file? For starters, it is important to realize that "file locking" now occurs at two levels:
  1. The VHD/VMDK hard disk files themselves are opened and locked by the virtualization software (be it the hypervisor for bare-metal virtualization or the executable when using hosted virtualization);
  2. Files can be opened and locked inside the guest operating system.
The first issue of the open VHD/VMDK hard disk files is solved depending on the virtualization product: if you are using host-based virtualization, you can obtain a readable VHD/VMDK file by using VSS on the host operating system and asking it to present an application-consistent variant of the VHD/VMDK files. If you are using a bare-metal hypervisor, a typical mechanism is to take a snapshot of the virtual machine (which, for example in VMware ESX, shifts the file lock from the VMDK file to the snapshot delta file, thus releasing the VMDK file for reading).
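
The snapshot mechanism on a bare-metal hypervisor can be pictured with the following Python sketch. It is a simplified toy model under my own assumptions, not VMware's on-disk format: `ToyVirtualDisk` and its methods are invented names, and a "disk" is just a dictionary of block numbers.

```python
class ToyVirtualDisk:
    """A base disk plus an optional delta that receives writes after a snapshot."""

    def __init__(self, blocks):
        self.base = dict(blocks)   # stands in for the VMDK file
        self.delta = None          # stands in for the snapshot delta file

    def create_snapshot(self):
        if self.delta is not None:
            raise RuntimeError("this toy model supports a single snapshot only")
        self.delta = {}

    def write(self, block_no, data):
        # After a snapshot, the running VM only writes to the delta.
        target = self.base if self.delta is None else self.delta
        target[block_no] = data

    def read(self, block_no):
        # The running VM sees the delta on top of the base disk.
        if self.delta is not None and block_no in self.delta:
            return self.delta[block_no]
        return self.base[block_no]

    def backup_base(self):
        # The base disk is stable (and readable) only while a snapshot exists.
        if self.delta is None:
            raise RuntimeError("take a snapshot first; the base disk is in use")
        return dict(self.base)

    def remove_snapshot(self):
        if self.delta is None:
            raise RuntimeError("no snapshot to remove")
        self.base.update(self.delta)   # merge the delta back into the base disk
        self.delta = None


disk = ToyVirtualDisk({0: "boot", 1: "data-v1"})
disk.create_snapshot()
disk.write(1, "data-v2")        # the VM keeps running during the backup
backup = disk.backup_base()
disk.remove_snapshot()

print(backup)        # {0: 'boot', 1: 'data-v1'}
print(disk.read(1))  # data-v2
```

The backup reflects the moment the snapshot was created, while the virtual machine keeps working against the delta and loses nothing when the snapshot is removed afterwards.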

Open files inside the guest OS
Ironically, the solution to the first problem of open VHD/VMDK host files introduces the second problem of open files inside the guest OS: once you have your snapshot of the VHD/VMDK files (be it through VSS for host-based virtualization or a VM snapshot for bare-metal hypervisors)... that snapshot is only in a crash-consistent state! After all, it is a point-in-time "freeze" of the entire hard disk, and restoring such an image file would be equivalent to restarting the server after a total power loss occurred.

VMware attempted to tackle this problem by introducing a "filesystem sync driver" in their VMware Tools (which you are supposed to install in every virtual machine running on a VMware product). This filesystem sync driver mimics VSS in the sense that it requests that the filesystem flush its buffers to disk, guaranteeing that the snapshot -- and thus the corresponding full virtual machine backup -- is in a filesystem-consistent state. Obviously, this does not solve the problem for databases, which tend to react quite violently to these kinds of non-VSS "freezes" of the filesystem. Typical horror stories can be read here (AD) and here (Exchange).

So what are the real solutions for this problem? I can think of two at this moment:
  1. After taking a snapshot, back up not only the disks but also the memory. Then, when restoring the backup, do not "power on" the virtual machine but instead "resume" it. At first, the machine will probably be "shocked" to see that the time has leapt forward and that many TCP/IP connections are suddenly being dropped, but the database server you are running should be able to handle this and properly commit any unsaved data from memory to disk.

  2. Trigger a VSS operation inside the guest OS to commit all changes to disk and ensure filesystem-level and application-level consistency, and only then take the full virtual machine snapshot.
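
The order of operations in the second solution matters, so here is a rough Python sketch of it. The function `backup_vm` and its five callback parameters are hypothetical placeholders for whatever your backup product actually does; the only point is the sequencing and the fact that the guest is released again even when the snapshot fails.

```python
def backup_vm(quiesce_guest, take_vm_snapshot, thaw_guest, copy_disks, remove_vm_snapshot):
    """Run a quiesced full VM backup; every argument is a caller-supplied callable."""
    quiesce_guest()                 # guest flushes filesystem and applications
    try:
        take_vm_snapshot()          # the point-in-time image is fixed here
    finally:
        thaw_guest()                # never leave the guest frozen
    try:
        copy_disks()                # the slow part runs while the guest works again
    finally:
        remove_vm_snapshot()        # always clean up the snapshot


steps = []


def record(name):
    # Helper for the demonstration: returns a callable that logs its own name.
    return lambda: steps.append(name)


backup_vm(
    record("quiesce guest"),
    record("take VM snapshot"),
    record("thaw guest"),
    record("copy disks"),
    record("remove VM snapshot"),
)

print(steps)
# ['quiesce guest', 'take VM snapshot', 'thaw guest', 'copy disks', 'remove VM snapshot']
```

Note that the guest only has to stay quiesced for the short moment in which the snapshot is created, not for the duration of the actual copy.
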
The VSS interaction with the guest operating system was first introduced by Vizioncore in their vRanger Pro 3.2.0 -- which required the installation of an additional service and .NET 2.0 inside the guest VM, and was only officially supported for 32-bit Windows 2003 SP1+. With the release of VMware Consolidated Backup 1.5, VMware announced that the default quiescing of disks on ESX 3.5 Update 2 would now be done using the new VSS driver -- supported on Windows 2003/2008/Vista in both 32-bit and 64-bit variants. Hurray! Problem solved, right?

So VSS seems nice, but is it necessary?
Obviously, your gut feeling will tell you that using VSS when taking a snapshot and a backup is "nicer" and "more gentle" to the guest virtual machine. The arguments on the difference between crash consistency, filesystem consistency and application-level consistency (which translates to transactional consistency for databases) give solid grounds to this gut feeling.

Personally, I cannot find an argument that states that VSS is also really necessary to create a full virtual machine backup. In the physical world, filesystems and databases have been hardened to recover from a sudden power loss, which is exactly the crash-consistent state that you obtain when taking a snapshot of a running virtual machine to back up and restore. Hands-on experience with this robustness can be read on several informal channels such as forum posts here.

However, if you want to be sure that your database is in a consistent state (for a faster recovery) and want certainty that those few seconds of data that were not yet committed from memory to disk are in fact included in your snapshot, then VSS is what you need. The next question to answer is: what is the risk of VSS messing up, and is this probability larger than that of not being able to restore a non-VSS-based snapshot?

Conclusion
Performing live backups of virtual machines seems like an interesting and simple feature of virtualisation at first. However, at second glance, there are some important decisions to be made regarding the use of VSS/snapshotting technology that can impact your restore strategy and success. Even without any quiescing mechanism, the operating system should be able to handle the crash-consistent backups that are taken by performing live machine backups, which should therefore be sufficiently reliable. With the ready availability of VSS in the new VMware Tools that come with ESX 3.5 Update 2, much more than crash-consistent backups can be guaranteed without the need to install additional agents. The increased reliability and faster restore time (no filesystem/database consistency checks) that come with VSS-quiesced snapshots now make full virtual machine backups a fully mature solution, without the need to worry about possibly inconsistent backups.

Side remarks
Some additional remarks regarding full virtual machine backup:
  • Full VM backups can be an addition to guest-based file level backups, but they can never be a complete replacement:

    • you might take a full VM-based snapshot of your Exchange or SQL database every day, but a file-based/brick-level backup (which is far more convenient to use for your typical single file/single mailbox restore operations) might be taken several times a day, depending on the SLA that your IT department has with the rest of the company.

    • a full VM backup is a good place to start a full server recovery. It is a bad place to start a single-file or a single-mailbox restore.

    • full VM backups using VSS do not allow the backup of SQL transaction logs (see "what is not supported" in the SQL VSS Writer overview), nor do they commit transaction logs to the database in order to clear up the transaction logs (an absolute necessity for Exchange databases or for several types of SQL databases).

  • Microsoft does not support any form of snapshotting technology on domain controllers. For more information, see MSKB 888794 on "Considerations when hosting Active Directory domain controller in virtual hosting environments".
Edit (12 Aug 2008): Veeam has released a very interesting whitepaper that discusses not only the necessity for VSS awareness during the backup process, but also during the restore process. They give the example of a domain controller that suffers a USN rollback when it is backed up using VSS but not restored using VSS-aware software. Another nice example is Exchange 2003, which requires VSS-aware restore software in order to be supported by Microsoft.


Postscriptum: I started writing this article a few days before VCB 1.5 was released, and the original point I was trying to make at that time was that there were too many disadvantages to the available VSS implementations (yet another service to install, .NET 2.0, very limited OS support) to really profit from the benefits that VSS could offer. Of course, in the meantime, VMware has taken away most of those objections by including VSS support in their VMware Tools for a wide range of server operating systems. This forced me to reconsider my view on whether VSS would be a good idea or not.

9 comments:

Toni Verbeiren said...

Great article (wouldn't call it a blog post anymore!). Thanks!

Anton Gostev said...

Good article indeed, and I am glad you liked my whitepaper. :)

I also have a blog post with videos illustrating the issue described in the whitepaper, here's the link in case you are interested http://www.veeammeup.com/2008/08/vss-and-vmware-esx-what-your-vmware.html

Louw said...

This article is a MUST read for all backup administrators and managers for a better understanding of backups and recovery of today's complex systems, and it gives one a look into what future developments should be focused on.

Sven Huisman said...

Good job on writing this article. Keep up the good work!

SelfSRR said...

Hey, this article should be in some textbook. I had a lot of problems understanding the different levels of consistency based on various articles on the web. This article gives a very good and necessary overview of the basics.

keep it up.

bronkupper said...

Hello,

Reading your excellent article I was left wondering about your thoughts on backing up the memory of the VM as well as the virtual disk.

On the surface it seems like a great tool to quickly achieve full state backup that is independent from internal guest implementations (VSS Writers etc...) and perhaps even less risky in those terms.

I think it should be fairly easy to do, as I can't think of many technical difficulties greater than those solved by "Live-Migration" and the like.

What are your thoughts about that?

Are there any pitfalls in using a method like that on databases/AD/etc?

Anonymous said...

absolut ultimate, thx for your job.

Carl Dipane said...

Having worked in the backup, recovery and DR area for 'too' many years, I can say I've never seen an article that's brought ALL the concepts, theory and practice of application consistency into one article.

I support one of the other posters: this should be a published whitepaper!


Carl Dipane CD-DataHouse

pvl said...

I know this is an old post - but it does not seem to be an issue/question that has gone away. I'm also thinking that your article is missing just a bit of information.

When vmtools began to include VSS integration, I think (according to what even you said) that VSS only takes care to properly flush/quiesce the file system and dns/dhcp and possibly AD databases.

However, we need a VSS writer, in order to properly protect (with full transactional/application consistency) MS SQL and Exchange.

Correct?

So my question for today, is:

Does the current implementation of vmtools/vmware replication/srm include VSS writer functionality so that SQL and Exchange dbs can be flushed/quiesced and possibly have logs truncated?

I am leery of telling customers that "oh, yea - don't worry, the dbs will be properly protected" if this is not the case?