Introduction
This is a general overview paper, part of a series of articles on backing up computers. The top page is Design for an Archiving Backup System.
As I see it, most backup approaches employed in the home or small business fall into one of the following areas:
For the home user, who may only have a single machine, the first two approaches appear to be moderately acceptable and cost effective using a CD-R or DVD-R type drive, but the third approach really needs a tape drive and may prove to be rather expensive. Once you need to back up more than one or two machines on a regular basis, the CD-R or DVD-R type drives become too cumbersome to use (especially the CD-R with its smaller capacity), so larger capacity devices (like tape drives or removable disks) become attractive.
Much the same can be said for the small business environment, except now the financial costs of performing the backups on a regular basis need to be weighed against the cost of lost data (for example due to a disk failure, accidental deletion, fire, theft or flood, hurricane, tornado, earthquake, meteor strike, war).
For the home user the financial costs of lost data are hard to evaluate; some even look on a disk failure as an opportunity to upgrade a system and get a clean start. However, with the advent of widespread digital photography the need to reliably back up family photos is rising, and the difficulty of doing a good job of this is also rising because of the volume and size of the photos that are taken.
See also my arcvback backup program, where the download and manual are available.
Requirements
There are various requirements for backup software. Not all forms of backup will meet these, but understanding what the common requirements are will help you in the selection of the type of backup software you need.
Bare Metal Restore
This is often the toughest problem for backup software: when the main disk drive fails in a computer and you replace it with a new (unformatted) drive, how do you take your collection of backup media and use it to rebuild the machine you once had?
The disk-image form of backup software excels at this. The traditional file-based backup software makes this (typically) difficult to do (some commercial systems have a "disaster recovery" option to better address this issue). The problem with the traditional approach is that before you can run the restore job you need to install (at least) an operating system, and that can take some time.
Another approach is to be able to run the restore software from a standalone (bootable) CDROM allowing one to install a bare drive, boot from the CDROM and then format the drive as desired and restore the files to it. Backup software that can restore from a USB or network attached drive can make a restore proceed much faster.
User Data Restore
Restoring lost user data (in the case of a failed non-system drive, or a user or application error) is the problem that traditional backup software handles well. It is typically limited in the depth of time over which recovery is possible.
Old Version Recovery
In certain cases it may be necessary to recover very early versions of files or documents. Often one avoids this by periodically copying and renaming a significant document, so that some of the earlier document versions are still available; however, as this is a manual process it will generally fail at some time. This sort of recovery capability is of more importance in the corporate environment than the home environment.
Typically, providing the necessary depth of backup coverage is quite expensive when using traditional backup software, as this inflates the media requirements dramatically.
Backup Size and Time
The volume of data that must be backed up can have a dramatic impact on the design of a backup plan. It directly affects cost by dictating the required capacity of the storage media and the type of hardware needed to support it. It may cause compromises to be made in the frequency and type of backup that is performed. It may impose restrictions on the users of equipment being backed up (to ensure that they are not preventing files from being backed up). It may cause heavy network traffic and may necessitate network reorganization or upgrades to provide the needed capacity.
Robustness
Backups need to be robust; ideally they also need to be tolerant of media failure. For example, if a single backup spans several media, the loss of any one piece should not prevent the recovery of the data on the remaining pieces. However, if the storage costs permit, it is a very good idea to have a backup system that employs multiple redundant copies of the media. That way, if a piece of media is lost or damaged, a copy of its data still exists on another piece of media.
Since all physical media has a finite lifespan it would also be a good idea to allow a piece of media to be replaced by newer media at a later date.
Utilities to verify the integrity of data on old media may also be useful.
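One way such a verification utility might work is to record a checksum for every file when the media is written, and to recheck those checksums later. The following is only a minimal sketch in Python, assuming SHA-256 digests and a manifest held as a plain dictionary; the function names are illustrative and are not taken from any particular backup program.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root, manifest):
    """Return the relative paths that are missing or whose contents changed."""
    root = Path(root)
    bad = []
    for rel, digest in manifest.items():
        p = root / rel
        if not p.is_file() or file_digest(p) != digest:
            bad.append(rel)
    return bad
```

The manifest would be built when the media is written and kept somewhere other than on that media; running verify against the mounted media later lists any files that can no longer be trusted.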
Non-proprietary backup file formats may also be useful, especially when trying to extract some data from a damaged backup file.
Fire, Theft and Flood
These sorts of risks are usually addressed by arranging to have a copy of the backup media placed in storage at one or more remote sites. This sort of risk is often overlooked in the home environment, but now that unique data sets (such as the family photographs) are becoming commonplace this issue should be considered. Consider the case of film director Francis Ford Coppola, who had his computer and backup device stolen in September 2007.
Ease of Use
For a backup system to be effective it must be used on a regular basis. Ask any owner of a Palm Pilot if he has ever got that "sinking feeling" that he really should have done a hot sync... There are some issues here:
Existing Solutions
This section provides background information on the various types of existing backup solutions.
Traditional
Traditional backup is what I am calling "file based backup using a rotation of media". In it a program runs that periodically visits all the files on the drives (partitions or subdirectories) that have been identified as needing backup. At each file it checks to see if something has changed (by looking at the archive bit or the last modification time stamp) since the last time the file was backed up. If the file has been modified then it saves a new copy of the file. This process is typically done in two phases: a full backup (which backs up all files, and thus takes a lot of time and storage), run once a week, and a daily backup, which only backs up the files that have changed since either the last daily backup (incremental mode) or the last full backup (differential mode). The ability to go back in time and recover a particular earlier version of a file is determined by the number of full backups that are done and how often the incremental or differential backups are performed.
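The difference between the three modes comes down to which cutoff time a file's modification time is compared against. The sketch below is an illustration only, assuming the last modification time stamp is used (not the archive bit); the function name and its parameters are hypothetical and not taken from any real backup product.

```python
import os


def files_to_back_up(root, mode, last_full, last_backup):
    """Select the files under root that a backup run would copy.

    mode        -- "full", "incremental" or "differential"
    last_full   -- time (seconds since the epoch) of the last full backup
    last_backup -- time of the most recent backup of any kind
    """
    if mode == "full":
        cutoff = None            # copy everything
    elif mode == "incremental":
        cutoff = last_backup     # changed since the last run of any kind
    elif mode == "differential":
        cutoff = last_full       # changed since the last full backup
    else:
        raise ValueError(f"unknown mode: {mode}")

    selected = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if cutoff is None or os.path.getmtime(path) > cutoff:
                selected.append(path)
    return sorted(selected)
```

A real program would also need to cope with files that are open or that change while the scan is running, which this sketch ignores.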
In a traditional rotation one might have three sets of weekly full backup tapes and about 6 days' worth of incremental tapes. This would allow you to recover a different end-of-day version of a file for the last week, but only one version per week for the preceding two weeks. This also provides a degree of redundancy: in the (somewhat likely) event that something goes wrong with the weekly backup, there are still two other (somewhat recent) versions of the weekly backup that could be used to salvage most data. One might add another level to this by adding a monthly and/or yearly full backup rotation (perhaps for off site storage).
The traditional backup recognizes the fact that most files actually remain unchanged for long periods of time, so the amount of data that needs to be backed up on a daily basis is much smaller, and it exploits this through the incremental or differential modes. Since the incremental mode only backs up changes made since the last incremental pass (or full pass), each run stays relatively small over long periods of time; but because a restore job may have to access many (or all) of the incremental tapes made since the last full backup, user inconvenience dictates how many incremental backups can be tolerated. This inconvenience is why the differential backup scheme was invented, though it is less efficient, having to back up more and more data as the time since the full backup grows.
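The restore-time trade-off between the two modes can be made concrete by working out which backups a restore of the latest state has to read. This is a rough sketch under a simplified model (a chronological list of backup kinds, with at least one full backup present); the function name is made up for illustration.

```python
def media_needed_for_restore(history):
    """Return the indexes of the backups needed to restore the latest state.

    history is a chronological list whose entries are "full",
    "incremental" or "differential". It must contain at least one "full".
    """
    # Everything is rebuilt starting from the most recent full backup.
    last_full = max(i for i, kind in enumerate(history) if kind == "full")
    needed = [last_full]
    start = last_full + 1

    # A differential holds every change since the full backup, so only the
    # newest one is required, and it supersedes earlier incrementals.
    diffs = [i for i in range(start, len(history)) if history[i] == "differential"]
    if diffs:
        needed.append(diffs[-1])
        start = diffs[-1] + 1

    # Every incremental made after that point is required, in order.
    needed.extend(
        i for i in range(start, len(history)) if history[i] == "incremental"
    )
    return needed
```

With one full backup followed by six incrementals the restore touches all seven backups, whereas with six differentials it touches only two: the full backup and the newest differential.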
The user convenience factor also comes into play in how effectively the media is utilized. Typically the same size of media is used for both full and incremental backups. This means that there might be enough space on one tape to store several daily backup runs; however, because of the difficulty of locating a particular set on a tape, the software will often just waste the remainder of the tape and have the user use a different tape for each day of the week.
The traditional backup also wastes a lot of time during the full backups (again to provide user convenience). This is because most of the files that are placed on the second week's full backup set were already on the previous week's full backup. Because of cost one cannot keep a weekly set of tapes around forever, so this periodic recopying of all the data to allow media to be reused seems like a reasonable compromise. However, if you are manually changing tapes (and using a slower, more cost effective, tape technology), then once a weekly backup exceeds two or three tapes in size it starts to get rather inconvenient and requires a long time to execute.
Drive Imaging
Drive imaging backup is the block for block copying and restoring of a hard drive's data. Because this happens at the block level (below the disk formats imposed by the operating system), this is usually a snapshot technique that must be done when the machine is running in some special standalone mode (for example booted into another operating system loaded from CDROM or floppy). Recently there have been a number of advances in these tools: allowing the image to be taken while the machine is still running its regular operating system (which sounds rather risky); storing only the blocks on the disk that the file system is actually using (which saves a lot of backup space if the disk is only partially full); and allowing the backup image to be browsed and individual files within it to be restored. There is also the possibility of providing an incremental approach to drive imaging, whereby only the blocks that have changed since the last image was taken need to be saved.
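The incremental approach to drive imaging can be illustrated by comparing per-block digests against those recorded for the previous image. This is only a sketch: it assumes a fixed 4096-byte block size and an image held in memory as bytes, and it is not meant to describe how any particular imaging product tracks changed blocks.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def block_digests(image, block_size=BLOCK_SIZE):
    """Return one digest per fixed-size block of a bytes-like disk image."""
    return [
        hashlib.sha256(image[off:off + block_size]).digest()
        for off in range(0, len(image), block_size)
    ]


def changed_blocks(previous_digests, image, block_size=BLOCK_SIZE):
    """Return {block_index: block_bytes} for blocks that differ from the
    previous image, including blocks beyond its end."""
    changed = {}
    for index, digest in enumerate(block_digests(image, block_size)):
        if index >= len(previous_digests) or previous_digests[index] != digest:
            off = index * block_size
            changed[index] = bytes(image[off:off + block_size])
    return changed
```

Only the blocks returned by changed_blocks would need to be written to the backup media, together with the new digest list for use by the next run.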
This form of backup is best used to protect the operating system drive or partition of a machine to allow it to be placed back into service quickly and cheaply without having to go through the tedium of reinstalling the operating system and all of the applications. Combining this with traditional backup of the user data areas would seem like the best all round approach.
Design your disk partitions to support the imaging backup system (by reducing the amount of data that is kept on the operating system partition). If you keep the C: partition for just the operating system and installed applications, and have another partition for the user data files, then you can minimize the amount of data that the imaging backup needs to copy (restricting it to just the C: partition). Unfortunately Windows gets in the way of this by placing the "Documents and Settings" directory on the C: drive (not to mention always wanting to take up the full hard drive when it is installed).
Redundancy
Redundancy in the backup process is typically seen in the traditional approach, but only in a partial form, through the way that the full backup sets contain a lot of the same data. It can be added in a true form by performing multiple full backups in a row, or by duplicating the original set of backup media.
A backup approach that provides redundancy without imposing a lot of additional work or cost would be useful even in the home environment as this is what is needed to protect against perils such as fire, theft and flood.
The main problem with redundancy is the added media cost. It also increases the inconvenience factor due to the additional backups that are done and also the additional trips that need to be made for offsite storage.
Traditional backups often implement a pseudo-redundancy by having several media sets that are used in rotation. For example you might have a weekly full backup that has a 4-set rotation, meaning that on the 5th week you overwrite the backup that was done in the first week. This is not true redundancy, because if there is a file you need from a particular media set and it turns out that set is damaged, then if the file only existed in that particular week you won't be able to recover it. To get around this one might duplicate the media once it has been written, doubling the media count.
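The arithmetic of such a rotation is simple enough to write down. A small sketch, assuming weeks are numbered from 1 and media sets from 0; both function names are made up for illustration.

```python
def rotation_slot(week_number, sets=4):
    """Return which media set (0-based) is written in the given week (1-based)."""
    return (week_number - 1) % sets


def oldest_recoverable_week(current_week, sets=4):
    """Return the earliest week whose full backup has not yet been overwritten."""
    return max(1, current_week - sets + 1)
```

With a 4-set rotation, week 5 lands on the same media set as week 1, so after the week 5 backup has been written the oldest full backup still available is the one from week 2.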
Cache Drives
Cache drives can be used to decouple the backup operation from the act of recording the backup data to removable media. When this is done the backup writes to a large cache drive (which these days is pretty inexpensive) so it can run at full speed, and then later the data is saved from the cache to slow media such as tape. This also allows the tasks that need operator intervention to be done at a convenient time, which may mean that an expensive upgrade to a robotic tape changer or to higher capacity tapes can be delayed or avoided.
The presence of the data on a cache drive for some period of time adds a small element of risk to the system, but the failure of the cache drive alone will not lose any data; the original file would also have to fail (or be erased) at the same time. Placing the cache directory on a RAID protected drive would greatly reduce this already small risk, as would using backup software that writes two copies, to two different cache devices.
Cache drives also allow for the possibility of doing restores directly from the cache (especially if the cache is quite large), which can save a lot of time, and allow for the possibility of very convenient user-driven restores that don't need access to the backup media or devices.
Use of a cache drive may also improve the utilization of backup media, since one could delay the flushing of the cache until there is enough data to fill a piece of media. Of course there is some risk with this approach: a backup version could be lost if the cache drive fails before it is flushed. This risk may be tolerable, as the original data should still be on its drive (unless this is the same drive as the cache uses).
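The idea of delaying the flush until a piece of media can be filled might be sketched as follows. This is an illustration under simplifying assumptions: backup sets are written oldest first, are never split across media, and no single set is larger than the media. The function name and parameters are hypothetical.

```python
def plan_flush(pending_sizes, media_capacity):
    """Decide which cached backup sets to write to one piece of media.

    pending_sizes  -- sizes (in bytes) of cached backup sets, oldest first
    media_capacity -- usable capacity of one piece of media, in bytes

    Returns (sets_to_flush, sets_kept_in_cache).
    """
    if sum(pending_sizes) < media_capacity:
        return [], list(pending_sizes)   # not enough data yet, keep caching

    flush, used = [], 0
    for i, size in enumerate(pending_sizes):
        if used + size > media_capacity:
            # This set would overflow the media; it stays cached for next time.
            return flush, list(pending_sizes[i:])
        flush.append(size)
        used += size
    return flush, []
```

For example, with three cached sets of 4 units each and media that holds 10 units, the first two sets are written and the third waits in the cache for the next piece of media.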
Disk Based Backup
In recent years the falling costs of IDE hard drives have brought about the odd situation of disk storage being the same price or even less expensive than tape storage on a $/GB basis, especially in the high capacity ranges. It appears likely that disk prices will continue to drop, while tape prices will not move much in the future. As a result the temptation to use disks to replace tapes is going to grow.
Archival
An archival backup system is one which (in its truest form) never overwrites backups. This means that old versions of changed files and even files that were deleted and never replaced a long time ago can still be retrieved from storage. Archival storage seems to be ignored as being too expensive on media to implement, or too inconvenient to use on a wide scale.
However, it appears that with low cost media such as DVD-R we may have reached a point where it is cost effective and convenient to replace a tape based traditional backup system with a DVD-R (or RW) based archival system for certain sizes of systems. This will become more true (and applicable to a larger group of systems) in the future as the price of DVD media continues to drop and the capacity of this type of media rises with the advent of multi-layer and blue laser recording.
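The never-overwrite behaviour that defines an archival system can be shown with a toy in-memory store. This is a sketch only, assuming file contents are identified by their SHA-256 digest; the class and method names are invented for illustration and do not describe the design of any existing program.

```python
import hashlib


class ArchiveStore:
    """Toy in-memory archive that never overwrites or deletes stored data."""

    def __init__(self):
        self.blobs = {}      # digest -> file contents, written once
        self.versions = {}   # path -> list of digests, oldest first

    def save(self, path, data):
        """Record a new version of path; return True if a version was added."""
        digest = hashlib.sha256(data).hexdigest()
        history = self.versions.setdefault(path, [])
        if history and history[-1] == digest:
            return False                     # unchanged since the last backup
        self.blobs.setdefault(digest, data)  # never replaces an existing blob
        history.append(digest)
        return True

    def restore(self, path, version=-1):
        """Return the contents of a given version (default: the newest)."""
        return self.blobs[self.versions[path][version]]
```

Because nothing is ever removed, every earlier version stays retrievable, and a file deleted from the source drive remains available in the archive. Unchanged files add no new data, which is what keeps the media cost of this approach from growing as fast as repeated full backups would.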