ArcvBack config.ini File

Copyright 2009 by Stephen Vermeulen
Last updated: 2009 Oct 18


Introduction

This is part of a series of articles on backing up computers. The top page is Design for an Archiving Backup System.

This article describes the configuration files for the ArcvBack backup software that I have written.

Technically there are two files that you may need to change to configure ArcvBack for your system.

service.py

If you are running ArcvBack as a Windows Service application then you may need to change a few things in the service.py file. Near the top of this file you will see the following lines:

sys.path = ["c:\\programs\\arcvback4"] + sys.path
scheduleFile = "c:\\programs\\arcvback4\\schedule.dat"
configFile = "c:\\programs\\arcvback4\\config.ini"

If you have installed your copy of ArcvBack in some directory other than c:\programs\arcvback4 then you need to correct the above three lines. Note the double backslashes are necessary because Python treats the backslash as a special escape character.
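
As a small illustration of the escaping rule, the following Python 3 sketch shows that a doubled backslash and a raw string produce the same path, and that the two file paths can be derived from a single directory setting. The names install_dir, schedule_file and config_file are hypothetical and are not taken from service.py; ntpath is used only so that the example joins paths the Windows way on any platform.

```python
import ntpath  # Windows-style path handling, available on every platform

# A doubled backslash in a normal string is one backslash in the result.
escaped = "c:\\programs\\arcvback4"
# A raw string (r"...") leaves backslashes alone, so no doubling is needed.
raw = r"c:\programs\arcvback4"
same = (escaped == raw)

# Hypothetical: derive both file paths from one install directory.
install_dir = raw
schedule_file = ntpath.join(install_dir, "schedule.dat")
config_file = ntpath.join(install_dir, "config.ini")
```

With this approach only one line would need to change for a different install directory, but service.py as shipped uses the three separate lines shown above.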


config.ini

Most, if not all, of the configuration effort will be spent on the config.ini file. A sample configuration file is included in the distribution zip file; it contains many comments that explain what each setting does. Comments are introduced by the ";" character. The file is divided into functional blocks, and each block starts with a header such as:

[backup]

The contents of a block depend on its purpose: some blocks are simple lists, while others have some structure. The various blocks are described in the following subsections.

Section: [backup]

This section contains a list of all the directories that the ArcvBack software is to back up. Each entry in the list is numbered sequentially starting with "1". The program will stop adding directories to its list when it reaches the first missing number, so if you have lines like:

1=\\orion\files
2=\\flare\e\marrieta
3=\\flare\e\stephen
;4=\\mars\c$\Documents and Settings
5=\\flare\c\Documents and Settings
6=\\mars\c$\Programs\arcvback4

it will only see the first three directories, because the 4th is commented out and is therefore missing. Entries 5 and 6 are ignored as well. These numbers are used later in the [schedule] section to define when each directory is to be backed up.
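
The stop-at-the-first-gap rule can be sketched in a few lines of Python 3. This is only an illustration using the standard configparser module, not ArcvBack's actual parsing code; the helper name read_numbered_list is hypothetical.

```python
import configparser

SAMPLE = r"""
[backup]
1=\\orion\files
2=\\flare\e\marrieta
3=\\flare\e\stephen
;4=\\mars\c$\Documents and Settings
5=\\flare\c\Documents and Settings
6=\\mars\c$\Programs\arcvback4
"""

def read_numbered_list(parser, section):
    """Collect entries 1, 2, 3, ... and stop at the first missing number."""
    entries = []
    index = 1
    while parser.has_option(section, str(index)):
        entries.append(parser.get(section, str(index)))
        index += 1
    return entries

# interpolation=None keeps any "%" characters in paths from being interpreted.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(SAMPLE)
backup_dirs = read_numbered_list(parser, "backup")
```

Here backup_dirs ends up holding only the first three paths, since the commented-out entry 4 ends the list.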

Note the use of UNC paths to the directories is now optional (in arcvback 3 and earlier it was required). A UNC path starts with two backslashes and then the name of the machine, then a drive letter or share name and then a regular directory path on that drive.

It should be possible to run this on a Linux box, in which case you can use the forward slashes that UNIX normally uses.

Section: [wol]

This was added in version 4.2.

This section is optional. You only need to configure it if you have machines on the LAN that need to be backed up but which are often shut down or put into the sleep or hibernation states.

The WOL (for Wake On LAN) section identifies any machines that need to have a wake-up packet sent to them before being accessed. You need to give the machine name, its MAC address, and a nominal delay (in seconds) to wait between sending the packet and when arcvback.py or service.py tries to access the machine. An example of this would be:

[wol]
1.name=flare
1.mac=00-19-31-AC-36-A0
1.delay=60

2.name=mercury
2.mac=00-1A-D0-5A-DA-54
2.delay=60
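
For readers curious about what a wake-up packet looks like, here is a minimal Python 3 sketch of the standard Wake On LAN "magic packet": six 0xFF bytes followed by the MAC address repeated 16 times, sent as a UDP broadcast. It is not ArcvBack's own code; the function names, the broadcast address and the port number 9 are assumptions chosen for the example.

```python
import socket

def build_magic_packet(mac):
    """Return the 102-byte magic packet for a MAC like 00-19-31-AC-36-A0."""
    hex_digits = mac.replace("-", "").replace(":", "")
    if len(hex_digits) != 12:
        raise ValueError("expected a 6-byte MAC address, got %r" % mac)
    mac_bytes = bytes.fromhex(hex_digits)
    return b"\xff" * 6 + mac_bytes * 16

def send_magic_packet(mac, broadcast="255.255.255.255", port=9):
    """Broadcast the packet on the LAN (address and port are assumptions)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))

packet = build_magic_packet("00-19-31-AC-36-A0")
```

After sending the packet, a caller would wait for the configured delay (60 seconds in the example above) before trying to reach the machine.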

Section: [excludedir]

This section is optional. It is used to specify a list of directories that should be excluded from the backup. You may want to configure directories here if some of the top level directories in the [backup] section contain sub-directories that should not be backed up. On Windows systems the recycler directory is a good one to exclude.

The paths in the excludedir section can (as of version 4.4) include wildcards; for example, one can do something like this:

[excludedir]
1=\\vermeulen\c$\work\projects\*\*\*\Release
2=\\vermeulen\c$\work\projects\*\*\*\Debug

to exclude from the backup the "Release" and "Debug" subdirectories that Visual Studio C++ creates (the contents of these directories can be quite large and are entirely machine generated from the source files).
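
One way to think about such patterns is sketched below in Python 3. It assumes that each "*" stands for exactly one directory level and that matching ignores case, as Windows paths usually do; the article does not spell out ArcvBack's exact matching rules, so treat both points (and the helper name matches_exclude) as assumptions.

```python
from fnmatch import fnmatchcase

def matches_exclude(path, pattern):
    """Compare a path against a wildcard pattern one directory level at a time."""
    path_parts = path.lower().split("\\")
    pattern_parts = pattern.lower().split("\\")
    if len(path_parts) != len(pattern_parts):
        return False  # assumption: a "*" never spans more than one level
    return all(fnmatchcase(part, pat)
               for part, pat in zip(path_parts, pattern_parts))

pattern = r"\\vermeulen\c$\work\projects\*\*\*\Release"
deep = matches_exclude(r"\\vermeulen\c$\work\projects\client\app\gui\Release", pattern)
shallow = matches_exclude(r"\\vermeulen\c$\work\projects\client\Release", pattern)
```

Under these assumptions the first path is excluded, because it has three directory levels between projects and Release, and the second is not.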

Section: [excludefile]

This section is optional. It is used to specify a list of files that should be excluded from the backup. You may want to configure files here if some of the top level directories in the [backup] section contain files that should not be backed up. On Windows systems files such as pagefile.sys are good to exclude.

Section: [schedule]

This section is required only if you are running ArcvBack as a Windows Service. It tells ArcvBack what to back up and how often. Within the schedule block you can define one or more schedule events. As with the backup directories you give each event a number, but each event is composed of several sub-items. A typical event appears here:

1.name=evening
1.cycle=day
1.rate=1
1.time=18:00 22:00
1.dirs=1
1.megabytesperrun=40000
1.versionbackup=on

The first line defines the name of the event, in this case "evening". The names of all the events need to be unique, but the actual names are up to you. Each event runs on a particular cycle which, combined with the rate, specifies how often the event is run. The supported cycles are:
  • day: run once every "rate" days
  • mon, tue, wed, thu, fri, sat, sun: run once every "rate" weeks on the named day. If rate is set to 2 then run every two weeks on that day.
  • a numbered day (i.e. 1, 2, 3, ... 31): run the event on that day every "rate" months. For example, if you set cycle=15 and rate=4 then the event will be run on the 15th of every 4th month.
  • hour: run the event every "rate" hours
  • minute: run the event every "rate" minutes (mainly for testing)

The time item defines the window during the day in which the event is allowed to run. If it is not supplied then the event can run at any time of day. Times use the 24 hour clock in the format HH:MM (hours:minutes). Both the start and end time must be supplied, with a space between them. If the event is still running when the time window ends it will be stopped, and any remaining files that were not backed up will be backed up in a later pass of the event (or by another event that also backs up the same directories).
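
The time window check can be pictured with the short Python 3 sketch below. It is only an illustration: the function names are hypothetical, and it assumes that the window does not cross midnight (start earlier than end), which the description above does not address.

```python
from datetime import time

def parse_window(value):
    """Turn a string such as "18:00 22:00" into a (start, end) pair of times."""
    start_text, end_text = value.split()

    def to_time(text):
        hours, minutes = text.split(":")
        return time(int(hours), int(minutes))

    return to_time(start_text), to_time(end_text)

def in_window(now, window):
    """True if the event may run at 'now'; no window means any time is fine."""
    if window is None:
        return True
    start, end = window
    return start <= now <= end  # assumption: start is earlier than end

evening = parse_window("18:00 22:00")
allowed = in_window(time(19, 30), evening)
too_late = in_window(time(23, 0), evening)
```

A scheduler built this way would repeat the check while the event is running, and stop the event once in_window returns False.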

The dirs item is a space separated list of the directories (by their index number from the [backup] section) that should be backed up when this event is allowed to run. Note: it is allowable for multiple events to specify the same directories; this does not result in the directory being scanned multiple times.

The megabytesperrun item is optional (it defaults to 999000, i.e. 999 GB). The [database] section's megabytesperrun item overrides it if that value is more restrictive. Use this item if you want some of your backup jobs to run for only a short time.
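
Put another way, the limit that applies to a run is the smaller of the event's value and the global value. A minimal Python 3 sketch of that rule follows; the function and constant names are hypothetical.

```python
DEFAULT_EVENT_MEGABYTES = 999000  # documented default for a schedule event

def effective_megabytes(global_limit, event_limit=None):
    """Return the limit that applies: whichever value is more restrictive."""
    if event_limit is None:
        event_limit = DEFAULT_EVENT_MEGABYTES
    return min(event_limit, global_limit)

event_wins = effective_megabytes(100000, 40000)   # event is more restrictive
global_wins = effective_megabytes(20000, 40000)   # global is more restrictive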

The versionbackup item is optional (it defaults to "on"). It can be on or off, and controls whether a zip backup of the version database is made at the end of the event. For backups that need to have a short run time you may want to set this to off.

Section: [mediadirs]

This section identifies the directories where the various package files will be written. For each directory, the size item is the number of GB that must be left free on the drive the directory is on. At a minimum you need to define one directory, such as:

[mediadirs]
1.dir=\\mars\e$\backup\media
1.size=20

If you have a number of drives with free space you can distribute your backup cache across them by defining additional directories like:

2.dir=\\mars\f$\media
2.size=30
3.dir=\\mars\g$\cache
3.size=40
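
The free-space reserve can be illustrated with a small Python 3 sketch. This is not ArcvBack's own logic: the function names are hypothetical, and treating one GB as 1024**3 bytes is an assumption, since the article does not say which definition is used.

```python
import shutil

BYTES_PER_GB = 1024 ** 3  # assumption: binary gigabytes

def has_room(free_bytes, reserve_gb, package_bytes):
    """True if writing the package still leaves reserve_gb free on the drive."""
    return free_bytes - package_bytes >= reserve_gb * BYTES_PER_GB

def mediadir_has_room(directory, reserve_gb, package_bytes):
    """Same check, using the real free space of the drive holding 'directory'."""
    free_bytes = shutil.disk_usage(directory).free
    return has_room(free_bytes, reserve_gb, package_bytes)

fits = has_room(25 * BYTES_PER_GB, 20, 4 * BYTES_PER_GB)
does_not_fit = has_room(21 * BYTES_PER_GB, 20, 4 * BYTES_PER_GB)
```

For the first mediadir above, a call such as mediadir_has_room(r"\\mars\e$\backup\media", 20, package_bytes) would express the same check against the real drive.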

Section: [mediagroups]

Mediagroups arrange the mediadirs into groups; within each group the mediadirs are logically joined. During a backup the same data is written in parallel to all groups, so if you define two or more groups the data will be stored redundantly, in a sort of poor man's RAID-1 arrangement.

The following sets up two groups for redundancy:

[mediagroups]
1.dirs=1
2.dirs=2

If you don't need the redundancy then you can join a few mediadirs into a single storage unit. For example the following joins three mediadirs into a single group of space:

[mediagroups]
1.dirs=1 2 3

The simplest arrangement is the following, where there is just a single mediadir "1" placed into group 1 (this will probably be the typical arrangement):

[mediagroups]
1.dirs=1
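
To show how the numbered "N.item=value" entries of [mediadirs] and [mediagroups] relate to each other, the following Python 3 sketch groups them by number and resolves each group to its list of directories. It uses the standard configparser module and a hypothetical helper, read_records; it is not taken from the ArcvBack source.

```python
import configparser
from collections import defaultdict

SAMPLE = r"""
[mediadirs]
1.dir=\\mars\e$\backup\media
1.size=20
2.dir=\\mars\f$\media
2.size=30
3.dir=\\mars\g$\cache
3.size=40

[mediagroups]
1.dirs=1 2 3
"""

def read_records(parser, section):
    """Group "N.item=value" lines into {N: {item: value}}."""
    records = defaultdict(dict)
    for key, value in parser.items(section):
        number, _, item = key.partition(".")
        records[int(number)][item] = value
    return dict(records)

parser = configparser.ConfigParser(interpolation=None)
parser.read_string(SAMPLE)
mediadirs = read_records(parser, "mediadirs")
groups = read_records(parser, "mediagroups")

# Resolve each group's list of mediadir numbers into directory paths.
group_paths = {
    number: [mediadirs[int(index)]["dir"] for index in group["dirs"].split()]
    for number, group in groups.items()
}
```

The same grouping helper would also fit the [wol] and [schedule] sections, which use the same numbered layout.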

Section: [run]

This section is optional. Entries in this section will be executed as DOS commands after the service has waited for the database.delay time to elapse. This way I can have the service execute the appropriate net share command when it starts. So for my config.ini I have added:

[run]
1=net share m$=m:\

You might also want to use this to run some commands to delete temporary files before running the backup.
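
The start-up sequence described here (wait for the delay, then run each command in order) might look roughly like the Python 3 sketch below. The function name and its run and sleep parameters are hypothetical; they exist so that the example can be tried without actually executing any commands.

```python
import subprocess
import time

def run_startup_commands(commands, delay_seconds=1, run=None, sleep=time.sleep):
    """Wait once, then run each command through the shell, in order."""
    if run is None:
        # Default: hand the command line to the system shell.
        run = lambda command: subprocess.call(command, shell=True)
    sleep(delay_seconds)
    return [run(command) for command in commands]

# Dry run: record the commands instead of executing them.
recorded = []
exit_codes = run_startup_commands(
    ["net share m$=m:\\"],
    delay_seconds=0,
    run=lambda command: recorded.append(command) or 0,
    sleep=lambda seconds: None,
)
```
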

Section: [database]

The database section of the file contains a number of global configuration parameters. These are:
  • dir: the directory where the version database will be stored
  • backupdir: the directory where backups of the version database are stored
  • backupcount: the number of backups of the version database to keep
  • backupdays: the minimum number of days that backup zip files of the version database are kept (defaults to 10). It takes precedence over the backupcount setting. This protects against the case where something goes wrong with the database while arcvback is set to run many times a day, when you might otherwise overwrite all the database backups before you realize there is a problem. Of course, with the switch to ZODB this really should not be necessary, but better safe than sorry.
  • errorlog: the name of the log file to write errors into
  • megabytesperrun: global maximum number of MB to back up at any one time; if this is smaller than the value in a schedule event then this value is used
  • packagesize: the number of MB to write into a single pkg file
  • loglevel: the level of logging you want, set to 0 most of the time, set to 1 or 2 when you are trying to figure out a problem
  • delay: the number of seconds the service should wait upon start-up before executing any commands in the [run] section, which happens before any scheduled backup events are processed (defaults to 1). Note that the delay is applied only when the service starts, not each time it is waiting between scheduled events.
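
The interaction between backupcount and backupdays can be illustrated with a short Python 3 sketch. It reads the description above as: a database backup may be removed only if it is beyond the newest backupcount backups and is also at least backupdays old. That reading, the exact comparisons and the function name are assumptions, not the ArcvBack implementation.

```python
def backups_to_delete(backups, backupcount, backupdays=10):
    """backups is a list of (name, age_in_days); return names safe to delete."""
    newest_first = sorted(backups, key=lambda item: item[1])
    doomed = []
    for position, (name, age_days) in enumerate(newest_first):
        beyond_count = position >= backupcount   # more than backupcount newer ones
        old_enough = age_days >= backupdays      # backupdays takes precedence
        if beyond_count and old_enough:
            doomed.append(name)
    return doomed

backups = [("a.zip", 0.1), ("b.zip", 0.5), ("c.zip", 3), ("d.zip", 12), ("e.zip", 30)]
removed = backups_to_delete(backups, backupcount=2)
```

In this example c.zip survives even though it is beyond the count of two, because it is only three days old.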


