Most of our production servers are at our main co-location facility. Each of these servers has a second Ethernet card, so monitoring and server-to-server file transfers do not interfere with normal web and e-mail traffic. This “shadow network” is also used for our nightly backups.
One server has a lot of disk capacity and serves as the backup device. For years we made monthly, weekly, and daily archives of each server over the shadow network.
However, we expanded to a second co-location facility, and we also run a “grid” machine at a third site. Backups then started eating a lot of bandwidth, and the monthly backups were not completing during the overnight hours.
So what to do? Web sites consist mostly of files that do not often change, so really most of the files do not have to be backed up daily.
Our solution was to have the backup system keep a mirror of each server's files and, each night, copy over only the files that have changed. We use the rsync utility to find and copy the changed files, and the daily archive is then built from the mirror. This solution also means less processing on the production servers.
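As a rough illustration of the nightly mirroring step, here is a minimal Python sketch that builds and runs an rsync command for one server. The host name, paths, and function names are hypothetical, and the flags shown (`-a`, `-z`, `--delete`) are one reasonable choice, not necessarily the ones we run in production.

```python
import subprocess


def build_rsync_command(host, remote_path, mirror_root):
    """Return the rsync argument list that mirrors one server's files.

    host, remote_path and mirror_root are illustrative placeholders.
    """
    source = f"{host}:{remote_path.rstrip('/')}/"
    destination = f"{mirror_root.rstrip('/')}/{host}/"
    return [
        "rsync",
        "-az",       # -a: archive mode (recursive, keeps permissions and times); -z: compress in transit
        "--delete",  # remove files from the mirror that no longer exist on the source
        source,
        destination,
    ]


def mirror_server(host, remote_path, mirror_root):
    """Run rsync for one server; raises CalledProcessError if rsync fails."""
    subprocess.run(build_rsync_command(host, remote_path, mirror_root), check=True)
```

Calling `mirror_server("web1", "/var/www", "/backup/mirror")` from a nightly job would transfer only the files that differ from the copy already held under `/backup/mirror/web1/`.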
Once the files are synchronized on the backup server, a compressed archive is created and stored away.
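The archive step could look something like the following sketch, which uses Python's standard `tarfile` module to compress one mirrored directory into a dated `.tar.gz` file. The function name, directory layout, and file naming scheme are assumptions made for the example.

```python
import tarfile
from datetime import date
from pathlib import Path


def create_daily_archive(mirror_dir, archive_dir, day=None):
    """Compress one mirrored server directory into a dated .tar.gz file.

    Returns the path of the archive that was written.
    """
    mirror_dir = Path(mirror_dir)
    archive_dir = Path(archive_dir)
    day = day or date.today()

    archive_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: <server>-<YYYY-MM-DD>.tar.gz
    archive_path = archive_dir / f"{mirror_dir.name}-{day.isoformat()}.tar.gz"

    # "w:gz" writes a gzip-compressed tar archive
    with tarfile.open(archive_path, "w:gz") as tar:
        # Store paths relative to the server name rather than the full mirror path
        tar.add(mirror_dir, arcname=mirror_dir.name)
    return archive_path
```

Because this runs against the mirror, the compression work is done on the backup server and not on the production machines.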
The files are also filtered so that we do not back up temporary files or non-critical system files.
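One way to do this kind of filtering is with rsync's `--exclude` option, sketched below. This assumes the filtering happens during the rsync transfer, and the patterns listed are purely illustrative, not our actual exclusion list.

```python
# Illustrative patterns only; a real list depends on what each server holds.
EXCLUDE_PATTERNS = [
    "*.tmp",        # temporary files anywhere in the tree
    "*~",           # editor backup files
    "/tmp/",        # a leading slash anchors the pattern at the root of the transfer
    "/proc/",
    "/var/cache/",
]


def exclude_args(patterns):
    """Turn a list of patterns into rsync --exclude arguments."""
    return [f"--exclude={pattern}" for pattern in patterns]


def filtered_rsync_command(source, destination, patterns=EXCLUDE_PATTERNS):
    """Build an rsync command that skips anything matching the patterns."""
    return ["rsync", "-az", "--delete", *exclude_args(patterns), source, destination]
```

Keeping the patterns in one list makes it easy to review what is deliberately left out of the backups.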
The end result is that we use our bandwidth sparingly. We have backup archives without saturating our Internet connections to get the files offsite.