We recently started using duplicity for backups. Duplicity is a really nice backup utility which makes it a piece of cake to take full and incremental backups. It is built on librsync, the library behind the rsync algorithm, and adds a whole lot of clever features on top. One can take a full backup once a month and then use incremental backups every night, so that fewer resources are used. Duplicity also supports many storage backends: it can use ssh, ftp, or even Amazon's S3 service to store the backups in the cloud.
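
Here is a minimal sketch of the monthly full plus nightly incremental schedule, assuming duplicity's --full-if-older-than option behaves as described in its man page. The directory and URL are placeholders, and the run helper is a hypothetical wrapper of my own that only prints the command unless DRY_RUN=0 is set, so you can look at it before letting it loose:

```shell
#!/bin/sh
# Placeholders (assumptions): point these at your own directory and backup host.
SRC="backupdir"
DEST="scp://myuser@myserver.com//opt/backupdir"

# Hypothetical helper: print the command by default, execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "$@"; fi
}

# Run this every night: duplicity takes an incremental backup unless the
# newest full backup is older than one month (1M), in which case it takes
# a fresh full backup instead.
nightly_backup() {
    run duplicity --full-if-older-than 1M "$SRC" "$DEST"
}

nightly_backup
```

Dropping a script like this into a nightly cron job should cover both kinds of backup with a single command.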

Having an easy way to restore is the other side of taking backups. Duplicity handles this easily as well, because you can say "get me this file/directory from 5 days ago". If something goes wrong you can restore the files and go on with your work with minimum downtime. Note that backups are not a replacement for redundancy, and redundancy is not a replacement for backups. With redundancy you have more than one server doing the same job and holding the same files, but all of them suffer equally from a failure in the application. For example, if one of your customers accidentally deletes their account, your redundant servers will not question the action and both will lose all the account info. When that happens you want your backups to be there.
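
To make the "5 days ago" case concrete, here is a rough sketch. The option names (--file-to-restore and -t) are the ones I know from the duplicity man page, so check "man duplicity" for your version. The file path, restore target, and the run helper are made-up placeholders, and nothing is executed unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# Placeholder backup URL (assumption): use the URL you back up to.
DEST="scp://myuser@myserver.com//opt/backupdir"

# Hypothetical helper: print the command by default, execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "$@"; fi
}

# $1 = path relative to the root of the backed-up directory
# $2 = how far back to go (5D means five days ago)
# $3 = local path to restore into
restore_from_backup() {
    run duplicity restore --file-to-restore "$1" -t "$2" "$DEST" "$3"
}

# Example with made-up paths: fetch docs/report.txt as it was five days ago.
restore_from_backup docs/report.txt 5D /tmp/report.txt
```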

One trouble with backups is that they are disk and CPU intensive. This can cause problems because you will generally want to back up important data on live servers. Sometimes there is a workaround. For example, if you want to back up your database and you are using database replication, you can take backups from the slave instead of the live server. But for most other kinds of backup you will not have this flexibility. So what do you do with backup processes on live servers?

Two system utilities that ship with most Linux distributions are "nice" and "ionice". They let you fine-tune the priority of your programs: "nice" adjusts CPU scheduling priority and "ionice" adjusts disk I/O priority. Using these two commands you can lower the priority of a program so that it yields the CPU and the disk to more important processes. On a live server this can be vital.
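
A quick way to see what the two tools do is to wrap each one around itself. This is only a small sketch: it assumes GNU-style "nice" (which prints the current niceness when called without arguments) and the util-linux "ionice", and the ionice part is skipped if the tool is not installed:

```shell
#!/bin/sh
# "nice" with no arguments prints the niceness it is running at, so
# running it under "nice -n19" shows the effect of the outer command.
niceness=$(nice -n19 nice)
echo "niceness under nice -n19: $niceness"

# "ionice" with no arguments prints its own I/O scheduling class;
# under -c3 it should report the idle class.
if command -v ionice >/dev/null 2>&1; then
    ionice -c3 ionice
fi
```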

In my experiments I've found that I don't need the encryption duplicity offers, because the backup server is also our own box and the backups are sent over ssh, so the transfer itself is already protected against eavesdropping. You can disable encryption with the --no-encryption option. The upside is that duplicity will not run gpg and spend the processing power that encryption needs. The duplicity command I ended up using is similar to this:

$ sudo ionice -c3 nice -n19 duplicity -v 4 --no-encryption backupdir scp://myuser@myserver.com//opt/backupdir

"ionice" with -c3 puts the process in the idle I/O class, so it gets disk time only when no other process is asking for I/O (this relies on an I/O scheduler that honors I/O priorities, such as CFQ). For "nice", I've used 19, which is the lowest priority. You can check a running process's "ionice" priority with the -p switch:

$ ionice -pMYPID

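and its nice value with "ps" (for example "ps -o pid,ni,comm -p MYPID"). The "renice" tool does not display the value; it changes the nice value of a process that is already running.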
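
If a backup is already running at normal priority, you don't have to restart it. Here is a small sketch that lowers the priority of a running process with "renice" and reads the value back. A throwaway "sleep" stands in for duplicity, and reading /proc is a Linux-specific assumption (where "ps" is installed, "ps -o pid,ni,comm -p PID" shows the same number):

```shell
#!/bin/sh
# Start a throwaway process to experiment on (stand-in for a running backup).
sleep 30 &
pid=$!

# Lower its CPU priority to the minimum; renice reports the change on
# stdout, which is silenced here.
renice -n 19 -p "$pid" >/dev/null

# On Linux, field 19 of /proc/PID/stat holds the nice value.
ni=$(awk '{print $19}' "/proc/$pid/stat")
echo "nice value of $pid: $ni"

# Clean up the stand-in process.
kill "$pid"
```

The same idea works for disk priority: "ionice -c3 -p PID" moves an already running process into the idle I/O class.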

I love the flexibility Linux offers to developers. Having these tools makes taking backups less of a hassle.