Experience shows that you can never be too paranoid about system backups. When it comes to protecting and preserving precious data, it is best to go the extra mile and make sure you can depend on your backups if the need arises.
Even today, when some cloud and hosting providers offer automated backups for VPSs at a relatively low cost, you will do well to create your own backup strategy using your own tools in order to save some money and then perhaps use it to buy extra storage or get a bigger VPS.
Sounds interesting? In this article, we will show you how to use a tool called Duplicity to back up and encrypt files and directories. In addition, using incremental backups for this task will help us save space.
That said, let’s get started.
Installing Duplicity Backup Tool in Linux
To install duplicity in RHEL-based distros, you will have to enable the EPEL repository first (you can omit this step if you’re using Fedora itself).
On RHEL 9:
subscription-manager repos --enable codeready-builder-for-rhel-9-$(arch)-rpms
dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm
On CentOS 9, AlmaLinux 9, Rocky Linux 9:
dnf config-manager --set-enabled crb
dnf install epel-release
On RHEL 8:
subscription-manager repos --enable codeready-builder-for-rhel-8-$(arch)-rpms
dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
On CentOS 8, AlmaLinux 8, Rocky Linux 8:
dnf config-manager --set-enabled powertools
dnf install epel-release
Then run:
dnf install duplicity
For Debian-based distributions such as Ubuntu and Linux Mint:
sudo apt update
sudo apt install duplicity
In theory, many methods for connecting to a file server are supported although only ssh/scp/sftp, local file access, rsync, ftp, HSI, WebDAV, and Amazon S3 have been tested in practice so far.
Once the installation is completed, we will exclusively use sftp in various scenarios, both to back up and restore the data.
Our test environment consists of an RHEL 8 box (to be backed up) and a Debian 11 machine (backup server).
Creating SSH Keys for Passwordless Login to Remote Server
Let’s begin by creating the SSH keys in our RHEL box and transferring them to the Debian backup server.
Replace AAA.BBB.CCC.DDD with the actual IP of the remote server. If the sshd daemon on the Debian server listens on a non-default port, use the second ssh-copy-id command instead and replace XXXXX with that port.
ssh-keygen -t rsa
ssh-copy-id root@AAA.BBB.CCC.DDD
ssh-copy-id -p XXXXX root@AAA.BBB.CCC.DDD
Then you should make sure that you can connect to the backup server without using a password:
ssh root@AAA.BBB.CCC.DDD
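Since duplicity will later connect unattended, it may be worth confirming that key-based login works without any prompt. The following is a minimal sketch and not part of duplicity or OpenSSH: the function name check_key_login and the SSH override variable are illustrative assumptions. It relies on the ssh option BatchMode=yes, which makes ssh fail rather than ask for a password.

```shell
# Hypothetical helper: succeed only if key-based login works without a prompt.
#   $1 = user@host  (e.g. root@AAA.BBB.CCC.DDD)
#   $2 = sshd port  (optional, defaults to 22)
# SSH can be overridden (e.g. SSH=echo) to preview the command that would run.
check_key_login() {
    ${SSH:-ssh} -o BatchMode=yes -o ConnectTimeout=5 -p "${2:-22}" "$1" true
}
```

For example, running check_key_login root@AAA.BBB.CCC.DDD XXXXX should return a zero exit status only when no password is needed.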
Now we need to create the GPG keys that will be used for the encryption and decryption of our data:
gpg2 --full-gen-key
You will be prompted to enter:
- Kind of key
- Key size
- How long the key should be valid
- A passphrase
To create the entropy needed for the creation of the keys, you can log on to the server via another terminal window and perform a few tasks or run some commands to generate entropy (otherwise you will have to wait for a long time for this part of the process to finish).
Once the keys have been generated, you can list them as follows:
gpg --list-keys
The long hexadecimal string shown in the output for your key is known as the public key ID, and it is a required argument for encrypting your files.
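If you would rather not copy the key ID by hand, it can be pulled out of gpg's machine-readable listing. The sketch below assumes the colon-delimited format produced by the --with-colons option, where records of type fpr carry the fingerprint in the tenth field. The function name first_fingerprint is hypothetical.

```shell
# Hypothetical helper: print the first key fingerprint found in
# `gpg --list-keys --with-colons` output read from standard input.
# In that format, "fpr" records carry the fingerprint in field 10.
first_fingerprint() {
    awk -F: '$1 == "fpr" { print $10; exit }'
}

# Usage (assumes at least one key exists in the keyring):
#   gpg --list-keys --with-colons | first_fingerprint
```

If the keyring holds several keys, only the first one is printed, so check that it is the key you intend to use for the backup.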
Creating a Linux Backup with Duplicity
To start simply, let’s back up only the /var/log directory, with the exception of /var/log/anaconda and /var/log/sa.
Since this is our first backup, it will be a full one. Subsequent runs will create incremental backups (unless we add the full action, with no dashes, right after duplicity in the command below):
PASSPHRASE="tecmint" duplicity --encrypt-key 115B4BB13BC768B8B2704E5663C429C3DB8BAD3B --exclude /var/log/anaconda --exclude /var/log/sa /var/log scp://root@AAA.BBB.CCC.DDD//backups/rhel8
OR
PASSPHRASE="YourPassphraseHere" duplicity --encrypt-key YourPublicKeyIdHere --exclude /var/log/anaconda --exclude /var/log/sa /var/log scp://root@RemoteServer:XXXXX//backups/rhel8
Make sure you don’t miss the double slash in the above command! It indicates an absolute path to a directory named /backups/rhel8 on the backup box, which is where the backup files will be stored.
Replace YourPassphraseHere, YourPublicKeyIdHere, and RemoteServer with the passphrase you entered earlier, the GPG public key ID, and with the IP or hostname of the backup server, respectively.
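Typing the passphrase inline can leave it in your shell history, so you may prefer to keep it in a file that only its owner can read and wrap the backup in a small function. This is only a sketch under that assumption: backup_var_log, the passphrase file argument, and the DUPLICITY override variable are hypothetical names. The duplicity options are the same ones used in the command above.

```shell
# Hypothetical wrapper around the backup command shown above.
#   $1 = GPG public key ID
#   $2 = target URL, e.g. scp://root@RemoteServer:XXXXX//backups/rhel8
#   $3 = file holding the passphrase (assumed readable by its owner only)
# DUPLICITY can be overridden (e.g. DUPLICITY=echo) to preview the command.
backup_var_log() {
    [ -r "$3" ] || { echo "cannot read passphrase file: $3" >&2; return 1; }
    # The subshell keeps PASSPHRASE out of the caller's environment.
    (
        PASSPHRASE=$(cat "$3")
        export PASSPHRASE
        ${DUPLICITY:-duplicity} --encrypt-key "$1" \
            --exclude /var/log/anaconda --exclude /var/log/sa \
            /var/log "$2"
    )
}
```

A call such as backup_var_log YourPublicKeyIdHere scp://root@RemoteServer:XXXXX//backups/rhel8 /root/.duplicity-pass (the file path is an example) could then be placed in a script or a cron job.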
When the command finishes, duplicity prints a summary of backup statistics.
In our test run, a total of 86.3 MB was backed up into 3.22 MB at the destination. Let’s switch to the backup server to check on our newly created backup files in /backups/rhel8.
A second run of the same command yields a much smaller backup size and time.
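To see which full and incremental sets a target currently holds, duplicity offers the collection-status action. The wrapper below is a thin, hypothetical convenience (the name backup_status and the DUPLICITY override variable are assumptions made for illustration).

```shell
# Hypothetical helper: list the full and incremental sets stored at a target.
#   $1 = target URL (the same one used for the backup)
# DUPLICITY can be overridden (e.g. DUPLICITY=echo) to preview the command.
backup_status() {
    ${DUPLICITY:-duplicity} collection-status "$1"
}
```

After the two runs described above, you would expect the report to show one full backup followed by one incremental.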
Restoring Linux Backups using Duplicity
To successfully restore a file, a directory with its contents, or the whole backup, the destination must not exist (duplicity will not overwrite an existing file or directory). To clarify, let’s delete the cron log in the RHEL box:
rm -f /var/log/cron
The syntax to restore a single file from the remote server is:
PASSPHRASE="YourPassphraseHere" duplicity --file-to-restore filename sftp://root@RemoteHost//backups/rhel8 /where/to/restore/filename
where:
- filename is the file to be extracted, with a relative path to the directory that was backed up
- /where/to/restore is the directory in the local system where we want to restore the file to.
In our case, to restore the cron main log from the remote backup we need to run:
PASSPHRASE="YourPassphraseHere" duplicity --file-to-restore cron sftp://root@AAA.BBB.CCC.DDD:XXXXX//backups/rhel8 /var/log/cron
The cron log should be restored to the desired destination.
Likewise, feel free to delete a directory from /var/log and restore it using the backup:
rm -rf /var/log/mail
PASSPHRASE="YourPassphraseHere" duplicity --file-to-restore mail sftp://root@AAA.BBB.CCC.DDD:XXXXX//backups/rhel8 /var/log/mail
In this example, the mail directory should be restored to its original location with all its contents.
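Because a restore fails when the destination already exists, a small guard can report that before any connection to the backup server is made. The following is a sketch only: restore_path and the DUPLICITY override variable are hypothetical names, and the function expects PASSPHRASE to be set in the environment by the caller.

```shell
# Hypothetical helper: restore one path from the backup, refusing early if the
# destination already exists (duplicity would not overwrite it anyway).
#   $1 = path relative to the backed-up directory (e.g. cron)
#   $2 = target URL (e.g. sftp://root@RemoteHost//backups/rhel8)
#   $3 = local destination (e.g. /var/log/cron)
# PASSPHRASE is expected to be set in the environment by the caller.
restore_path() {
    if [ -e "$3" ]; then
        echo "refusing to restore: $3 already exists" >&2
        return 1
    fi
    ${DUPLICITY:-duplicity} --file-to-restore "$1" "$2" "$3"
}
```

With this in place, restore_path mail sftp://root@RemoteHost//backups/rhel8 /var/log/mail would stop with a clear message if /var/log/mail had not been removed first.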
Duplicity Command Usage
At any time you can display the list of archived files with the following command:
duplicity list-current-files sftp://root@AAA.BBB.CCC.DDD:XXXXX//backups/rhel8
Delete backups older than 6 months:
duplicity remove-older-than 6M sftp://root@AAA.BBB.CCC.DDD:XXXXX//backups/rhel8
Restore myfile inside directory gacanepa as it was 2 days and 12 hours ago:
duplicity -t 2D12h --file-to-restore gacanepa/myfile sftp://root@AAA.BBB.CCC.DDD:XXXXX//remotedir/backups /home/gacanepa/myfile
In the last command, we can see an example of the usage of the time interval (as specified by -t): a series of pairs where each one consists of a number followed by one of the characters s, m, h, D, W, M, or Y (indicating seconds, minutes, hours, days, weeks, months, or years respectively).
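To make the interval arithmetic concrete, the sketch below converts such a string into seconds. It is an illustration and not duplicity's own parser. It assumes, as the duplicity manual describes, that a month always counts as 30 days and a year as 365 days. The function name interval_to_seconds is hypothetical, and its working variables are global because POSIX sh has no local variables.

```shell
# Hypothetical helper: convert a duplicity-style interval (e.g. 2D12h) to seconds.
# Returns status 1 for empty or malformed input.
interval_to_seconds() {
    rest=$1
    total=0
    [ -n "$rest" ] || return 1
    while [ -n "$rest" ]; do
        # Leading digits of the next number/unit pair.
        num=${rest%%[!0-9]*}
        [ -n "$num" ] || return 1
        rest=${rest#"$num"}
        # Strip leading zeros so the shell does not read the number as octal.
        while [ "${#num}" -gt 1 ] && [ "${num#0}" != "$num" ]; do num=${num#0}; done
        # The unit is the single character that follows the number.
        unit=${rest%"${rest#?}"}
        rest=${rest#?}
        case "$unit" in
            s) mult=1 ;;
            m) mult=60 ;;
            h) mult=3600 ;;
            D) mult=86400 ;;
            W) mult=604800 ;;
            M) mult=2592000 ;;   # 30 days
            Y) mult=31536000 ;;  # 365 days
            *) return 1 ;;
        esac
        total=$((total + num * mult))
    done
    echo "$total"
}
```

For the interval used above, interval_to_seconds 2D12h prints 216000, that is 2 × 86400 + 12 × 3600 seconds.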
Summary
In this article, we have explained how to use Duplicity, a backup utility that provides encryption for files and directories out of the box. I highly recommend you take a look at the duplicity project’s website for further documentation and examples.
We’ve provided a man page of duplicity in PDF format for your reading convenience, which is also a complete reference guide.
Feel free to let us know if you have any questions or comments.
Hi,
Thanks a lot
Very useful article…
Putting the passphrase on the command line is a bad idea for security, as it’s visible to all users in the same host, via the ps command. Better to set the environment variable first, then run the command. Better yet, put both in a shell script with permissions of 700.
Hi,
Very useful tool
Thanks a lot…
How would you compare duplicity to bacula? I have been struggling to get bacula running on my Ubuntu 14.04 server. After reading your post, I am willing to try to use duplicity to back up my CentOS 7 laptop to the server.
Would it be better to use dedicated backup user instead of root for SSH connection? Or is root needed for some reason?