How To Use Duplicity with GPG to Securely Automate Backups on Ubuntu

 

Posted September 19, 2013. Tags: Security, Backups, Ubuntu.

Introduction

Duplicity is a versatile local and remote backup program that supports a number of transfer protocols and third-party storage services.

In this guide, we will discuss how to install duplicity on an Ubuntu 12.04 VPS. We will be installing from source and then configuring it to take advantage of GPG encryption.

To follow along, you will need access to two machines: an Ubuntu 12.04 VPS, which will be backed up, and a second Linux machine or VPS of any variety that can be accessed by SSH.

How To Install Duplicity from Source on Ubuntu

We are using an Ubuntu 12.04 VPS for this guide. The duplicity package in the default repositories is outdated, and actually suffers from some problems with connecting to remote hosts due to a change in the backend.

We will avoid these problems by getting the source files and installing manually.

Log into the Ubuntu 12.04 VPS that you will be backing up, as root.

Install the Prerequisite Packages

Although we are installing duplicity from source, we will get the prerequisites from the default Ubuntu repositories.

Update the package index and then install the needed packages with these two commands:

apt-get update
apt-get install ncftp python-paramiko python-pycryptopp lftp python-boto python-dev librsync-dev

This installs a number of different handlers for transferring the data to the remote computer. We won't be using most of these within this guide, but they are good options to have.

Download and Install Duplicity from Source

The duplicity source files are housed at launchpad.net. We will download them to the root user's home directory.

cd /root
wget http://code.launchpad.net/duplicity/0.6-series/0.6.22/+download/duplicity-0.6.22.tar.gz

Unpack the source and move into the package directory that is created:

tar xzvf duplicity*
cd duplicity*

Next, we will complete the actual installation with the following command:

python setup.py install

Because this is a package installed from source, the duplicity executable will be placed in the /usr/local/bin/ directory.
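
To confirm that the install worked, a quick check along the following lines can help. This is only a minimal sketch: the helper name require_command is hypothetical and is not part of duplicity.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the named command is on the PATH.
require_command() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at $(command -v "$1")"
    else
        echo "$1 not found on PATH" >&2
        return 1
    fi
}

# Example (run manually after installing):
#   require_command duplicity && duplicity --version
```

If the command is reported as missing, check that /usr/local/bin is part of the root user's PATH.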

Create SSH and GPG Keys

Our configuration of duplicity will use two different kinds of keys to achieve a nice intersection between convenience and security.

We will use SSH keys to securely authenticate with the remote system without having to provide a password. We will also use GPG to encrypt the data before we transfer it to the backup location.

Create SSH Keys

We will generate an RSA SSH key pair for our root user to allow password-less logins to the machine that will host the backups.

If you have not done so already, make sure you have a root password configured on the machine you will be transferring the data to. You can do this by logging into the machine as root (through SSH or the Console Access button on the droplets page if this is a VPS) and issuing this command:

passwd

Back in the droplet with duplicity, we will generate a key pair with the following command:

ssh-keygen -t rsa

Press Enter at the prompts to create a password-less SSH key with the default settings.

Transfer it to the system that will host your backups with this command:

ssh-copy-id root@backupHost

Answer yes to accept the unverified host, and then enter the root password of the remote system to transfer your public key.

Test that you can now log in without a password from your duplicity droplet by issuing:

ssh root@backupHost

You should be logged in without having to provide any further credentials.

While you are logged in through SSH, create the directory structure that will house our backup files:

mkdir -p /remotebackup/duplicityDroplet

You can name the directory anything you'd like, but remember the value so that you can specify it later.

When you are finished, exit back out into your duplicity droplet:

exit

Create GPG Keys

We will be using GPG for extra security and encryption. The commands will store our keys in a hidden directory at /root/.gnupg/:

gpg --gen-key

You will be asked a series of questions that will configure the parameters of the key pair.

Please select what kind of key you want:
   (1) RSA and RSA (default)
   (2) DSA and Elgamal
   (3) DSA (sign only)
   (4) RSA (sign only)
Your selection? 
RSA keys may be between 1024 and 4096 bits long.
What keysize do you want? (2048) 
Requested keysize is 2048 bits
Please specify how long the key should be valid.
         0 = key does not expire
      <n>  = key expires in n days
      <n>w = key expires in n weeks
      <n>m = key expires in n months
      <n>y = key expires in n years
Key is valid for? (0) 
Key does not expire at all
Is this correct? (y/N) y

Press enter to accept the default "RSA and RSA" keys. Press enter twice again to accept the default keysize and no expiration date.

Type y to confirm your parameters.

You need a user ID to identify your key; the software constructs the user ID
from the Real Name, Comment and Email Address in this form:
    "Heinrich Heine (Der Dichter) <heinrichh@duesseldorf.de>"

Real name: Your Name
Email address: your_email@example.com
Comment: 
You selected this USER-ID:
    "Your Name <your_email@example.com>"

Change (N)ame, (C)omment, (E)mail or (O)kay/(Q)uit? o

Enter the name, email address, and optionally, a comment that will be associated with this key. Type O to confirm your information.

Next, you will set up a passphrase to use with GPG. Unlike with the SSH keys, where we defaulted to no passphrase to allow duplicity to operate in the background, you should supply a passphrase for this step to allow secure encryption and decryption of your data.

Enter passphrase:
Repeat passphrase:

At this point, you will be asked to generate entropy. Entropy is a measure of how much unpredictability is available in a system. Your VPS needs entropy to create a key that is actually random.

We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.

Not enough random bytes available.  Please do some other work to give
the OS a chance to collect more entropy! (Need 280 more bytes)

If you need some help creating entropy, installing Haveged is one option. In my experience, just installing some packages with apt is enough to generate the entropy needed. SSH in with a new terminal to do this.
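
While you wait, you can watch the kernel's entropy estimate from that second terminal. The sketch below assumes a Linux system that exposes the estimate at /proc/sys/kernel/random/entropy_avail; the function name entropy_available is hypothetical.

```shell
#!/bin/sh
# Print the kernel's current entropy estimate. The default path is the
# standard Linux location; the optional argument exists only so the
# function can be pointed at another file when experimenting.
entropy_available() {
    entropy_file=${1:-/proc/sys/kernel/random/entropy_avail}
    cat "$entropy_file"
}

# Example:
#   entropy_available
```

A value that keeps dropping back toward zero while gpg is running suggests the key generation is still waiting for more random data.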

When you've generated enough random pieces of information, your key will be created:

gpg: /root/.gnupg/trustdb.gpg: trustdb created
gpg: key 05AB3DF5 marked as ultimately trusted
public and secret key created and signed.

gpg: checking the trustdb
gpg: 3 marginal(s) needed, 1 complete(s) needed, PGP trust model
gpg: depth: 0  valid:   1  signed:   0  trust: 0-, 0q, 0n, 0m, 0f, 1u
pub   2048R/05AB3DF5 2013-09-19
      Key fingerprint = AF21 2669 07F7 ADDE 4ECF  2A33 A57F 6998 05AB 3DF5
uid                  Your Name <your_email@example.com>
sub   2048R/32866E3B 2013-09-19

Your public key ID is the eight-character value after the slash on the "pub" line (05AB3DF5 in this example). You will need this later to encrypt the data you will be transferring.

If you forget to write down your public key ID, you can get it again by querying the gpg keyring:

gpg --list-keys
/root/.gnupg/pubring.gpg
------------------------
pub   2048R/05AB3DF5 2013-09-19
uid                  Your Name <your_email@example.com>
sub   2048R/32866E3B 2013-09-19
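
If you want to pick the key ID out of that listing in a script, something like the following could work. It is a sketch that assumes the legacy output layout shown above; newer GnuPG releases print a different layout, and the function name extract_key_ids is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `gpg --list-keys` output on stdin and print
# the short key ID from each "pub" line. Assumes the legacy layout
# "pub   2048R/05AB3DF5 2013-09-19".
extract_key_ids() {
    sed -n 's|^pub  *[0-9][0-9]*[A-Za-z]/\([0-9A-Fa-f][0-9A-Fa-f]*\) .*$|\1|p'
}

# Example:
#   gpg --list-keys | extract_key_ids
```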

We now have all of the necessary components in place to securely back up using duplicity.

How To Use Duplicity

Run an Initial Test

We will run an initial test of our duplicity system by creating a folder of dummy files to back up. Run the following commands:

cd ~
mkdir test
touch test/file{1..100}

This creates a directory called test in the root home directory. It then fills the directory with 100 empty files named file1 through file100.

We will move the files to the remote server, first without the GPG key we generated. We will use "sftp", which is a secure protocol included with SSH that replicates the functionality of ftp.

duplicity /root/test sftp://root@backupHost//remotebackup/duplicityDroplet

Notice the double slashes between the remote host and the file path. This is because we are specifying an absolute path. If it were a relative path from the default directory that sftp puts us in, we would use only one slash.
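
The slash rule can be illustrated with a small helper that assembles the target URL. This is only a sketch; sftp_url is a hypothetical name, not something duplicity provides.

```shell
#!/bin/sh
# Hypothetical helper: build an sftp:// target for duplicity.
# The URL always has one slash after the host. An absolute remote path
# brings its own leading slash, which produces the double slash.
sftp_url() {
    remote_user=$1
    remote_host=$2
    remote_path=$3
    printf 'sftp://%s@%s/%s\n' "$remote_user" "$remote_host" "$remote_path"
}

# Absolute path:
#   sftp_url root backupHost /remotebackup/duplicityDroplet
# Path relative to the login directory:
#   sftp_url root backupHost backups/duplicityDroplet
```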

You will be asked to accept the remote host and then to create and confirm a passphrase that will be used to encrypt the data. As you can see, duplicity still uses GPG unless we specifically tell it not to. The only difference is that we are not using the keys we created, so we could type in any passphrase here.

Import of duplicity.backends.dpbxbackend Failed: No module named dropbox
The authenticity of host '162.243.2.14' can't be established.
SSH-RSA key fingerprint is 1f:4b:ae:1c:43:91:aa:2b:04:5b:a4:8e:cd:ea:e6:60.
Are you sure you want to continue connecting (yes/no)? yes
Local and Remote metadata are synchronized, no sync needed.
Last full backup date: none
GnuPG passphrase: 
Retype passphrase to confirm:

The backup then runs and you will be presented with statistics when the process completes:

No signatures found, switching to full backup.
--------------[ Backup Statistics ]--------------
StartTime 1379614581.49 (Thu Sep 19 18:16:21 2013)
EndTime 1379614581.60 (Thu Sep 19 18:16:21 2013)
ElapsedTime 0.11 (0.11 seconds)
SourceFiles 101
SourceFileSize 4096 (4.00 KB)
NewFiles 101
NewFileSize 4096 (4.00 KB)
DeletedFiles 0
ChangedFiles 0
ChangedFileSize 0 (0 bytes)
ChangedDeltaSize 0 (0 bytes)
DeltaEntries 101
RawDeltaSize 0 (0 bytes)
TotalDestinationSizeChange 1022 (1022 bytes)
Errors 0
-------------------------------------------------

If we SSH into our remote system, we can see that the backups completed successfully:

ssh root@backupHost
cd /remotebackup/duplicityDroplet
ls
duplicity-full.20130919T181705Z.manifest.gpg
duplicity-full.20130919T181705Z.vol1.difftar.gpg
duplicity-full-signatures.20130919T181705Z.sigtar.gpg

These files contain the backup information. Since this was just a test, we can delete them by running:

rm duplicity*

Exit back into the duplicity droplet:

exit

We can now remove the test directory and all of its contents:

rm -r /root/test

Create Your First Backup

We will create our first real backup by using the following general syntax:

duplicity --encrypt-key key_from_GPG --exclude files_to_exclude --include files_to_include path_to_back_up sftp://root@backupHost//remotebackup/duplicityDroplet

We will back up our entire root directory, with the exception of /proc, /sys, and /tmp. We will use the GPG key we created. We do this by specifying the key ID within the command, and preceding the command with the passphrase:

PASSPHRASE="passphrase_for_GPG" duplicity --encrypt-key 05AB3DF5 --exclude /proc --exclude /sys --exclude /tmp / sftp://root@backupHost//remotebackup/duplicityDroplet

The command above will take some time. Because this is the first time we've run the backup, duplicity will create a full backup. Duplicity divides the data into volumes to simplify the file transfers.

--------------[ Backup Statistics ]--------------
StartTime 1379621305.09 (Thu Sep 19 20:08:25 2013)
EndTime 1379621490.47 (Thu Sep 19 20:11:30 2013)
ElapsedTime 185.38 (3 minutes 5.38 seconds)
SourceFiles 33123
SourceFileSize 813465245 (776 MB)
NewFiles 33123
NewFileSize 813464221 (776 MB)
DeletedFiles 0
ChangedFiles 0
ChangedFileSize 0 (0 bytes)
ChangedDeltaSize 0 (0 bytes)
DeltaEntries 33123
RawDeltaSize 802133584 (765 MB)
TotalDestinationSizeChange 369163424 (352 MB)
Errors 0
-------------------------------------------------

On a fresh droplet, my configuration created 15 volumes, which were transferred to the remote system.

Because we now have a full backup on the remote system, our next backup will automatically be an incremental backup. These are faster and require less data transfer. My first run took over three minutes, while my incremental backup took less than eight seconds.

--------------[ Backup Statistics ]--------------
StartTime 1379621776.23 (Thu Sep 19 20:16:16 2013)
EndTime 1379621783.80 (Thu Sep 19 20:16:23 2013)
ElapsedTime 7.57 (7.57 seconds)
SourceFiles 33128
SourceFileSize 820560987 (783 MB)
NewFiles 11
NewFileSize 12217723 (11.7 MB)
DeletedFiles 3
ChangedFiles 1
ChangedFileSize 600 (600 bytes)
ChangedDeltaSize 0 (0 bytes)
DeltaEntries 15
RawDeltaSize 12197851 (11.6 MB)
TotalDestinationSizeChange 12201207 (11.6 MB)
Errors 0
-------------------------------------------------

To force another full backup, you can add the "full" command to the duplicity call prior to any options:

PASSPHRASE="passphrase_for_GPG" duplicity full --encrypt-key 05AB3DF5 --exclude /proc --exclude /sys --exclude /tmp / sftp://root@backupHost//remotebackup/duplicityDroplet

Restore a Backup

Duplicity makes restoring easy. You can restore by simply reversing the remote and local parameters.

We don't need the --encrypt-key option since we are only decrypting data. We also don't need the --exclude parameters because the excluded paths were never part of the backup in the first place.

For instance, if we wanted to restore the data we just backed up, in its entirety, we could use this command:

PASSPHRASE="passphrase_for_GPG" duplicity sftp://root@backupHost//remotebackup/duplicityDroplet /

Perhaps a safer option is restoring only the files or directories that you need. You can do this by adding the --file-to-restore option to the above command. The path is given relative to the root of the backup, so it has no leading slash:

PASSPHRASE="passphrase_for_GPG" duplicity --file-to-restore path/to/file sftp://root@backupHost//remotebackup/duplicityDroplet /path/to/restore/file

Make sure you test your ability to restore correctly, so that you do not run into problems when you are in a dire situation.
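
One way to test a restore is to restore a directory into a scratch location and compare it with the original. The sketch below uses diff -r for the comparison. The helper name trees_match and the scratch path /root/restore-test are hypothetical, and files that changed since the backup ran will show up as differences.

```shell
#!/bin/sh
# Hypothetical helper: compare an original directory with a restored copy.
# Returns 0 when the two trees have identical contents, non-zero otherwise.
trees_match() {
    diff -r "$1" "$2" >/dev/null
}

# Example (paths are placeholders):
#   PASSPHRASE="passphrase_for_GPG" duplicity --file-to-restore etc \
#       sftp://root@backupHost//remotebackup/duplicityDroplet /root/restore-test
#   trees_match /etc /root/restore-test && echo "restore matches"
```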

Automate Backups

We can automate duplicity by creating a few cron jobs.

Create a Passphrase File

We will create a protected file to store our GPG passphrase so that we do not have to put it directly in our automation script.

Go to the root user's home directory and create a new hidden file with your text editor:

cd /root
nano .passphrase

The only thing we will need to put in this file is the passphrase specification that you have been preceding your duplicity commands with:

PASSPHRASE="passphrase_for_GPG"

Save and close the file.

Make it only readable by root by executing:

chmod 700 /root/.passphrase
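
As an alternative to creating the file in an editor, the same result could be scripted. The sketch below is not part of the original steps: write_passphrase_file is a hypothetical helper, it assumes the passphrase contains no single quotes, and it sets mode 600 rather than 700 because the file is only read by the backup scripts and never executed.

```shell
#!/bin/sh
# Hypothetical helper: write a PASSPHRASE assignment to a file that only
# its owner can read or write. The umask is changed inside a subshell so
# the caller's umask is left alone. Assumes no single quotes in the
# passphrase.
write_passphrase_file() {
    (
        umask 077
        printf "PASSPHRASE='%s'\n" "$2" > "$1" &&
        chmod 600 "$1"
    )
}

# Example:
#   write_passphrase_file /root/.passphrase 'passphrase_for_GPG'
```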

Set Up Daily Incremental Backups

We will set up duplicity to create daily incremental backups. This will keep our backups up-to-date.

Scripts listed in /etc/cron.daily are run once a day, so this is the perfect place to create our backup script.

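Navigate to that folder and create a file called duplicity-inc. The name must not contain a period, because run-parts, which cron uses to run the scripts in this directory, skips file names that contain anything other than letters, digits, underscores, and hyphens:

cd /etc/cron.daily
nano duplicity-inc

Copy the following shell script into the file. Replace the duplicity command with the command you would like to use to back up your system.

#!/bin/sh

# Exit quietly if duplicity is not installed
test -x "$(which duplicity)" || exit 0

# Load the GPG passphrase and make it available to duplicity
. /root/.passphrase
export PASSPHRASE

"$(which duplicity)" --encrypt-key 05AB3DF5 --exclude /proc --exclude /sys --exclude /tmp / sftp://root@backupHost//remotebackup/duplicityDroplet

Save and close the file.

Make it executable by typing the following command:

chmod 755 duplicity-inc

Test it by calling it:

./duplicity-inc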

It should complete without any errors.
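
Running the script by hand does not prove that cron will pick it up. Cron runs the scripts in /etc/cron.daily through run-parts, which by default only accepts file names made of ASCII letters, digits, underscores, and hyphens, so a name containing a period is silently skipped. The following sketch checks a name against that rule; the function name is_run_parts_name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a file name uses only the characters
# that run-parts accepts by default (ASCII letters, digits, underscores,
# and hyphens).
is_run_parts_name() {
    case $1 in
        ''|*[!A-Za-z0-9_-]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Example:
#   is_run_parts_name duplicity-inc && echo "cron will run this name"
#   run-parts --test /etc/cron.daily   # lists the scripts that would run
```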

Set Up Weekly Full Backups

Incremental backups build on full backups, and each one depends on the chain of backups before it. This means that they will get increasingly unwieldy as changes stack up. We will configure weekly full backups to refresh the base.

We will do this by creating a similar script within the /etc/cron.weekly directory.

Navigate to the directory and create a new file. The file name must not contain a period, because run-parts, which cron uses to run the scripts in this directory, skips such names:

cd /etc/cron.weekly
nano duplicity-full

Copy the following shell script into the file. Notice that we included the "full" command to force duplicity to run a full backup.

#!/bin/sh

# Exit quietly if duplicity is not installed
test -x "$(which duplicity)" || exit 0

# Load the GPG passphrase and make it available to duplicity
. /root/.passphrase
export PASSPHRASE

"$(which duplicity)" full --encrypt-key 05AB3DF5 --exclude /proc --exclude /sys --exclude /tmp / sftp://root@backupHost//remotebackup/duplicityDroplet

We are also going to add an additional duplicity command on the end to clean out old backup files. We will keep a total of three full backups and their associated incremental backups.

Add this to the end of the file:

"$(which duplicity)" remove-all-but-n-full 3 --force sftp://root@backupHost//remotebackup/duplicityDroplet

Save and close the file.

Make it executable with the following command:

chmod 755 duplicity-full

Test it by calling:

./duplicity-full

It should do a full backup and then remove any backup sets older than the three most recent full backups.
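
As a reader points out in the comments, duplicity's --full-if-older-than option can replace the separate weekly job: a single daily run then makes an incremental backup unless the last full backup is older than the given age. The sketch below shows what such a combined script might look like. DUPLICITY_BIN is a hypothetical override, defaulting to duplicity, so the command lines can be previewed with DUPLICITY_BIN=echo. The key ID and target are the example values used in this guide.

```shell
#!/bin/sh
# Sketch of a single daily job that takes a full backup once the last
# one is more than a week old, and an incremental backup otherwise.
# DUPLICITY_BIN is a hypothetical override used only for previewing.
TARGET="sftp://root@backupHost//remotebackup/duplicityDroplet"

run_backup() {
    "${DUPLICITY_BIN:-duplicity}" --full-if-older-than 1W \
        --encrypt-key 05AB3DF5 \
        --exclude /proc --exclude /sys --exclude /tmp \
        / "$TARGET"
}

prune_backups() {
    # Keep the three most recent full backups and their incrementals.
    "${DUPLICITY_BIN:-duplicity}" remove-all-but-n-full 3 --force "$TARGET"
}

# In the cron script, after loading and exporting PASSPHRASE:
#   run_backup && prune_backups
```

Pruning only after a successful backup is a deliberate choice in this sketch, so that a failed run never reduces the number of usable backups.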

Conclusion

You should now have a fully operational, automated backup solution in place. Be sure to regularly validate your backups in order to not fall victim to a false sense of security.

There are many other backup tools available, but duplicity is a flexible, simple solution that will fulfill many users' needs.

By Justin Ellingwood

34 Comments

  • dansku September 26, 2013

    Great article! Thanks for sharing!

     

      • phildobbin September 30, 2013

        running 'python setup.py install' resulted in 'duplicity/_librsyncmodule.c:26:22: fatal error: librsync.h: No such file or directory compilation terminated. error: command 'x86_64-linux-gnu-gcc' failed with exit status 1' Even after compiling librsync I got: '/usr/local/lib/librsync.a: could not read symbols: Bad value collect2: error: ld returned 1 exit status error: command 'x86_64-linux-gnu-gcc' failed with exit status 1'

         

          • jellingwood MOD September 30, 2013

            Hi Phil. Are you running this on a fresh Ubuntu 12.04 droplet? I'm not having much luck replicating the errors you're getting. Can you try looking in the README file from the duplicity folder and checking that your system has all of the requirements listed there? Post back here with any further information and we can try to sort this out.

             

              • phildobbin September 30, 2013

                I'm feeling pretty foolish at the moment: I left out: 'apt-get install ncftp python-paramiko python-pycryptopp lftp python-boto python-dev librsync-dev' All works fine now. Sorry for the noise…

                 

                  • jellingwood MOD October 2, 2013

                    Haha, no worries. I'm glad you got it sorted out!

                     

                      • zoot October 10, 2013

                        Best article ever about a secure duplicity automation !

                         

                          • mikaldalsbo February 27, 2014

                            I had some trouble with duplicity requesting a lockfile. I fixed it by "apt-get install python-lockfile"

                             

                              • bla March 1, 2014

                                I've applied this guide on Debian, and had trouble getting the cronjobs to execute. The reason was the filenames of the scripts: they should not include periods (.). From man run-parts: "If neither the --lsbsysinit option nor the --regex option is given then the names must consist entirely of ASCII upper- and lower-case letters, ASCII digits, ASCII underscores, and ASCII minus-hyphens." Hence renaming duplicity.inc and duplicity.full to duplicity-inc and duplicity-full, respectively, made it work. It seems to me that this should be the same way in Ubuntu.

                                 

                                  • isaak59 October 11, 2014

                                    I had the same problem ! I came up with the same solution, indeed.

                                    Have a nice day, bla !

                                     

                                    • digitalocean531567 March 7, 2014

                                      Nice article. But rather than compiling the new version from source, I'd suggest you just add the PPA:

                                      sudo apt-get install python-software-properties
                                      sudo apt-add-repository ppa:duplicity-team/ppa
                                      sudo apt-get update
                                      sudo apt-get install duplicity

                                       

                                        • isaak59 April 1, 2014

                                          Great tutorial! It works like a charm on Debian 7 (Wheezy). You could add that to your title to get more visitors. By the way, for anyone who would like to use the latest duplicity source code (duplicity-0.6.23), you can find it here: https://launchpad.net/duplicity/0.6-series/0.6.23/+download/duplicity-0.6.23.tar.gz You'll have to install python-lockfile from the Debian repository (must be the same for Ubuntu, I guess): apt-get install python-lockfile Have a nice day! memento.

                                           

                                            • gmrafal April 28, 2014

                                              The second (weekly) cron job isn't really needed, you can add the "--full-if-older-than 1W" option to your daily job instead.

                                               

                                                • diti May 9, 2014

                                                  Am I the only one here who considers that making backup with symmetric encryption (i.e. password-based, with the cleartext password stored in a script) insecure? OpenPGP was made for asymmetric encryption (i.e. key-based). Which would be perfect for secure backups (safety-wise), since the only person able to decrypt your backups would be you (and not the server). So, anyone knows if this is possible, with Duplicity or another tool?

                                                   

                                                    • asb MOD May 9, 2014

                                                      @Dimitri: Storing the password in plaintext is basically unavoidable. What is the threat you are guarding against? The password being used here is for the key doing the encryption on the server being backed up. If someone has gained access to the server that has the password, they already have access to all of the files unencrypted. You can asymmetrically encrypt the backup to a different public key by passing a different key id to the "--encrypt-key" flag. Then someone who has compromised the key on the server wouldn't be able to access the backup files. Though it seems like you might have bigger problems at that point! You could cache the password with gpg-agent, but that doesn't survive a reboot so it makes it hard to script an automatic backup.

                                                       

                                                        • michaudg June 17, 2014

                                                          Thanks for this useful article. There is a little mistake at the beginning of Automated backups section : "chmod 700 ./passphrase" should be "chmod 700 ./.passphrase".

                                                           

                                                            • asb MOD June 17, 2014

                                                              @michaudg: Thanks for catching that! Updated.

                                                               

                                                                • javier762655 August 8, 2014

                                                                  Hi Andrew,

                                                                  If someone were to take control of your server (as root, for example), it would be easy for them to get into your backup server, as there is now no password needed to SSH into it… I can't think of a solution for that.

                                                                  Have you got any ideas regarding this issue? Thanks a lot in advance.

                                                                  Great tutorial by the way.
                                                                  eNe

                                                                   

                                                                    • jellingwood MOD August 8, 2014

                                                                      Hi javier,

                                                                      One solution to this issue would be to set up a new, completely locked down user on your backup server. You could assign it ownership of the necessary directories, and files, but not let it do anything else other than that. This way, if your first server is compromised, the most that they could do is get to a very limited account on your backup server.

                                                                       

                                                                        • javier762655 August 8, 2014

                                                                          Hi Justin,

                                                                          But this user would still be the one with access to all the backups.

                                                                          I see the point of your idea but the cracker could delete your backed up files if he wants to, isn't it?

                                                                          Thanks indeed for your insights.
                                                                          eNe

                                                                           

                                                                            • kamaln7 MOD August 8, 2014

                                                                              @javier: You can generate your key (the Create GPG Keys section) on your local computer (please make sure you back it up safely on a flash drive or something similar).

                                                                              Once you've done that, you export the public key by running:

                                                                              gpg --armor --export your-key-id > your-key-id.asc
                                                                              

                                                                              (The key ID in the tutorial's example is 05AB3DF5).

                                                                              Copy the file to the droplet where Duplicity is going to run, and then import it:

                                                                              gpg --import your-key-id.asc
                                                                              

                                                                              That's it. All you have to do now is configure Duplicity to use that key.

                                                                              duplicity --encrypt-key your-key-id ...
                                                                              

                                                                              Keep in mind that you will not be able to restore your files using Duplicity unless you move the private key to your server temporarily and then delete it.

                                                                              EDIT: it looks like your original question was deleted. I hope this helps 🙂

                                                                               

                                                                                • phil772018 August 16, 2014

                                                                                  Kamal Nasser, thanks so much. You detail a method of exporting the public key so you can encrypt from a different PC than where you created the keypair. I think I kinda need it the other way around. If my PC gets stolen or destroyed (which is the reason I am backing up), then I would need a copy of the private key for decryption, right? So isn't it really, really important, and not really covered in the article, that one should back up the private key somewhere safe? Or is it possible to decrypt the files on another PC using just the GPG passphrase?

                                                                                  Thanks