29 Jun

How to backup your server to Amazon S3

For the past few months I have been taking care of three Ubuntu servers running different services (svn, MySQL, websites, etc.), and I want to provide my customers with a flexible, secure and cheap backup system.

I started playing around with the s3cmd command-line utility and realized that it is really powerful, so I decided to write my own scripts to automatically back up all the important files on my servers to the cheap and easy-to-use Amazon S3 service.

The whole system is based on two bash scripts, the crontab and a few extra applications like 7zip and s3cmd. I will explain step by step how I built the system, and at the end I’ll provide the full code of the scripts.

Requirements

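The first step is to make sure that the 7zip and s3cmd tools are installed. I’m using Ubuntu, so for me the command to install them is the one below. Note that the scripts call /usr/bin/7z, which on Ubuntu comes from the p7zip-full package (the plain p7zip package only ships 7zr):

sudo apt-get install p7zip-full s3cmd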

Basic setup

Now we can continue with the basic setup. I created a folder called backups in the root of the filesystem, and inside it I created several subfolders. In the end the structure looks like this:

/backups
    /compressed
    /data
        /db
        /svn
        /www
    /scripts

The compressed folder is used at the end of the script to hold the single .7z file that bundles everything, the data folder receives the information we want to back up, and the scripts folder stores, as its name says, the two scripts we will use.
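The folder layout above can be created in one go. This is a minimal sketch; the create_backup_tree function and its optional argument are my own illustration and not part of the original scripts, which use /backups directly.

```shell
#!/bin/bash
# Creates the backup folder layout described above.
# The root defaults to /backups; passing another path is only for testing.
create_backup_tree() {
    local root="${1:-/backups}"
    mkdir -p "$root/compressed" \
             "$root/data/db" \
             "$root/data/svn" \
             "$root/data/www" \
             "$root/scripts"
}
```

Calling create_backup_tree without arguments targets /backups, which normally requires root privileges (for example from a root shell or a script run with sudo).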

Backup script

Now let’s have a look at those scripts. The first one performs the backup; in my case I’m backing up the Subversion repositories, the MySQL databases and the files under /var/www.

The script starts with the following block:

echo `date '+%F %T'`: Starting the backup

#Dumping the repos
echo `date '+%F %T'`: Starting the dump of the repos
for i in /usr/local/svn/*/; do
    repo=`basename $i`
    echo `date '+%F %T'`: Dumping repo $repo
    /usr/bin/svnadmin dump /usr/local/svn/$repo > /backups/data/svn/$repo.dump
    echo `date '+%F %T'`: Repo $repo dumped
done

We assume that the svn repositories are located in /usr/local/svn. Here we go through each folder inside /usr/local/svn/ and call svnadmin dump to dump that repository into a single file, which is stored in /backups/data/svn/.

#Dumping the DB
echo `date '+%F %T'`: Starting the dump of the DataBases
for i in /var/lib/mysql/*/; do
    db=`basename $i`
    echo `date '+%F %T'`: Dumping DB $db
    /usr/bin/mysqldump -uroot -p$db_password $db > /backups/data/db/$db.sql
    echo `date '+%F %T'`: Database $db dumped
done

In this second block we do more or less the same thing: we go through each folder in /var/lib/mysql/ and call mysqldump to export each database to a file stored in /backups/data/db/.

#Copy all the websites
echo `date '+%F %T'`: Starting the dump of websites
for i in /var/www/*/; do
    site=`basename $i`
    echo `date '+%F %T'`: Dumping site $site
    /usr/bin/7z a -mx6 -t7z /backups/data/www/$site.7z -p$compression_password /var/www/$site
    echo `date '+%F %T'`: Site $site dumped
done

In the third block we dump the websites. I keep the websites in the /var/www folder, and each site lives in a folder named after its domain. Again we do the same: we loop through /var/www/ and compress each site individually into a .7z file with a compression level of 6, protected with a password.
I store each site in a separate file because, if I have to restore a site, I don’t need to uncompress all the sites, just the main file and then the compressed site.
I tried different compression levels, and 6 gave me the best balance between compression ratio and time spent compressing.
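Restoring a single site could look roughly like the sketch below. The restore_site function, the SEVENZIP variable and the sites output folder are hypothetical names of mine. It also assumes that the paths inside the daily archive look like www/SITE.7z; it is worth confirming that with 7z l on one of your own archives before relying on it. The two password variables are the ones used by the backup script.

```shell
#!/bin/bash
# SEVENZIP is an illustrative override so the logic can be tried with a stub.
SEVENZIP="${SEVENZIP:-/usr/bin/7z}"

# Restores one site from a daily archive that has already been downloaded.
#   $1 = daily archive, $2 = site folder name, $3 = working directory
restore_site() {
    local archive="$1" site="$2" workdir="$3"
    # Step 1: extract only that site's inner archive from the daily file.
    "$SEVENZIP" x "$archive" "www/$site.7z" "-o$workdir" "-p$password" || return 1
    # Step 2: unpack the site itself into a separate folder.
    "$SEVENZIP" x "$workdir/www/$site.7z" "-o$workdir/sites" "-p$compression_password"
}
```

The daily archive itself would first be fetched from the bucket with s3cmd get, for example into /tmp, and then passed to restore_site as the first argument.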

#Compressing all the data
echo `date '+%F %T'`: Compressing the info
filename=$(date +%Y%m%d)
/usr/bin/7z a -mx6 -t7z /backups/compressed/$filename.7z -p$password /backups/data/*
echo `date '+%F %T'`: Info compressed

Now that all the information we want to back up is in place, we compress everything into one single file. That’s the purpose of this block; the final filename is today’s date in YYYYMMDD format, generated with the date command.

#Upload to Amazon S3
echo `date '+%F %T'`: Uploading to Amazon S3
/usr/bin/s3cmd put --no-progress /backups/compressed/$filename.7z s3://BUCKET_NAME/$filename.7z
echo `date '+%F %T'`: Upload completed

OK, we have the compressed file, and one of the final steps is to upload it to Amazon S3. For that task we use the s3cmd tool, passing the --no-progress parameter to avoid the “interactive” output of the upload status.

#Delete the local backups
echo `date '+%F %T'`: Cleaning up
rm -Rf /backups/data/svn/*
rm -Rf /backups/data/db/*
rm -Rf /backups/data/www/*
rm -Rf /backups/compressed/*
echo `date '+%F %T'`: Clean completed
echo `date '+%F %T'`: Backup completed

The final block of code is just a cleanup: after the upload to Amazon we delete all the files we have generated inside the /backups folder.

OK, it seems complex but it is pretty straightforward. If you take a look you’ll also see that every message contains a timestamp; with that information we can determine how much time we spend on each task.
One disadvantage of this script is that we do not check whether the file was uploaded correctly to Amazon; maybe this will be a future improvement.
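One possible way to add that check is to look at the exit status of s3cmd before cleaning up. The sketch below is not part of the original scripts: upload_backup and the S3CMD variable are names I made up, and it assumes that your s3cmd version returns a non-zero exit status when the upload fails, which is worth verifying for the version you have installed.

```shell
#!/bin/bash
# S3CMD can be overridden (for example with "true" or "false") to try the
# logic without touching S3; it is a helper variable, not from the article.
S3CMD="${S3CMD:-/usr/bin/s3cmd}"

# Uploads one file and reports success or failure through its exit status.
upload_backup() {
    local archive="$1" target="$2"
    echo "$(date '+%F %T'): Uploading to Amazon S3"
    if "$S3CMD" put --no-progress "$archive" "$target"; then
        echo "$(date '+%F %T'): Upload completed"
        return 0
    fi
    echo "$(date '+%F %T'): Upload FAILED, keeping local files" >&2
    return 1
}

# Possible use in the backup script: only clean up after a good upload.
# if upload_backup "/backups/compressed/$filename.7z" "s3://BUCKET_NAME/$filename.7z"; then
#     rm -Rf /backups/compressed/*
# fi
```

With this approach a failed upload leaves the local archive in place, so the next run (or a manual retry) still has something to send.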

Maintenance script

Now let’s take a look at the other script. Its purpose is just to delete old backups based on a timestamp. In my case we keep the backups for one month, which is very affordable at Amazon’s prices.

for filename in `s3cmd ls s3://$bucket`; do
    if [[ $filename =~ ([0-9]+)\.7z ]]; then
        timestamp=${BASH_REMATCH[1]}
        echo `date '+%F %T'` - Reading metadata of: $filename
        echo -e "\tFilename: $filename"
        echo -e "\tTimestamp: $timestamp"
        if [[ $timestamp -le $limit ]]; then
            let "total=total+1"
            echo -e "\tResult: Backup deleted\n"
            /usr/bin/s3cmd del $filename
        else
            echo -e "\tResult: Backup kept\n"
        fi
    fi
done

This is the only code block in the maintenance script. The idea is to retrieve the list of files in the bucket with s3cmd and then check whether each word of the output matches the pattern ([0-9]+)\.7z (because s3cmd ls prints more information than just the filenames). The pattern requires at least one digit; otherwise a .7z file without a numeric name would produce an empty timestamp, which the numeric comparison treats as 0, and the file would be deleted.
If a word matches the regex, we take the filename without the extension (the timestamp of the backup) and check whether it is older than one month (the timestamp of one month ago is stored in the $limit variable). If the backup is older, we remove it by calling s3cmd del.

If you need to keep the files for longer, just change the $limit variable and that’s it.
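The snippet above does not show how $limit gets its value. One way to compute it, assuming GNU date (the default on Ubuntu), is sketched here; the is_expired helper is only an illustration of the comparison and is not part of the original script.

```shell
#!/bin/bash
# Date of one month ago in the same YYYYMMDD format as the backup names.
# The -d option is specific to GNU date.
limit=$(date -d '1 month ago' +%Y%m%d)

# Succeeds when the given YYYYMMDD timestamp is at or before the limit.
is_expired() {
    [ "$1" -le "$limit" ]
}
```

To keep backups for a different period, change the date expression, for example '3 months ago' or '14 days ago'.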

Wow, it seems like a long process. OK, now let’s have a look at the last step needed to make this work: the cron jobs.
I created two cron jobs and redirected their error output to stdout, because I want to receive an email with the full output after each backup. My crontab looks like this:

# m h  dom mon dow   command
MAILTO=YOUR_EMAIL_ADDRESS_HERE
00 22 * * * /backups/scripts/s3backup 2>&1
00 23 * * * /backups/scripts/s3backup-maintenance 2>&1

Just set the times at which you want the cron jobs to run, and don’t forget to put 2>&1 at the end.

Conclusion

With this system you’ll have your backups in a safe and cheap place, without headaches and with a full report in your email every time the scripts run. These scripts can certainly be improved with a few checks, and you can extend them to fit your needs.
I hope you have enjoyed this HowTo and will start using these scripts! Improvements are welcome, so don’t hesitate to comment!

Source code

Grab the full source code of the backup script and the maintenance script.

5 Responses and Counting...

  • Tweets that mention How to backup your server to Amazon S3 | Segmentation Fault! — Topsy.com

    June 29th 2010

    [...] This post was mentioned on Twitter by S. C., Christopher Valles. Christopher Valles said: A new HowTo in SegmentationFault, how to backup your server to Amazon S3 using p7zip, s3cmd, etc. With full code! http://bit.ly/bfFGJ8 [...]

  • Jay G.

    Interesting backup method. Did you look into other S3-based backup services, like tarsnap?

    http://www.tarsnap.com/

  • Christopher Vallés

    Wow, seems pretty powerful! Thanks for the info! I didn’t try any other system; I started these scripts more as a test and in the end I stuck with them. For me it’s working really nicely because I have full control over the whole process.

    I appreciate your comment and feedback ;)
