Backing Up A Local Postgres Database to AWS S3

Since I decided to move my Heroku-hosted Postgres database to my local machine, I wanted to make sure it gets backed up regularly in case something happens to my computer. To make this as simple as possible, I created a script that does this on demand with one command.

Resources

The following posts & pages were useful references while putting this together:

How to Backup PostgreSQL Database to Amazon S3 with Bash - a step-by-step that you could use on its own, but I made some changes
GitHub gist: Database backup to S3 with pg_dump and aws-cli - a more robust version of what I wanted, but with some unnecessary aspects
Heroku docs on import/export - useful because I’d already used part of this and it was reliable
pg_dump documentation - to understand what the command I’m using actually does
gzip documentation - to understand the compression command
aws-cli s3 documentation - a reference for syntax & storage classes

The bash Function

I saved the script as a bash function since I plan to run it manually. This could be set up as a cron job instead, but since I’m pretty irregular when it comes to adding data to this database, it’s simple enough for me to just back it up when I need to. With the function defined in my shell, running dbs3 performs the backup.

function dbs3 {
    today=`date +%Y-%m-%d.%H.%M`
    filename="db-backup-$today.dump"
    pg_dump -Fc --no-acl --no-owner -h localhost -U user dbname > $filename
    gzip --best $filename
    aws s3 mv --storage-class GLACIER $filename.gz "s3://backup-bucket"
}

To break it down…

`` today=`date +%Y-%m-%d.%H.%M` ``

Sets a variable for today’s date & time.

`filename="db-backup-$today.dump"`

Sets a variable for what the backup filename will be.
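
As a quick check of these two lines, here’s a sketch that builds the same filename and prints it. It uses $(...) instead of backticks, an equivalent form of command substitution that isn’t in the function above.

```shell
# Same timestamp format as in dbs3: year-month-day.hour.minute
today=$(date +%Y-%m-%d.%H.%M)
filename="db-backup-$today.dump"

# Prints something shaped like db-backup-YYYY-MM-DD.HH.MM.dump
echo "$filename"
```

Because the timestamp goes down to the minute, two backups taken in the same minute would end up with the same name.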

`pg_dump -Fc --no-acl --no-owner -h localhost -U user dbname > $filename`

pg_dump is a backup utility that comes installed with Postgres. -Fc writes the dump in pg_dump’s custom archive format, which is the most flexible input for pg_restore. --no-acl leaves access privileges (GRANT/REVOKE commands) out of the dump and --no-owner leaves out object ownership, so the data can be restored into a new database where my local database user/owner does not exist. -h localhost is the host the database server runs on. -U user is the database user to connect as, which needs access to this database (set up when the db was created). dbname is the name of the database to back up. > $filename is a shell redirect that writes pg_dump’s output to the filename set in the variables above.

This operation will create the backup file in whatever the present working directory is at the moment. You could change this if you want, but since I’m removing the local file later on, I don’t bother with this.
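
For context, a dump made this way is restored with pg_restore rather than psql. A minimal sketch, assuming the target database already exists and reusing the user placeholder from above; restore_dump is a hypothetical helper name, not part of my script:

```shell
# restore_dump DUMP_FILE TARGET_DB
# Hypothetical helper: loads a custom-format (-Fc) dump into an existing database.
restore_dump() {
    dump_file=$1
    target_db=$2
    # --no-acl / --no-owner mirror the dump options, so no privileges or
    # ownership from the old database are applied during the restore.
    pg_restore --no-acl --no-owner -h localhost -U user -d "$target_db" "$dump_file"
}
```

Usage would look like restore_dump db-backup-2024-01-15.09.30.dump dbname_restored, where both names are made up. A backup that went through the gzip step below needs gunzip first.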

`gzip --best $filename`

Compresses the backup file. This probably isn’t necessary, since the custom format selected by the -Fc option in pg_dump is already compressed by default, but I did notice a difference of a few KB between the zipped and not-zipped files, so I kept this step in.

The --best flag gives the highest compression at the cost of the slowest compression time…fine for these tiny files.
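
To see whether the extra gzip pass is worth it for a given dump, the sizes can be compared without touching the file. A small sketch; gzip_savings is a hypothetical helper name:

```shell
# gzip_savings FILE
# Prints the size of FILE before and after `gzip --best`, leaving FILE untouched.
gzip_savings() {
    file=$1
    # tr strips the padding some versions of wc put around the number
    before=$(wc -c < "$file" | tr -d ' ')
    # -c writes the compressed stream to stdout instead of replacing the file
    after=$(gzip --best -c "$file" | wc -c | tr -d ' ')
    echo "before: $before bytes, after: $after bytes"
}
```

Running gzip_savings on a freshly created .dump file shows how much the second round of compression actually saves.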

`aws s3 mv --storage-class GLACIER $filename.gz "s3://backup-bucket"`

I already have the aws-cli set up and authorized in my terminal, so I could skip a lot of steps in the tutorials above.

I chose to mv the file instead of cp so that the local backup file is removed once it has been uploaded, and I don’t have to take an extra step of deleting it afterwards. I also chose a cheaper storage class, accepting that Glacier retrievals are slow and cost extra, since I don’t really anticipate ever needing this 🤞

Note the added .gz to the $filename variable…this is the extension that gzip will add to the filename.
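
One caveat with the function as written: each command runs even if the previous one failed, so a failed pg_dump would still leave an empty or partial file that gets compressed and uploaded. A sketch of a more defensive variant, with the same commands and placeholders (user, dbname, backup-bucket); dbs3_safe is a hypothetical name:

```shell
# Defensive variant of dbs3: later steps only run if the earlier ones succeeded,
# and variables are quoted so unusual filenames can't split into several words.
dbs3_safe() {
    today=$(date +%Y-%m-%d.%H.%M)
    filename="db-backup-$today.dump"

    # "user", "dbname" and "backup-bucket" are the placeholders used in the post.
    if ! pg_dump -Fc --no-acl --no-owner -h localhost -U user dbname > "$filename"; then
        echo "pg_dump failed; nothing uploaded" >&2
        rm -f "$filename"   # drop the empty or partial dump
        return 1
    fi

    # Upload only if compression worked; mv removes the local .gz after upload.
    gzip --best "$filename" &&
        aws s3 mv --storage-class GLACIER "$filename.gz" "s3://backup-bucket"
}
```

The function’s exit status is that of the last step that ran, so something like dbs3_safe || echo "backup failed" would surface a problem.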

Wrap Up / What’s Missing

And that’s it! Job done. Now I just run that simple command every time I close down the app, which is also run locally from the command line. Easy peasy.

One thing that’s missing, and that I might add in the future, is having the script delete any backups older than n days at the same time as it uploads the new file. I don’t really need these to pile up indefinitely, so it makes sense.
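
A rough sketch of what that cleanup could look like, untested against a real bucket. It assumes the backups sit at the top level of backup-bucket and keep the db-backup-YYYY-MM-DD naming from above; old_backups and prune_backups are hypothetical names:

```shell
# old_backups CUTOFF
# Reads `aws s3 ls` lines (date, time, size, key) on stdin and prints the keys
# whose filename date is earlier than CUTOFF, given as YYYYMMDD.
old_backups() {
    cutoff=$1
    while read -r day time size key; do
        # db-backup-2024-01-15.09.30.dump.gz -> 20240115; other keys give ""
        stamp=$(echo "$key" | sed -n 's/^db-backup-\([0-9]\{4\}\)-\([0-9]\{2\}\)-\([0-9]\{2\}\)\..*$/\1\2\3/p')
        if [ -n "$stamp" ] && [ "$stamp" -lt "$cutoff" ]; then
            echo "$key"
        fi
    done
    return 0
}

# prune_backups DAYS
# Deletes backups older than DAYS days from the bucket.
prune_backups() {
    days=$1
    # BSD/macOS date syntax first, GNU date syntax as a fallback
    cutoff=$(date -v-"$days"d +%Y%m%d 2>/dev/null || date -d "$days days ago" +%Y%m%d)
    [ -n "$cutoff" ] || return 1
    aws s3 ls "s3://backup-bucket/" | old_backups "$cutoff" | while read -r key; do
        aws s3 rm "s3://backup-bucket/$key"
    done
}
```

Calling prune_backups 180 at the end of dbs3 would do the cleanup in the same run. An S3 lifecycle rule on the bucket could expire old objects without any scripting, and as far as I know the GLACIER class bills a minimum of 90 days per object, so very small values of n wouldn’t save anything.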

I could also theoretically automate this to back itself up at regular intervals so it’s not a manual process. I will implement this if I decide to stop using the command line to run the app, but for now it really doesn’t make a difference.