
I want to periodically archive database dumps to my own AWS account in S3 and eventually in Glacier. Is there a way to dump a PostgreSQL database to the dyno's filesystem from within the dyno (from where I can then send the file to AWS)? psql and pg_dump don't seem to be available on the dyno, and I don't know how to run pgbackups from within a dyno.

  • I don't know the answer to your question, but if your ultimate goal is to get stuff on S3, check out PG Backups which will do it automatically. Commented Jun 4, 2014 at 10:05
  • Yes, I know, but PG Backups actually deletes the dumps after a few days, and I'd like to keep all dumps in Glacier. Commented Jun 4, 2014 at 10:20
  • PGBackups is great, but it does delete everything after a month. We also need to store regular backups to Glacier forever, so an answer to this question would be awesome. Commented Jul 17, 2014 at 23:20
  • 'scuse my ignorance, but can you use SSH to create a tunnel and do the backup over that? Commented Jul 22, 2014 at 22:33

2 Answers


Create a separate Heroku app to do the backups, which uses the pgbackups-archive gem, and then set up Heroku Scheduler to run pgbackups-archive periodically against your DATABASE_URL (you will need to import that environment variable from your other app), as described here.

Disclaimer: this nominally requires you to use some Ruby, but it works in conjunction with any Heroku cedar app using Heroku Postgres (including Django apps).


2 Comments

Thanks for the suggestion, Ben. I was looking for a solution for Python / Django apps. I also received a message from Heroku support, where they claimed that they're working on a solution that lets you point pgbackups at your own S3 bucket, which would be great.
Yeah, we also use Python / Django for our app, but setting up another Heroku app just to do the backups, with the same DATABASE_URL as our app, using the method in the link, was easy and the best we could find.

The best I could come up with for now is using the pgbackups add-on (which I had been using anyway) and then daily pulling the latest backup from its S3 location and uploading it to my own bucket. Heroku exposes a PGBACKUPS_URL environment variable when this add-on is enabled. The rest goes something like this:

    # boto and requests are required; AWS access credentials are in the settings file
    import requests
    from boto.s3.connection import S3Connection

    url = settings.PGBACKUPS_URL + "/latest_backup"
    dumpfile = "./db_dump"

    # get metadata about the latest backup from the pgbackups endpoint
    r = requests.get(url)
    dump_url = r.json()["public_url"]
    dump_timestamp = r.json()["finished_at"].replace("/", "-")
    dump_name = "db_dumps/" + dump_timestamp

    # stream the dump to a file on the dyno's ephemeral filesystem
    r = requests.get(dump_url, stream=True)
    if r.status_code == 200:
        with open(dumpfile, 'wb') as f:
            for chunk in r.iter_content():
                f.write(chunk)

    # upload the dump to my own S3 bucket
    conn = S3Connection(settings.AWS_ACCESS_KEY_ID, settings.AWS_SECRET_ACCESS_KEY)
    bucket = conn.get_bucket(settings.AWS_DB_DUMPS_BUCKET)
    key = bucket.new_key(dump_name)
    key.set_contents_from_filename(dumpfile)

I have yet to find out whether a backup can be triggered somehow via the PGBACKUPS_URL.
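
Since the question also asks about eventually getting the dumps into Glacier, one way to handle that last step is an S3 lifecycle rule on the bucket. The following is only a minimal sketch, not part of the answer above: it assumes the same boto 2 setup and settings names as the code above, and the rule id ('archive-db-dumps') and the 30-day delay are arbitrary placeholders.

    # Sketch: add a lifecycle rule so S3 transitions objects under the
    # db_dumps/ prefix to Glacier automatically (boto 2 API).
    from boto.s3.connection import S3Connection
    from boto.s3.lifecycle import Lifecycle, Transition, Rule

    conn = S3Connection(settings.AWS_ACCESS_KEY_ID, settings.AWS_SECRET_ACCESS_KEY)
    bucket = conn.get_bucket(settings.AWS_DB_DUMPS_BUCKET)

    # move each dump to Glacier 30 days after upload (delay chosen arbitrarily)
    to_glacier = Transition(days=30, storage_class='GLACIER')
    rule = Rule(id='archive-db-dumps', prefix='db_dumps/', status='Enabled',
                transition=to_glacier)

    lifecycle = Lifecycle()
    lifecycle.append(rule)
    bucket.configure_lifecycle(lifecycle)

This only needs to run once per bucket; after that, S3 moves each new dump into Glacier on its own.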

