I want to periodically archive database dumps to my own AWS account in S3 and eventually in Glacier. Is there a way to dump a PostgreSQL database to the dyno's filesystem from within the dyno (from where I could then send the file to AWS)? psql and pg_dump don't seem to be available on the dyno, and I don't know how to run pgbackups from within a dyno.
-
I don't know the answer to your question, but if your ultimate goal is to get stuff on S3, check out PG Backups, which will do it automatically. – Jon Mountjoy, Jun 4, 2014 at 10:05
-
Yes, I know, but that will actually delete it from there after a few days, and I'd like to keep all dumps in Glacier. – huesforalice, Jun 4, 2014 at 10:20
-
PG Backups is great, but it does delete everything after a month. We also need to store regular backups to Glacier forever, so an answer to this question would be awesome. – B Robster, Jul 17, 2014 at 23:20
-
'scuse my ignorance, but can you use SSH to create a tunnel and do the backup over that? – Kirk Roybal, Jul 22, 2014 at 22:33
2 Answers
Create a separate Heroku app to do the backups: add the pgbackups-archive gem to it and set up Heroku Scheduler to run pgbackups-archive periodically against your DATABASE_URL (you will need to import that environment variable from your other app), as described here.
Disclaimer: this nominally requires you to use some Ruby, but it works in conjunction with any Heroku Cedar app using Heroku Postgres (including Django apps).
The best I could come up with for now is to use the pgbackups add-on (which I had been using anyway), then daily pull the latest backup from S3 and upload it back up to my own bucket. A PGBACKUPS_URL environment variable is exposed by Heroku if this add-on is enabled. The rest goes something like this:
# boto and requests are required; AWS access credentials and the bucket name are in the settings file
import requests
from boto.s3.connection import S3Connection

url = settings.PGBACKUPS_URL + "/latest_backup"
dumpfile = "./db_dump"

# get the latest backup's metadata from the pgbackups endpoint
r = requests.get(url)
dump_url = r.json()["public_url"]
dump_timestamp = r.json()["finished_at"].replace("/", "-")
dump_name = "db_dumps/" + dump_timestamp

# stream the dump into a file on the dyno's (ephemeral) filesystem
r = requests.get(dump_url, stream=True)
if r.status_code == 200:
    with open(dumpfile, 'wb') as f:
        for chunk in r.iter_content():
            f.write(chunk)

# upload the dump to our own S3 bucket
conn = S3Connection(settings.AWS_ACCESS_KEY_ID, settings.AWS_SECRET_ACCESS_KEY)
bucket = conn.get_bucket(settings.AWS_DB_DUMPS_BUCKET)
key = bucket.new_key(dump_name)
key.set_contents_from_filename(dumpfile)
I have yet to find out whether a backup can be triggered somehow via the PGBACKUPS_URL.
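As for the "eventually in Glacier" part of the question: the dumps don't have to be copied again by hand; S3 can transition them to Glacier on its own via a bucket lifecycle rule. Here is a minimal sketch using boto's lifecycle API, run once per bucket. The rule id and the 30-day delay are arbitrary examples, and the prefix matches the db_dumps/ keys written by the script above.

from boto.s3.connection import S3Connection
from boto.s3.lifecycle import Lifecycle, Transition

# connect with the same credentials the upload script uses
conn = S3Connection(settings.AWS_ACCESS_KEY_ID, settings.AWS_SECRET_ACCESS_KEY)
bucket = conn.get_bucket(settings.AWS_DB_DUMPS_BUCKET)

# move everything under db_dumps/ to Glacier 30 days after upload;
# S3 applies the rule automatically from then on
lifecycle = Lifecycle()
lifecycle.add_rule(id='archive-db-dumps', prefix='db_dumps/', status='Enabled',
                   transition=Transition(days=30, storage_class='GLACIER'))
bucket.configure_lifecycle(lifecycle)

Once the rule is in place, every new dump uploaded under db_dumps/ is archived to Glacier after the configured number of days, which should cover keeping all dumps around indefinitely.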