I currently have a prod and a test database that live on two separate Azure Postgres servers. I want to do a nightly backup of the prod database onto test, so that every morning the two are identical. My tables have constraints and keys, so I can't just copy the data itself; I also need the schemas, which means a simple pandas df.to_sql won't cover it.
My current plan is to run a nightly Azure Functions Python script that does the copying. I tried SQLAlchemy but had significant issues copying the metadata over correctly, so now I am trying to use Postgres's pg_dump and pg_restore/psql commands via subprocess, with the following code:
import subprocess

from sqlalchemy import MetaData

def backup_database(location, database, password, username, backup_file):
    # Use pg_dump to create a custom-format backup of the specified database
    cmd = [
        'pg_dump',
        '-Fc',
        '-f', backup_file,
        '-h', location,
        '-d', database,
        '-U', username,
        '-p', '5432',
        '-W',
    ]
    subprocess.run(cmd, check=True, input=password.encode())
def clear_database(engine, metadata):
    # Drop all tables in the database
    metadata.drop_all(bind=engine, checkfirst=False)
def restore_database(location, database, password, username, backup_file):
    # Use pg_restore command to restore the backup onto the database
    # cmd = ['pg_restore', '-Fc', '-d', engine.url.database, backup_file]
    cmd = [
        'pg_restore',
        '-Fc',
        '-C',
        '-f', backup_file,
        '-h', location,
        # '-d', database,
        '-U', username,
        '-p', '5432',
        '-W',
    ]
    try:
        subprocess.run(cmd, check=True, capture_output=True, text=True)
        print("Backup restored onto the test server.")
    except subprocess.CalledProcessError as e:
        print("Error occurred while restoring the backup:")
        print(e.stdout)  # Print the output from the command
        print(e.stderr)  # Print the error message, if available
# Define backup file path
backup_file = '/pathtofile/backup_file.dump' # Update with the desired backup file path
backup_file2 = 'backup_file.dump' # Update with the desired backup file path
# Backup the production database
backup_database(input_host, input_database, input_password, input_user, backup_file)
print("Backup of the production database created.")
# Create metadata object for test server
output_metadata = MetaData(bind=output_engine)
clear_database(output_engine, output_metadata)
print("Test server cleared.")
restore_database(output_host, output_database, output_password, output_user, backup_file2)
print("Backup restored onto the test server.")
This code appears to create the dump file, but it never successfully restores it to the test database. If I can get this code to work, how do I specify file paths within Azure Functions, and is this even a suitable job to run from Azure Functions? If not, how do I get SQLAlchemy to reliably clear the test data/metadata and then copy everything over from prod each night?
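Rereading the pg_restore docs, I suspect the restore call itself is at fault: for pg_restore, -f names an output file to write to (not the input dump), and with -d commented out it never connects to the test server at all, so presumably it just converts the archive to SQL and writes it back out to backup_file. My current best guess at a corrected call is the sketch below (assuming the target database already exists, that password auth works via the PGPASSWORD environment variable instead of the interactive -W prompt, and adding --no-owner/--no-privileges since the roles differ between my two servers) — is this the right direction?

import os
import subprocess

def restore_database(location, database, password, username, backup_file):
    # pg_restore reads the dump from a positional argument; -d names the
    # target database to connect to. --clean --if-exists drops existing
    # objects before recreating them, and --no-owner/--no-privileges skip
    # ownership and grant statements that would fail across servers.
    cmd = [
        'pg_restore',
        '--clean', '--if-exists',
        '--no-owner', '--no-privileges',
        '-h', location,
        '-d', database,
        '-U', username,
        '-p', '5432',
        backup_file,
    ]
    # Supply the password via the environment rather than -W, which forces
    # an interactive prompt that a subprocess cannot answer.
    env = dict(os.environ, PGPASSWORD=password)
    subprocess.run(cmd, check=True, capture_output=True, text=True, env=env)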
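For the file-path question, my understanding is that an Azure Functions instance's file system is read-only apart from a temp area, so something like tempfile.gettempdir() should give a writable location for the dump (a sketch, assuming the dump fits in the instance's temp storage):

import os
import tempfile

# The deployed function app is effectively read-only; the temp directory
# is the one reliably writable location for scratch files like the dump.
backup_file = os.path.join(tempfile.gettempdir(), 'backup_file.dump')

One thing I am unsure about is whether the pg_dump/pg_restore binaries are even available on the Python worker, or whether I would need to bundle them with the app.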
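On the SQLAlchemy side, I have since realized that a bare MetaData() knows about no tables until it is reflected, so my clear_database was likely a no-op; and even with metadata.reflect(bind=engine) it would only drop tables, not sequences or views. One alternative I am considering is resetting the whole schema instead (a sketch, assuming everything lives in public and my login role owns the schema):

from sqlalchemy import text

def clear_database(engine):
    # Dropping and recreating the schema removes tables, constraints,
    # sequences, and views in one statement each.
    with engine.begin() as conn:
        conn.execute(text('DROP SCHEMA public CASCADE'))
        conn.execute(text('CREATE SCHEMA public'))

After a reset like this, pg_restore could presumably run without --clean, since the target would already be empty.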


