8

I'd like my Python script to read some data out of a postgresql dump file. The Python will be running on a system without postgresql, and needs to process the data in a dump file.

It looks fairly straightforward to parse the CREATE TABLE calls to find the column names, then the INSERT INTO rows to build the contents. But I'm sure there would be quite a few gotchas in doing this reliably. Does anyone know of a module which will do this?

6
  • Why do you want to read a PG dump? What is your use case? Commented Oct 28, 2013 at 15:22
  • I've said my use case - I want to read some postgres data, I need to do it on a system that doesn't have postgres, but I do have access to a database dump. Commented Oct 28, 2013 at 15:40
  • providing the input file format would be helpful (I believe postgresql has several ways you can export the data). Any chance of getting the data via csv? Commented Oct 28, 2013 at 15:47
  • @cmd Interesting idea. Unfortunately the dump is provided to me by a third party. It's a pretty standard SQL affair though, I was hoping someone else had already written a parser for this. Commented Oct 28, 2013 at 16:26
  • 1
    beware that there are several formats available for a pg_dump command (pure SQL, SQL made for pgsql, compressed format, using inserts, or not, etc). Seems to me that writing a full SQL parser is a very huge work, seems really wrong. Commented Oct 28, 2013 at 17:51

2 Answers 2

6

So ... 7 years later, there is now a Python package written specifically for this:

pip install pgdumplib

From the PyPI page, pgdumplib is a "Python3 library for reading and writing pg_dump files using the custom format."

Sign up to request clarification or add additional context in comments.

2 Comments

this should rather be a comment and not an answer. Because it merely gives a suggestion and not an implementation.
Thanks, I shall have to try that and compare speed! Now ... is there one for mysql? ;)
1

Thanks for all the comments, even if they are mostly "don't do this!" ;)

Given:

  1. The dump is always produced in the same format from a 3rd-party system
  2. I need to be able to automate reading it on another 3rd-party system without postgres

I've gone for writing my own basic parser, which is doing a good enough job for what I require.

3 Comments

looking for the same thing. Would you mind sharing a link to your script?
@kthouz, sorry what I do have is very specific to my particular requirements and I'm not able to share it.
@xorsyst could you share at least the source where you got the informations from how to parse the .dump file into a file which is readable?

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.