
I feel like this is a very basic question, but I cannot find a comprehensive answer. I am wondering what the best practice is for copying Python data types, such as lists or dictionaries, to a PostgreSQL database (using psycopg2).

Assuming I create or have a table that I want to fill up with data from a previous calculation, like

import random

data = [(n, random.randint(0, 100)) for n in range(0, 100)]

As far as I know, the standard way to copy the data would be something like

curs.executemany("""
  INSERT INTO my_table (id, rand_int)
  VALUES (%s, %s)""", data)

I am guessing this will loop through the list and be rather slow. Is there a smarter or more efficient way?

UPDATE:

In the meantime I found this answer, suggesting to use something like:

args_str = ','.join(cur.mogrify("(%s,%s)", x) for x in data)
cur.execute("INSERT INTO table VALUES " + args_str) 

And Craig suggests in his answer here to use copy_from. So my question has changed a little:

What is the most efficient method and how could it be implemented?

  • I'm voting to close this question as off-topic because this is a Python question and, as such, belongs on StackOverflow - recommend migration. Commented Jun 21, 2015 at 21:29
  • @Vérace While I agree that it is OT here, questions about improving a working solution are not overly welcome on SO. Commented Jun 21, 2015 at 21:56
  • I've only posted a few times on SO (mostly to give a bounty for a problem I was having). I'm surprised though - surely optimisation is a significant part of programming? AFAIC, questions about optimising poorly performing SQL are on-topic here. Commented Jun 21, 2015 at 21:59
  • Oh, I was under the impression that all database-related questions should go here (after seeing people complaining on SO). I will move the question to SO once I am back on my computer. Thanks anyway. Commented Jun 21, 2015 at 22:53

1 Answer


For big data sets, use COPY, via psycopg2's copy_from function.

http://initd.org/psycopg/docs/cursor.html#cursor.copy_from

See also how to speed up insertion performance in PostgreSQL.
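
A minimal sketch of what this could look like for the example data in the question, assuming a table my_table(id integer, rand_int integer) and a placeholder connection string (copy_from reads a file-like object in PostgreSQL's text COPY format, tab-separated by default):

import io
import random
import psycopg2

data = [(n, random.randint(0, 100)) for n in range(0, 100)]

# build an in-memory "file" with one tab-separated line per row
buf = io.StringIO()
for row in data:
    buf.write('\t'.join(str(v) for v in row) + '\n')
buf.seek(0)

conn = psycopg2.connect("dbname=test")  # placeholder connection string
with conn, conn.cursor() as curs:
    curs.copy_from(buf, 'my_table', columns=('id', 'rand_int'))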


4 Comments

Thank you! copy_from is actually the function that got me thinking in the first place. However, I am not sure how to make it work with Python data types. Would I need to pickle them first or something? May I ask you to give an example of how this could look, for instance based on my dummy case above?
You need to form a CSV row per tuple, where each field is a PostgreSQL-input-compatible representation of a value of that column's type. The Python csv module is very useful for creating the stream of rows. Most values are trivial to store - integers, floating point values, text, etc you just store as-is. Dates you just format as YYYY-MM-DD. Only a few things, like timestamps, take any real thought about how to represent them in the CSV input.
So something like data_csv = [str(i) for i in data] and then cur.copy_from(data_csv, 'my_table', columns=('id', 'rand_int'))? Would you mind if I edited your answer?
No, because data_csv is a list. You need a file-like object where each data row is a comma-separated string, terminated with a newline. Use the Python csv module. I don't have time to write a canned example right now, but I have the feeling I might've written one in the past; search around for examples using psycopg2's copy_from. A rough sketch along those lines follows below.
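
Following the approach described in the comments (one CSV row per tuple, written into a file-like object with the csv module), a rough sketch might look like this; the table name comes from the question's dummy example, the connection string is a placeholder, and copy_expert with CSV format handles quoting that plain copy_from's text format does not:

import csv
import io
import random
import psycopg2

data = [(n, random.randint(0, 100)) for n in range(0, 100)]

# write the rows as CSV into an in-memory file-like object
buf = io.StringIO()
csv.writer(buf).writerows(data)
buf.seek(0)

conn = psycopg2.connect("dbname=test")  # placeholder connection string
with conn, conn.cursor() as curs:
    curs.copy_expert("COPY my_table (id, rand_int) FROM STDIN WITH (FORMAT csv)", buf)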
