0

Consider the following code in Python, using psycopg2 cursor object (Some column names were changed or omitted for clarity):

filename='data.csv'
file_columns=('id', 'node_id', 'segment_id', 'elevated', 
              'approximation', 'the_geom', 'azimuth')
self._cur.copy_from(file=open(filename),
                    table=self.new_table_name, columns=file_columns)
  • The database is located on a remote machine on a fast LAN.
  • Using \COPY from bash works very fast, even for large (~1,000,000 lines) files.

This code is ultra-fast for 5,000 lines, but when data.csv grows beyond 10,000 lines, the program freezes completely.

Any thoughts \ solutions?

Adam

2 Answers 2

5

This is just a workaround, but you can just pipe something into psql. I use this recipe sometimes when I am too lazy to bust out psycopg2

import subprocess
def psql_copy_from(filename, tablename, columns = None):
    """Warning, this does not properly quote things"""
    coltxt = ' (%s)' % ', '.join(columns) if columns else ''
    with open(filename) as f:
        subprocess.check_call([
            'psql',
            '-c', 'COPY %s%s FROM STDIN' % (tablename, coltxt),
            '--set=ON_ERROR_STOP=true', # to be safe
            # add your connection args here
        ], stdin=f)

As far as your locking up is concerned, are you using multiple threads or anything like that?

Is your postgres logging anything such as a closed connection or a deadlock? Can you see disk activity after it locks up?

Sign up to request clarification or add additional context in comments.

1 Comment

+1 Thanks, that's the solution I've used, with a slightly different syntax. I don't see any signs of deadlocks, and disk activity is normal (it's quite a busy server).
1

That's memory limitation which makes "copy_from" crashing as open(filename) returns all the file in one shot. It's a psycopg2's problem, not Postgresql one, so Mike's solution is the best one.

There is a solution if you want to use "copy_from" with regular commits and manage duplicate keys at the same time: https://stackoverflow.com/a/11059350/1431079

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.