A colleague of mine wrote a Python function for me to import data from files on a server into a PostgreSQL database.

  1. His "primary" Python function is written in a Python script on disk and uses psycopg2.
  2. This "primary" function is called from a wrapper plpythonu function, defined with Python 2 on the PG server.

Basically it uses "COPY" to load data into my existing table.

When I call the wrapper plpythonu function from my query window (pgAdmin), it finishes almost instantly (small test data) and loads the data correctly. But when I try to call it from within my other (plpgsql) function, it just hangs.

When looking at pg_stat_activity, the COPY command is displayed there with state = 'active' and waiting = 't'. Inside the python function, we tried set_isolation_level to autocommit, and we tried explicit commit as well. Also we tried a simple hardcoded INSERT instead of COPY, but it behaves the same way.
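To see which backend holds the lock that the stuck COPY/INSERT is waiting on, a query joining pg_stat_activity with pg_locks from a separate session can help. This is only a diagnostic sketch, not the poster's code; it assumes a pre-9.6 server, which still has the boolean pg_stat_activity.waiting column described above:

```python
# Diagnostic sketch: run from a SEPARATE session while the call hangs,
# to see which backend holds the lock the stuck statement is waiting on.
# Assumes PostgreSQL < 9.6 (pg_stat_activity.waiting still exists).

BLOCKERS_SQL = """
SELECT waiting.pid   AS waiting_pid,
       waiting.query AS waiting_query,
       holder.pid    AS holding_pid,
       holder.query  AS holding_query
FROM   pg_stat_activity AS waiting
JOIN   pg_locks AS wl ON wl.pid = waiting.pid AND NOT wl.granted
JOIN   pg_locks AS hl ON hl.granted
                     AND hl.pid <> wl.pid
                     AND hl.locktype = wl.locktype
                     AND hl.relation      IS NOT DISTINCT FROM wl.relation
                     AND hl.transactionid IS NOT DISTINCT FROM wl.transactionid
JOIN   pg_stat_activity AS holder ON holder.pid = hl.pid
WHERE  waiting.waiting;
"""

def find_blockers(cur):
    # cur: an open cursor on a separate connection (e.g. psycopg2)
    cur.execute(BLOCKERS_SQL)
    return cur.fetchall()
```

If the holding pid turns out to be the very session that called the plpgsql function, the two sessions are waiting on each other at the application level, which PostgreSQL's deadlock detector cannot see.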

Is there something we are not aware of, or forgetting? I tried searching for similar topics here and on Google, to no avail so far.

Edit - added code; some parts are left out for obvious reasons. First, the "primary" Python script:

import os
import subprocess
import psycopg2
import psycopg2.extensions

_DB_HOST = "..."  # left out
_DB_NAME = "..."  # left out
_DB_USER = "..."  # left out
_DB_PWD  = "..."  # left out

#-----------------------------------------------------------------------------------------------------------------------
def _copyFile(cur, filePath):
    # COPY was tried as well ("COPY ..."); for testing I use a simpler hardcoded INSERT
    cmd = "insert into etl.tmp_import_file(file_line) values('test insert from python ' || clock_timestamp());"
    # table etl.tmp_import_file exists, column file_line is varchar(100), and it works when the plpythonu function is called by itself
    cur.execute(cmd)

#-----------------------------------------------------------------------------------------------------------------------
def save_file(fileId):
    dbc = psycopg2.connect(database=_DB_NAME, user=_DB_USER, host=_DB_HOST, password=_DB_PWD)
    # dbc.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT) # didn't work either
    cur = dbc.cursor()

    filePath = None  # in the real code filePath is derived from fileId (left out)
    _copyFile(cur, filePath)
    dbc.commit()
    dbc.close()

#-----------------------------------------------------------------------------------------------------------------------
if __name__ == "__main__":
    import sys
    if len(sys.argv) < 2:
        print 'Usage: save_file.py <file_id>'
        sys.exit(1)
    else:
        save_file(sys.argv[1])
        sys.exit(0)

And this is the "wrapper" plpythonu function on the PG server:

CREATE OR REPLACE FUNCTION etl.save_file(file_id integer)
  RETURNS integer AS
$BODY$

import sys
sys.path.append("/path_to_the_python_script")

import test_save_file as c

try:
    c.save_file(file_id)
    return 0
except Exception:
    return -1

$BODY$
  LANGUAGE plpythonu VOLATILE
  COST 100;
ALTER FUNCTION etl.save_file(integer)
  OWNER TO postgres;
  • Show the relevant code. Commented Dec 1, 2015 at 13:03
  • Code samples added, sorry for the delay. Calling it directly, select etl.save_file(1);, works just fine. But when I use it inside another plpgsql function, it hangs, as described in the original post. Commented Dec 1, 2015 at 14:01
  • Since we had to find a way to make this work, we ended up creating a plpgsql function that takes the file path as a parameter and executes dynamic SQL with a COPY of that file. The Python function then calls this function instead of trying to COPY into the db directly. Although this seems like an unnecessary extra step, it worked for us, unlike the version posted above. Still, if anyone can point out what was wrong, it will help us in the future, and maybe other people as well. Commented Dec 1, 2015 at 16:16
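The workaround described in the last comment can be sketched roughly as follows. The helper name etl.copy_file and its body are assumptions reconstructed from the comment, not the actual code:

```python
# Sketch of the commenters' workaround: instead of psycopg2 issuing COPY
# itself, the Python side calls a server-side plpgsql helper that runs
# COPY via dynamic SQL. The helper (hypothetical name etl.copy_file)
# would look roughly like:
#
#   CREATE OR REPLACE FUNCTION etl.copy_file(p_path text) RETURNS void AS $$
#   BEGIN
#       EXECUTE format('COPY etl.tmp_import_file(file_line) FROM %L', p_path);
#   END;
#   $$ LANGUAGE plpgsql;

def save_file_via_helper(dbc, file_path):
    # dbc: an open psycopg2 connection; file_path: path visible to the server
    cur = dbc.cursor()
    cur.execute("SELECT etl.copy_file(%s);", (file_path,))
    dbc.commit()
```

Note that this version runs the COPY in the helper's own session rather than opening a second connection from Python, which sidesteps any cross-session lock wait.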
