
I am trying to bulk insert pandas DataFrame data into PostgreSQL. The DataFrame has 35 columns and the PostgreSQL table has 45 columns. I am selecting 12 matching columns from the DataFrame and inserting them into the PostgreSQL table. For this I am using the following code snippet:

df = pd.read_excel(raw_file_path, sheet_name='Sheet1', usecols=col_names)  # col_names = list of desired columns (12 columns)
cols = ','.join(list(df.columns))
tuples = [tuple(x) for x in df.to_numpy()]
query = "INSERT INTO {0}.{1} ({2}) VALUES (%%s,%%s,%%s,%%s,%%s,%%s,%%s,%%s,%%s,%%s,%%s,%%s);".format(schema_name,table_name,cols)
curr = conn.cursor()
try:
    curr.executemany(query,tuples)
    conn.commit()
    curr.close()
except (Exception, psycopg2.DatabaseError) as error:
    print("Error: %s" % error)
    conn.rollback()
    curr.close()
    return 1
finally:
    if conn is not None:
        conn.close()
        print('Database connection closed.')

When I run this, I get the following error:

SyntaxError: syntax error at or near "%"
LINE 1: ...it,purchase_group,indenter_name,wbs_code) VALUES (%s,%s,%s,%...

Even if I use ? in place of %%s, I still get this error.

Can anybody throw some light on this?

P.S. I am using PostgreSQL version 10.

Comments
  • %s would take a string value from a Python variable. What do you want to do with those %s things? You already put variables in the string with {0} etc. Do you want to pass %s on, or put some value there? Commented Sep 13, 2020 at 3:29
  • @antont: I want to pass %s on, i.e. row-wise values as tuples, into the db. The objective is to bulk insert. Even if I use a query string like "INSERT INTO {0}.{1} ({2})".format(schema_name,table_name,cols) + "VALUES(?,?,...?)", I get the same error. Commented Sep 13, 2020 at 3:32
  • Well, put {3}, {4}, etc. if you want more parameters; I think it's better not to mix two syntaxes on one line. Commented Sep 13, 2020 at 3:54
  • Why are you doubling the %? Commented Sep 13, 2020 at 3:58
  • @parafit: I saw %%s used somewhere, hence I'm using it. Commented Sep 13, 2020 at 4:04

1 Answer


What you're doing now is actually inserting the dataframe one row at a time. Even if it worked, it would be an extremely slow operation. At the same time, interpolating the schema, table, and column names into the query string with .format() leaves you open to SQL injection if any of those values ever come from user input.
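For completeness, the immediate cause of the syntax error is the doubled placeholder: psycopg2 treats %% as an escaped literal %, so the query that reaches the server contains a bare %s, which PostgreSQL cannot parse. The placeholder should be a single %s; it's psycopg2 syntax, not Python string formatting, so .format() leaves it alone. (? is the sqlite3 paramstyle and won't work here at all.) Here is a minimal sketch of a corrected bulk insert, reusing the variable names from the question and using execute_values from psycopg2.extras to batch the rows, which is my own substitution rather than anything the question used:

from psycopg2.extras import execute_values

# A single %s placeholder: execute_values expands it into batched
# multi-row VALUES lists, so all tuples go in a handful of statements.
query = "INSERT INTO {0}.{1} ({2}) VALUES %s".format(schema_name, table_name, cols)

with conn.cursor() as curr:
    execute_values(curr, query, tuples)
conn.commit()

That said, the approach below avoids hand-writing the query entirely.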

I wouldn't reinvent the wheel. Pandas has a to_sql method that takes a dataframe and writes it to the database for you, and its if_exists parameter lets you specify what to do when the table already exists.

It works with SQLAlchemy, which has excellent support for PostgreSQL. And even though it might be a new package to explore and install, you're not required to use it anywhere else to make this work.

import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('postgresql://localhost:5432/mydatabase')

pd.read_excel(
    raw_file_path,
    sheet_name='Sheet1',
    usecols=col_names  # col_names = list of desired columns (12 columns)
).to_sql(
    schema=schema_name,
    name=table_name,
    con=engine,
    if_exists='append',  # the table already exists, so append rather than fail
    index=False,         # don't write the DataFrame index as an extra column
    method='multi'       # batch rows into multi-row INSERT statements
)
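Two parameters here matter for this use case: if_exists='append' is needed because the target table already exists (to_sql's default, 'fail', raises an error in that case), and index=False keeps the DataFrame's index from being written as an extra column. For very large frames you can also pass chunksize to cap how many rows go into each batched INSERT.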

Comments

  • This is good. However, I was looking for a more traditional and generic approach.
  • Can you explain what you mean by "traditional"? I think this is about as generic as it gets.
  • Where is the actual insertion happening?
  • Inside the function; it's handled by pandas. See the documentation I linked for an example: they call the function, then SELECT and see that the rows have been inserted.
  • Thanks Ruben! This worked. Just a bit curious: how do I close a connection when using create_engine? (see the note after these comments)
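On that last question: to_sql borrows connections from the engine's pool and returns them itself, so there is nothing to close per call. When you are completely done with the engine, SQLAlchemy's dispose() closes the pooled connections. A minimal sketch, assuming the engine from the answer above:

engine.dispose()  # close all connections held by the engine's pool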
