I am trying to read a large table (10-15M rows) from a database into pandas dataframe and I'm using the following code:
def read_sql_tmpfile(query, db_engine):
with tempfile.TemporaryFile() as tmpfile:
copy_sql = "COPY ({query}) TO STDOUT WITH CSV {head}".format(
query=query, head="HEADER"
)
conn = db_engine.raw_connection()
cur = conn.cursor()
cur.copy_expert(copy_sql, tmpfile)
tmpfile.seek(0)
df = pandas.read_csv(tmpfile)
return df
I can use this if I have a simple query like this and I pass this into above func:
'''SELECT * from hourly_data'''
But what if I want to pass some variable into this query i.e.
'''SELECT * from hourly_data where starttime >= %s '''
Now where do I pass the parameter?