What is the best way to avoid this error?
DataError: invalid input syntax for integer: "669068424.0" CONTEXT: COPY sequence_raw, line 2, column id: "669068424.0"
I created a table using pgadmin which specified the data type for each column. I then read the data in with pandas and do some processing. I could explicitly provide a list of columns and say that they are .astype(int), but is that necessary?
I understand that the reason that there is a .0 after the integers is because there are NaNs in the data so they are turned into floats instead of integers. What is the best way to work around this? I saw on the pre-release of pandas 0.19 that there is better handling of sparse data, is this covered by any chance?
def process_file(conn, table_name, file_object):
fake_conn = pg_engine.raw_connection()
fake_cur = fake_conn.cursor()
fake_cur.copy_expert(sql=to_sql % table_name, file=file_object)
fake_conn.commit()
fake_cur.close()
df = pd.read_sql_query(sql=query.format(**params), con=engine)
df.to_csv('../raw/temp_sequence.csv', index=False)
df = open('../raw/temp_sequence.csv')
process_file(conn=pg_engine, table_name='sequence_raw', file_object=df)
floatcolumn but you want to export it to csv as an int column? Is that what you're asking?intcolumn, but when you try to export it to a CSV you somehow get afloatcolumn back?float_formatargument forto_csvto remove the decimal places?