
What is the best way to avoid this error?

DataError: invalid input syntax for integer: "669068424.0"
CONTEXT: COPY sequence_raw, line 2, column id: "669068424.0"

I created a table using pgAdmin, specifying the data type for each column. I then read the data in with pandas and do some processing. I could explicitly provide a list of columns and cast them with .astype(int), but is that necessary?

I understand that the .0 after the integers appears because there are NaNs in the data, so pandas stores the columns as floats instead of integers. What is the best way to work around this? I saw in the pre-release notes for pandas 0.19 that there is better handling of sparse data; does that cover this case by any chance?
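For reference, the coercion is easy to reproduce; a minimal sketch (not from the original post):

import pandas as pd

# An integer column with a missing value is upcast to float64,
# because NumPy integer dtypes have no NaN representation.
s = pd.Series([1, 2, None])
print(s.dtype)     # float64
print(s.tolist())  # [1.0, 2.0, nan]

# Note: .astype(int) is not a direct fix here; it raises a ValueError
# because NaN cannot be converted to an integer.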

# Assumed COPY template; the original post does not show to_sql.
to_sql = "COPY %s FROM STDIN WITH CSV HEADER"

def process_file(conn, table_name, file_object):
    # Use the engine passed in rather than the global pg_engine
    raw_conn = conn.raw_connection()
    cur = raw_conn.cursor()
    cur.copy_expert(sql=to_sql % table_name, file=file_object)
    raw_conn.commit()
    cur.close()


import pandas as pd

df = pd.read_sql_query(sql=query.format(**params), con=engine)
df.to_csv('../raw/temp_sequence.csv', index=False)
# Reopen the CSV for COPY; use a new name instead of rebinding df
csv_file = open('../raw/temp_sequence.csv')
process_file(conn=pg_engine, table_name='sequence_raw', file_object=csv_file)
  • So you have a table with a float column, but you want to export it to CSV as an int column? Is that what you're asking? Commented Sep 13, 2016 at 17:00
  • They are all ints (numbers of seconds). However, there are rows with NULLs, and pandas turns those columns into floats because it doesn't support NaN in integer columns. I need to fillna with 0 for the column to be recognized as an integer, which seems wasteful; I get about 2 million rows per day and many of the rows have blanks. Commented Sep 13, 2016 at 17:03
  • It's still quite unclear what your exact situation is. Let me see if I understand correctly: you created a table manually with an int column, but when you try to export it to a CSV you somehow get a float column back? Commented Sep 13, 2016 at 18:15
  • Yes, if an integer column has a blank in it, the column is converted to float64 (pandas.pydata.org/pandas-docs/stable/gotchas.html). I am trying to find the most efficient workaround. Do I fill the blanks with 0 and then explicitly convert to int? Do I change the columns in Postgres to numeric instead? Is there a better way? Commented Sep 13, 2016 at 18:20
  • I see; it's the round trip through CSV that mangles the data. Have you tried specifying the float_format argument for to_csv to remove the decimal places? Commented Sep 13, 2016 at 18:32

1 Answer


You can use the float_format parameter for to_csv to specify the format of the floats in the CSV:

df.to_csv('../raw/temp_sequence.csv', index=False, float_format="%d")
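With float_format="%d", non-null floats are written without the decimal part, while NaN cells fall back to na_rep (the empty string by default), which PostgreSQL's COPY ... CSV loads as NULL. A minimal round-trip sketch with made-up data:

import io
import pandas as pd

# Made-up data mirroring the question: an integer id column
# with a missing value, which pandas stores as float64.
df = pd.DataFrame({'id': [669068424, None, 42]})

buf = io.StringIO()
df.to_csv(buf, index=False, float_format="%d")
print(buf.getvalue())
# id
# 669068424
#            <- NaN written as na_rep (empty), loaded as NULL by COPY ... CSV
# 42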