pandas to_sql() with NUMERIC data type

Question

I loaded a pandas DataFrame by reading a file and doing some pre-processing. It has a few numeric columns, such as:

            value
1     13654654328.4567895
2     NULL
3     54643215587.6875455

To avoid losing precision, I plan to store it as NUMERIC in SQL Server. Since I do not want pandas to convert my data to float, I load it as strings and then use df.to_sql() to insert it into SQL Server.

This works fine when there are no NULLs. However, when the column contains nulls, whether I use "" or np.nan for them, I get the error "Error converting data type nvarchar to numeric." It seems the null is automatically converted to an empty string, which cannot be cast to NUMERIC in SQL Server.

Is there any way to handle this problem? Ideally everything would be done in Python, with no additional SQL script.

Would simply replacing the NULL values with 0 instead of "" cause any issues? If not, do that and the import should go through. You could then update the 0 values back to NULL afterwards.
– Edward, Commented Sep 11, 2018 at 21:26
Because 0 already exists in the raw data, and in this situation 0 is different from null. Setting a 'flag' value might not be safe either, since the raw data could contain any number.
– MTANG, Commented Sep 11, 2018 at 21:28
pandas should be able to handle both NaN and None when you use to_sql() with a numeric type. Is it possible that your DataFrame stores the string "NULL"? That would explain the error message.
– Andrea Nagy, Commented Sep 11, 2018 at 21:35
Your other option then would be to replace the instances of NULL with something that won't appear in the actual data. Perhaps -1 would work.
– Edward, Commented Sep 12, 2018 at 16:25

Finrod Felagund · Accepted Answer · 2018-09-11 22:09:03Z

2

I have never used the .to_sql method, but I suppose you need to replace your NULL values with None. For example:

df.replace([np.nan], [None], inplace=True)

By the way, np.nan is a float. SQL NULL means "no value", and its Python equivalent is None. Both "" and "NULL" are treated as strings.
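As a rough sketch of that idea, the helper below maps every "missing" marker mentioned in this thread (np.nan, "", and the string "NULL") to None and leaves real values, including "0", untouched. The name normalize_null and the NULL_MARKERS set are hypothetical and not part of pandas; the sample values are illustrative only.

```python
import math

# Hypothetical set of strings that should be treated as SQL NULL (an assumption).
NULL_MARKERS = {"", "NULL"}


def normalize_null(value):
    """Return None for None, NaN, '' or 'NULL'; return any other value unchanged."""
    if value is None:
        return None
    # np.nan is an ordinary Python float, so this check covers it too.
    if isinstance(value, float) and math.isnan(value):
        return None
    if isinstance(value, str) and value.strip().upper() in NULL_MARKERS:
        return None
    return value


# Values are kept as strings so no precision is lost before the insert.
raw = ["13654654328.4567895", "NULL", "54643215587.6875455", "", float("nan"), "0"]
cleaned = [normalize_null(v) for v in raw]
```

On a DataFrame, the same function could presumably be applied to a string column with df["value"].map(normalize_null) before calling df.to_sql().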

edited Sep 11, 2018 at 22:09

answered Sep 11, 2018 at 21:28

Finrod Felagund


Alexander McFarlane · Accepted Answer · 2018-09-11 21:34:52Z

0

I thought I'd add more detail to complement the other answer.

As per PEP 249 -- Python Database API Specification v2.0

SQL NULL values are represented by the Python None singleton on input and output.

You are running into this problem because you are sending mixed types to the database. Replace every value that is meant to be blank with None.
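The PEP 249 rule can be seen with any DB-API driver. The sketch below uses the standard-library sqlite3 module as a stand-in for SQL Server, which is an assumption made only so the example runs anywhere; SQLite's NUMERIC column affinity is looser than SQL Server's NUMERIC type and accepts the empty string instead of raising a conversion error. The table name t and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, value NUMERIC)")

# Row 2 passes None (becomes SQL NULL); row 3 passes "" (remains a string).
rows = [(1, "13654654328.4567895"), (2, None), (3, "")]
conn.executemany("INSERT INTO t (id, value) VALUES (?, ?)", rows)

# Only the row inserted with None is a real NULL.
null_ids = [r[0] for r in conn.execute("SELECT id FROM t WHERE value IS NULL")]

# typeof() reports the storage class SQLite actually used for each value.
types = dict(conn.execute("SELECT id, typeof(value) FROM t"))
conn.close()
```

Here null_ids contains only id 2, while the empty string in row 3 is stored as text. On SQL Server that empty string is the kind of value that triggers the nvarchar-to-numeric conversion error described in the question.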


answered Sep 11, 2018 at 21:34

Alexander McFarlane
