24

I get the following error while trying to convert an object (string) column in pandas to Int32, an integer type that allows NA values.

df.column = df.column.astype('Int32')

TypeError: object cannot be converted to an IntegerDtype

I'm using pandas version: 0.25.3

4 Answers

42

It's a known bug, as explained here.

The workaround is to convert the column to float first and then to Int32.

Make sure you strip whitespace from the column before you do the conversion:

df.column = df.column.str.strip()

Then do the conversion:

df.column = df.column.astype('float')  # first convert to float before int
df.column = df.column.astype('Int32')

or simpler:

 df.column = df.column.astype('float').astype('Int32') # or Int64
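As a minimal sketch of the whole workaround, the snippet below builds a small example frame, strips whitespace, and converts via float. The frame contents and the `cleaned` name are made up for illustration; only the `str.strip()` and `astype` calls come from the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: integer strings with stray whitespace and one missing value.
df = pd.DataFrame({"column": [" 1", "2 ", np.nan, "4"]})

# Step 1: strip surrounding whitespace from the string values.
cleaned = df["column"].str.strip()

# Step 2: go through float first, then cast to the nullable Int32 dtype.
df["column"] = cleaned.astype("float").astype("Int32")

print(df["column"].dtype)
print(df["column"].isna().sum())
```

Note that the final cast only succeeds when every float is a whole number. A value such as "1.5" would likely trigger the "cannot safely cast non-equivalent float64" error mentioned in the comments.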

4 Comments

Shocked this bug still exists a year later.
Unfortunately, this bug still exists as of today. It annoyed me enough that I decided to submit a PR <github.com/pandas-dev/pandas/pull/43949>. Hopefully, that will get through soon, as all tests seem to be passing now.
The fix looks slated for 1.4
This solution raises another error for me: TypeError: cannot safely cast non-equivalent float64 to int64. The following worked fine instead: df.column = df.column.apply(lambda x: float(x)).apply(lambda x: int(x))
3

As of v0.24, you can use: df['col'] = df['col'].astype(pd.Int32Dtype())

Edit: I should have mentioned that this falls under the Nullable integer documentation. The docs specify other nullable integer types as well (i.e. Int64Dtype, Int8Dtype, UInt64Dtype, etc.)
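To illustrate the nullable integer types, here is a small sketch that casts a float Series containing NaN using both the string alias and the dtype object. The Series values and variable names are invented for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical float data with a missing value.
s = pd.Series([1.0, np.nan, 3.0])

# The string alias and the dtype object should give the same result.
as_alias = s.astype("Int32")
as_object = s.astype(pd.Int32Dtype())

# Other widths work the same way, e.g. Int8.
small = s.astype(pd.Int8Dtype())

print(as_alias.dtype, as_object.dtype, small.dtype)
```

Starting from floats avoids the string-to-Int32 problem discussed in the other answers, which is why the input here is numeric rather than object.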

2 Comments

Worked for me by first converting to float then using this
This doesn't work. You have to convert to float first; then you can convert to pd.Int32Dtype() or pd.Int64Dtype() or "Int32" or "Int64".
0

Personally, I use df = df.astype({i: type_dict[i] for i in header}, errors='ignore') to deal with this problem, where header and type_dict are my own variables: a list of column names and a dict mapping each name to its target dtype. Note that errors='ignore' suppresses exceptions: if a conversion fails, the original data is returned unchanged. Though it is very inelegant and can hide other critical bugs, it does work for converting np.nan, a string of an int like `100`, or an int like 100 to a pandas nullable integer. Hope this helps.

1 Comment

Haolin, your proposed method returns NameError: name 'header' is not defined
0

The best way to do this is as follows:

df['col'] = pd.to_numeric(df['col'], errors='coerce').astype(pd.Int32Dtype())

This first converts any value that cannot be parsed as a number to NaN, and then to NA.
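A short sketch of this coercion approach on made-up data follows. The `raw` and `converted` names and the sample values are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical input: valid integer strings, one unparseable string, one missing value.
raw = pd.Series(["1", "abc", np.nan, "5"])

# errors="coerce" turns anything unparseable into NaN instead of raising;
# the cast to Int32 then turns every NaN into pd.NA.
converted = pd.to_numeric(raw, errors="coerce").astype("Int32")

print(converted.isna().tolist())
print(converted.dropna().tolist())
```

Be aware that coercion silently discards bad values, so it may be worth counting the resulting NA entries before relying on the column.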
