14

I am confused by type conversion in pandas.

df = pd.DataFrame({'a':['1.23', '0.123']})
type(df['a'])
df['a'].astype(float)

Here df['a'] is a pandas Series whose contents are two strings. I can apply astype(float) to this Series, and it correctly converts every string to a float. However,

df['a'][1].astype(float)

gives me AttributeError: 'str' object has no attribute 'astype'. My question is: how can that be? I can convert the whole Series from string to float, but I cannot convert a single entry of that Series from string to float?

Also, when I load my raw data set and run

df['id'].astype(int)

it raises ValueError: invalid literal for int() with base 10: ''. This seems to suggest that there is an empty string in df['id'], so I checked whether that is true by typing

'' in df['id']

It returns False, so I am very confused.

1
  • You have to use like this df['a'].iloc[1].astype(float), it will not throw error Commented Jul 27, 2018 at 4:54

4 Answers

14

df['a'] returns a Series object, which has astype as a vectorized way to convert all elements of the Series to another dtype.

df['a'][1] returns the content of one cell of the DataFrame, in this case the string '0.123'. That is a plain str object, which has no astype method. To convert it, use the regular Python built-in:

type(df['a'][1])
Out[25]: str

float(df['a'][1])
Out[26]: 0.123

type(float(df['a'][1]))
Out[27]: float
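The distinction above can be sketched side by side. This is a minimal illustration that reuses the question's DataFrame; the variable names `as_float`, `cell` and `cell_as_float` are only for this example.

```python
import pandas as pd

df = pd.DataFrame({'a': ['1.23', '0.123']})

# Series-level conversion: astype is a pandas method applied to every element.
as_float = df['a'].astype(float)

# Scalar-level conversion: a single cell is a plain Python str, so use float().
cell = df['a'].iloc[1]
cell_as_float = float(cell)

print(as_float.dtype)       # float64
print(type(cell).__name__)  # str
print(cell_as_float)        # 0.123
```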

As for your second question, the in operator ends up calling __contains__ on the Series with '' as the argument. Here is the docstring of that method:

help(pd.Series.__contains__)
Help on function __contains__ in module pandas.core.generic:

__contains__(self, key)
    True if the key is in the info axis

It means that the in operator looks for your empty string among the index labels, not among the values.

To find empty strings among the values, use the equality operator:

df
Out[54]: 
    a
0  42
1    

'' in df
Out[55]: False

df==''
Out[56]: 
       a
0  False
1   True

df[df['a']=='']
Out[57]: 
  a
1  
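Once the empty strings are located, the original int conversion still needs a way around them. One possible approach, sketched here with a hypothetical stand-in for the asker's 'id' column (the real data is not shown in the question), is to coerce unparseable entries to missing values with pd.to_numeric and then use the nullable 'Int64' dtype:

```python
import pandas as pd

# Hypothetical stand-in for the raw 'id' column; one entry is an empty string.
df = pd.DataFrame({'id': ['7', '', '12']})

# Element-wise comparison finds the offending rows (strip also catches
# entries that contain only whitespace).
empty_mask = df['id'].str.strip() == ''
print(empty_mask.sum())  # 1

# errors='coerce' turns unparseable entries into NaN instead of raising.
ids = pd.to_numeric(df['id'], errors='coerce')

# The nullable 'Int64' dtype keeps integers while allowing missing values.
ids = ids.astype('Int64')
print(ids)
```

Whether missing ids should be dropped, filled, or kept as missing depends on the data set, so this sketch only keeps them as missing.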

2 Comments

thanks! I have a short followup question. So in your example df, if I want to check whether number 42 is in the df, I should not use 42 in df or 42 in df['a'] or 42 in df[['a']] right? the in is checking the index of a pandas series? but what about df[['a']]? it is a pandas dataframe. So in when operating on a dataframe is still checking the index?
Same mechanics for a DataFrame: in checks the labels, not the values. So use df == 42 instead.
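For the DataFrame case raised in this follow-up, a small sketch may help. It assumes a numeric column holding 42 (the answer's example frame is not reproduced exactly), and `found` is just an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({'a': [42, 7]})

print('a' in df)   # True: 'a' is a column label
print(42 in df)    # False: 42 is not a column label

# Element-wise comparison, then reduce over columns and rows:
# is 42 anywhere in the frame?
found = (df == 42).any().any()
print(found)       # True
```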
2

df['a'][1] returns the actual value stored at position 1, which is in fact a string. You can convert it with float(df['a'][1]).

>>> df = pd.DataFrame({'a':['1.23', '0.123']})
>>> type(df['a'])
<class 'pandas.core.series.Series'>
>>> df['a'].astype(float)
0    1.230
1    0.123
Name: a, dtype: float64
>>> type(df['a'][1])
<type 'str'>

For the second question, you probably have an empty value in your raw data. The correct test would be:

>>> df = pd.DataFrame({'a':['1', '']})
>>> '' in df['a'].values
True

Source for the second question: https://stackoverflow.com/a/21320011/5335508
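To make the difference between testing the Series and testing its values concrete, here is a short sketch. It assumes the default integer index (labels 0 and 1), and adds Series.isin as one vectorized alternative.

```python
import pandas as pd

s = pd.Series(['1', ''])   # default index labels are 0 and 1

print(1 in s)              # True: 1 is an index label
print('' in s)             # False: '' is not an index label
print('' in s.values)      # True: checks the underlying array of values
print(s.isin(['']).any())  # True: vectorized membership test on the values
```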


1

In addition to the solutions already posted you could also simply use:

df['a'].astype(float)[1]
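A quick check of what this one-liner returns, as a sketch on the question's DataFrame. Note that it converts the whole column before picking one value, so float(df['a'][1]) does less work when only a single entry is needed.

```python
import pandas as pd

df = pd.DataFrame({'a': ['1.23', '0.123']})

# Converts the whole column first, then selects the entry with label 1.
value = df['a'].astype(float)[1]

print(type(value).__name__)      # float64
print(isinstance(value, float))  # True: numpy's float64 subclasses Python's float
```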

1 Comment

Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center.
-1
import numpy as np
import pandas as pd

data1 = {'age': [1, 1, 2, np.nan],
         'gender': ['m', 'f', 'm', np.nan],
         'salary': [2, 1, 2, np.nan]}

x = pd.DataFrame(data1)

# Inspect the Python type of the second entry in each column.
for col in x.columns:
    value = x[col].iloc[1]
    print(type(value))
    if isinstance(value, str):
        print("It is String")
    else:
        print('Not a String')

1 Comment

Posting code without any explanation isn't welcome here. Please edit your post.
