Cannot convert column type from object to str in a Python DataFrame

Question

I downloaded a CSV file and read it into a pandas DataFrame. All 4 columns have type object, and I want to convert them to str.

The result of dtypes is as follows:

Name                      object
Position Title            object
Department                object
Employee Annual Salary    object
dtype: object

I tried to change the type as follows:

path['Employee Annual Salary'] = path['Employee Annual Salary'].astype(str)

but dtypes still returns object. I also tried to provide the column type when reading the CSV:

path = pd.read_csv("C:\\Users\\IBM_ADMIN\\Desktop\\ml-1m\\city-of-chicago-salaries.csv",dtype={'Employee Annual Salary':str})

or

path = pd.read_csv("C:\\Users\\IBM_ADMIN\\Desktop\\ml-1m\\city-of-chicago-salaries.csv",dtype=str)

but that still does not work. How can I change a column's type from object to str?

Possible duplicate of stackoverflow.com/questions/21018654/…
– meatballs, Commented Dec 14, 2016 at 13:47
That link was helpful. My next problem is: how do I remove the '$' from the Employee Annual Salary column and then convert it to float?
– tonyibm, Commented Dec 15, 2016 at 1:11
I found the reason replace failed. The correct way is: path['Employee Annual Salary'] = path['Employee Annual Salary'].str.replace('$',''). Previously I had not put .str in front of replace.
– tonyibm, Commented Dec 15, 2016 at 1:19
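The follow-up in the comments (strip the '$', then convert to float) could look roughly like the sketch below. The sample values are made up, since the real CSV is not shown, and pd.to_numeric is one possible way to do the conversion rather than the one the asker necessarily used.

```python
import pandas as pd

# Hypothetical sample values; the real CSV contents are not shown in the question.
path = pd.DataFrame({'Employee Annual Salary': ['$200000.00', '$81000.50']})

# regex=False treats '$' as a literal character rather than the regex end-of-string anchor.
stripped = path['Employee Annual Salary'].str.replace('$', '', regex=False)

# Parse the cleaned strings as numbers.
path['Employee Annual Salary'] = pd.to_numeric(stripped)

print(path.dtypes)
```

Passing regex=False explicitly avoids depending on the default, which has differed between pandas versions.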

Felix (edited by Dharman) · Accepted Answer · 2020-11-05 10:13:48Z

29

Actually you can set the type of a column to string. Use .astype('string') rather than .astype(str).

Sample Data Set

df = pd.DataFrame(data={'name': ['Bla',None,'Peter']})

By default, the column name has dtype object.

Single Column Solution

df.name = df.name.astype('string')

It's important to write .astype('string') rather than .astype(str), which didn't work for me: with .astype(str) the column stays as object.

Multi-Column Solution

df = df.astype(dtype={'name': 'string'})

This allows changing multiple columns at once.
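A small side-by-side sketch of the two spellings, using the sample data above. It assumes pandas 1.0 or later, where the 'string' alias exists (the comment below reports a TypeError on 0.25.3). The dtype printed for .astype(str) depends on the pandas version, so no output is shown for it.

```python
import pandas as pd

df = pd.DataFrame(data={'name': ['Bla', None, 'Peter']})

via_str = df['name'].astype(str)          # Python's builtin str
via_string = df['name'].astype('string')  # pandas' dedicated string dtype

print(via_str.dtype)
print(via_string.dtype)

# With the 'string' dtype the missing value stays missing.
print(via_string.isna().tolist())
```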

edited Nov 5, 2020 at 10:13 by Dharman

answered Nov 5, 2020 at 10:07 by Felix

2 Comments

kjsr7 Over a year ago

When I use .astype('string'), I get this error: TypeError: data type 'string' not understood (pandas version 0.25.3)

superjisan Over a year ago

This worked great. .astype('str') worked for me, but I had a slightly different problem.

meatballs · Accepted Answer · 2016-12-14 14:22:47Z

27

For strings, the column type is 'object' by default. There is no need for you to convert anything; it is already doing what you require.

The types come from numpy, which has a set of numeric data types. Anything else is an object.

You might want to read http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/02.01-Understanding-Data-Types.ipynb for a fuller explanation.
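A quick way to check this for yourself is to look at the elements rather than the column dtype. This is a minimal sketch with made-up values; the column name is borrowed from the question.

```python
import pandas as pd

# Made-up values; only the column name comes from the question.
df = pd.DataFrame({'Department': ['Zip', 'Zam']})

# The column-level dtype reported by pandas.
print(df['Department'].dtype)

# The Python type of each element stored in the column.
element_types = {type(v) for v in df['Department']}
print(element_types)

# String methods are available through the .str accessor without any conversion.
upper = df['Department'].str.upper()
print(upper.tolist())
```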

edited Dec 14, 2016 at 14:22

answered Dec 14, 2016 at 13:56 by meatballs

3 Comments

tonyibm Over a year ago

I am trying to remove the '$' from the Employee Annual Salary column. If I use replace directly, it does not work.

tonyibm Over a year ago

object is actually the dtype used for str, so there is no need to convert it to str.

Mathy Over a year ago

But then there may be an issue when trying to df.join ("ValueError: You are trying to merge on object and int64 columns.")
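For the merge error mentioned in the last comment, one possible workaround is to cast the key column so both sides share a dtype before merging. The frames and column names below are hypothetical.

```python
import pandas as pd

# Hypothetical frames: the key is stored as strings on one side and as integers on the other.
left = pd.DataFrame({'id': ['1', '2'], 'name': ['a', 'b']})
right = pd.DataFrame({'id': [1, 2], 'salary': [10.0, 20.0]})

# Cast the string key to int64 so both key columns have the same dtype.
left['id'] = left['id'].astype('int64')

merged = left.merge(right, on='id')
print(merged)
```

Casting the integer side to strings instead would also make the keys comparable; which direction is appropriate depends on the data.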

Abhijit · Accepted Answer · 2021-03-10 02:52:10Z

9

Use:

df = df.convert_dtypes()

It automatically converts each column to a suitable type, and it should work.
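A minimal sketch of what that call does, assuming pandas 1.0 or later (where convert_dtypes is available). The count column is added here only for illustration.

```python
import pandas as pd

# 'name' mirrors the sample data used in another answer; 'count' is a made-up column.
df = pd.DataFrame({'name': ['Bla', None, 'Peter'], 'count': [1, 2, 3]})

converted = df.convert_dtypes()

# Inspect the dtypes pandas chose for each column.
print(converted.dtypes)
```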

answered Mar 10, 2021 at 2:52 by Abhijit

1 Comment

Mathy Over a year ago

What a nice thing to know...

aquil.abdullah · Accepted Answer · 2016-12-14 13:56:50Z

2

I think astype worked; you just can't see the change by looking at dtypes. For example:

import pandas
data = [{'Name': 'Schmoe, Joe', 'Position Title': 'Dude', 'Department': 'Zip', 'Employee Annual Salary': 200000.00},
        {'Name': 'Schmoe, Jill', 'Position Title': 'Dudette', 'Department': 'Zam', 'Employee Annual Salary': 300000.00},
        {'Name': 'Schmoe, John', 'Position Title': 'The Man', 'Department': 'Piz', 'Employee Annual Salary': 100000.00},
        {'Name': 'Schmoe, Julie', 'Position Title': 'The Woman', 'Department': 'Maz', 'Employee Annual Salary': 150000.00}]
df = pandas.DataFrame.from_records(data, columns=['Name', 'Position Title', 'Department', 'Employee Annual Salary'] )

Now if I do dtypes on df I see:

In [32]: df.dtypes
Out[32]:
Name                       object
Position Title             object
Department                 object
Employee Annual Salary    float64
dtype: object

Now if I do,

In [33]: df.astype(str)['Employee Annual Salary'].map(lambda x:  type(x))
Out[33]:
0    <type 'str'>
1    <type 'str'>
2    <type 'str'>
3    <type 'str'>
Name: Employee Annual Salary, dtype: object

I see that all of my salary values are now strings, even though the dtype shows up as object.

So the bottom line is that I think that you are fine.

answered Dec 14, 2016 at 13:56 by aquil.abdullah

2 Comments

tonyibm Over a year ago

The Employee Annual Salary column contains '$' and I want to remove it. When I use replace, it does not work.

tonyibm Over a year ago

object is actually the dtype used for str, so there is no need to convert it to str using astype.

DataBach · Accepted Answer · 2020-02-01 15:16:50Z

0

I agree with the answers above: you do not need to convert object columns to string. However, if you ever need to convert many columns to another datatype (e.g. int), you can use the following code:

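# Collect the names of all columns whose dtype is object
object_columns_list = list(df.select_dtypes(include='object').columns)

# Cast each one; astype(int) raises ValueError if a value is not a valid integer
for object_column in object_columns_list:
    df[object_column] = df[object_column].astype(int)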
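If some values might not be valid numbers, a more forgiving variant of the same loop could use pd.to_numeric with errors='coerce'. This swaps in a different function from the one in the answer, and the data below is made up.

```python
import pandas as pd

# Made-up data; dtype=object is forced so the columns match the select_dtypes filter.
df = pd.DataFrame({'a': ['1', '2', 'x'], 'b': ['4', '5', '6']}, dtype=object)

for object_column in df.select_dtypes(include='object').columns:
    # Values that cannot be parsed become NaN instead of raising an error.
    df[object_column] = pd.to_numeric(df[object_column], errors='coerce')

print(df.dtypes)
```

A column that ends up containing NaN is stored as float, so this is not a drop-in replacement when integer columns are required.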

answered Feb 1, 2020 at 15:16 by DataBach
