44

I have a pandas DataFrame whose columns have different data types, and I want to convert more than one of them to string type. I have done this individually for each column, but is there a more efficient way?

So at present I am doing something like this:

repair['SCENARIO'] = repair['SCENARIO'].astype(str)

repair['SERVICE_TYPE'] = repair['SERVICE_TYPE'].astype(str)

I want a function that takes multiple columns and converts them to strings.

3 Answers

81

To convert multiple columns to strings, pass a list of column names to the command you are already using:

df[['one', 'two', 'three']] = df[['one', 'two', 'three']].astype(str)
# add as many column names as you like.

That means that one way to convert all columns is to construct the list of columns like this:

all_columns = list(df) # Creates list of all column headers
df[all_columns] = df[all_columns].astype(str)

Note that the latter can also be done directly with df = df.astype(str) (see comments).
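
For readers who want a reusable function, as the question asks, one possible sketch wraps the same idea in a helper. The name `columns_to_str` and the sample data are made up for illustration. It relies on `DataFrame.astype` accepting a mapping of column name to dtype, and it returns a new DataFrame rather than modifying the input.

```python
import pandas as pd


def columns_to_str(df, columns):
    """Return a copy of df with the given columns converted to strings.

    Hypothetical helper for illustration; not part of pandas.
    """
    # astype accepts a {column: dtype} mapping, so only the listed
    # columns are converted and the others keep their dtypes.
    return df.astype({col: str for col in columns})


# Illustrative data, loosely modelled on the question's column names.
repair = pd.DataFrame(
    {"SCENARIO": [1, 2], "SERVICE_TYPE": [10.5, 20.0], "COST": [100, 200]}
)
converted = columns_to_str(repair, ["SCENARIO", "SERVICE_TYPE"])
```

Because the mapping is built from whatever list is passed in, a single-element list such as `["COST"]` goes through the same path as a longer one.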


6 Comments

For all columns, how about df = df.astype(str) ?
Yes, also works, absolutely - I just posted this solution to stick with the concept of lists
Thanks sudonym... I was actually looking for something like a function that would take columns in a data frame and convert them to string. I should be able to change the column names as required though the first solution works perfectly and I did implement it.
Is there any performance difference between the two? I tried df = df.astype(str) on a DataFrame of shape (50000, 23000) and it crashed (in interactive mode). Thank you
Wondering why this doesn't work if the list of columns has a single element...
25

I know this is an old question, but I was looking for a way to turn all columns with an object dtype into strings as a workaround for a bug I discovered in rpy2. I'm working with large dataframes, so I didn't want to list each column explicitly. This seemed to work well for me, so I thought I'd share it in case it helps someone else.

stringcols = df.select_dtypes(include='object').columns
df[stringcols] = df[stringcols].fillna('').astype(str)

The fillna('') call replaces NaN entries with an empty string first, which prevents them from being converted to the string 'nan'.
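
A small self-contained sketch of the same two lines on sample data may make the effect easier to see. The frame and its column names are invented for illustration, and the text column is given an explicit object dtype so that `select_dtypes(include='object')` picks it up.

```python
import numpy as np
import pandas as pd

# Illustrative frame: one object column with a missing value, one numeric column.
df = pd.DataFrame(
    {
        "name": pd.Series(["a", np.nan, "c"], dtype=object),
        "count": [1, 2, 3],
    }
)

# Select only the object-dtype columns, fill missing values with an
# empty string, then convert what is left to str.
stringcols = df.select_dtypes(include="object").columns
df[stringcols] = df[stringcols].fillna("").astype(str)
```

The numeric column is not selected, so it keeps its original dtype and values.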


0

You can also use a list comprehension:

df = [df[col_name].astype(str) for col_name in df.columns]

You can also insert a condition to test if the columns should be converted - for example:

df = [df[col_name].astype(str) for col_name in df.columns if 'to_str' in col_name]

1 Comment

These solutions overwrite your pandas dataframe with a list
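
As the comment points out, a bare list comprehension yields a list of Series rather than a DataFrame. One way to keep the filtering idea and still end up with a DataFrame is to use the comprehension only to choose the column names and hand those to `astype`. This is a sketch with made-up column names, not the answerer's code.

```python
import pandas as pd

# Illustrative frame; only columns whose name contains "to_str" are converted.
df = pd.DataFrame(
    {"id_to_str": [1, 2], "code_to_str": [7, 8], "amount": [1.5, 2.5]}
)

# The comprehension selects column names only; astype does the conversion
# and returns a DataFrame, so df stays a DataFrame.
cols = [col_name for col_name in df.columns if "to_str" in col_name]
df = df.astype({col_name: str for col_name in cols})
```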
