I have a pandas dataframe df whose elements are each a whole numpy array. For example, the entry at row label 6 of column 'x_grid':

>>> e = df.loc[6,'x_grid']
>>> print(e)

[-11.52616579 -11.48006112 -11.43395646 -11.3878518  -11.34174713
 -11.29564247 -11.24953781 -11.20343315 -11.15732848 -11.11122382
 -11.06511916 -11.01901449 ...

But I cannot use this as a numpy array as it is just given as a string:

>>> print(type(e))

<class 'str'>

How can I store a numpy array to a dataframe so it does not get converted to a string? Or convert this string back to a numpy array in a nice way?

  • It's worth noting that this DataFrame is loaded from a csv file, which is no doubt where the conversion to string happens. So I guess converting this string back to a numpy array would be the easier route. Commented Apr 4, 2019 at 15:50
  • Plus, there are no commas to separate the elements in your array. Commented Apr 4, 2019 at 15:52
  • Look at the source text file. This array is a quoted string, complete with []. Are there also ...? The original dataframe had these array items, and the only way to save such a df to a 2d csv format is to turn the complex items into strings. pandas used str(item). Where possible, avoid saving such dataframes as csv. Commented Apr 4, 2019 at 15:57
  • This has come up a number of times, e.g. stackoverflow.com/questions/51898099/…. literal_eval might have problems with your string because it is missing the commas that normally mark a list. Commented Apr 4, 2019 at 16:42

3 Answers

0

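If you just want to convert the string in each row into a list, the following will work, provided the values are separated by single spaces (a run of several spaces produces empty strings, which float() rejects):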

df['x_grid'].str[1:-1].str.split(" ").apply(lambda x: (list(map(float, x))))

# or for a numpy array
df['x_grid'].str[1:-1].str.split(" ").apply(lambda x: (np.array(list(map(float, x)))))

Hope that helps.


0

Thanks to Erfan and hpaulj for the suggestions that combined to answer this question.

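The solution is that, when setting an element of the dataframe, I first convert the numpy array x to a list, so it is stored comma separated rather than space separated. Using x.tolist() rather than list(x) yields plain Python floats, which keeps the saved text parseable on NumPy 2, where str(list(x)) prints each element as np.float64(...):

df = pd.concat([df, pd.DataFrame({'x_grid': [x.tolist()]})], ignore_index=True)  # DataFrame.append was removed in pandas 2.0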

Then after saving to a csv, and loading back in, I extract it back into a numpy array using np.array() and ast.literal_eval() (Note: requires import ast):

x = np.array(ast.literal_eval(df.loc[entry,'x_grid']))

This then returns a correct numpy array x.
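
As a minimal sketch of the whole round trip described above, the following writes to an in-memory buffer instead of a file on disk. The sample values in x and the use of io.StringIO are assumptions made for illustration only; they are not from the original answer.

```python
import ast
import io

import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the real x_grid values.
x = np.linspace(-11.5, -11.0, 6)

# Store the array as a plain Python list so the CSV cell reads "[a, b, c]".
df = pd.DataFrame({"x_grid": [x.tolist()]})

# Round-trip through an in-memory CSV rather than a file on disk.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)
loaded = pd.read_csv(buffer)

# Each cell comes back as a string; literal_eval rebuilds the list.
restored = np.array(ast.literal_eval(loaded.loc[0, "x_grid"]))
```

If every row needs converting, passing converters={"x_grid": ast.literal_eval} to pd.read_csv should apply the same parsing at load time, leaving only the np.array() call per cell.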


0

I want to extend Rafal's answer so that numpy does not throw an exception on the empty strings that x.split(' ') produces when values are separated by more than one space:

df['x_grid'].str[1:-1].apply(lambda x: list(filter(None,x.split(' ')))).apply(lambda x: np.array(x).astype(float))  # np.float was removed in NumPy 1.24; the builtin float is equivalent
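
The same idea can be sketched as a small helper, which may be easier to read and test than a chained lambda. The name parse_array_string and the sample string are hypothetical. The sketch relies on str.split() with no argument, which splits on any run of whitespace (including newlines) and never returns empty strings. It assumes the stored text is not truncated: if a cell contains a literal "...", the elided values are gone and float() will raise ValueError.

```python
import numpy as np


def parse_array_string(text):
    """Parse a string such as '[-1.5 -2.0\\n -3.25]' into a float array."""
    # Remove surrounding whitespace and the enclosing brackets.
    body = text.strip().strip("[]")
    # split() with no argument splits on any whitespace run, so repeated
    # spaces and line breaks do not produce empty tokens.
    return np.array([float(token) for token in body.split()], dtype=float)


# Hypothetical cell value with a double space and a line break.
e = "[-11.52616579 -11.48006112  -11.43395646\n -11.3878518 ]"
arr = parse_array_string(e)
```

Applied to the column, this would be df['x_grid'].apply(parse_array_string).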
