1

I am working on extracting JSON array column in dataframe in Python using Pandas Library, where I have a data like this

>df

id      partnerid   payments

5263    org1244     [{"sNo": 1, "amount":"1000"}, {"sNo": 2, "amount":"500"}]
5264    org1245     [{"sNo": 1, "amount":"2000"}, {"sNo": 2, "amount":"600"}]
5265    org1246     [{"sNo": 1, "amount":"3000"}, {"sNo": 2, "amount":"700"}]

I want to extract the JSON data inside the list and add it as a column in same dataframe like this

>mod_df

id      partnerid   sNo amount

5263    org1244     1   1000
5263    org1244     2   500
5264    org1245     1   2000
5264    org1245     2   600
5265    org1246     1   3000
5265    org1246     2   700

I have tried with this approach

import pandas as pd
import json as j

df = pd.read_parquet('sample.parquet')

js_loads = df['payments'].apply(j.loads)
js_list = list(js_loads)
j_data = j.dumps(js_list)
df = df.join(pd.read_json(j_data))
df = df.drop(columns=['payments'] , axis=1)

But this works, only if we have JSON data in column not list of JSON. Can someone explain, how can I achieve my desired output?

1 Answer 1

1

Convert it to list by ast.literal_eval and use explode()to transform each element to a row and also replicate the other columns.

Then, use .apply(pd.Series) to convert dict-like to series.

Finally, concatenate to original dataframe using pd.concat().


Example:

import ast

# sample data
d = {'col1': [0, 1, 2], 'payments': ['[{"sNo": 1, "amount":"1000"}, {"sNo": 2, "amount":"500"}]', '[{"sNo": 1, "amount":"2000"}, {"sNo": 2, "amount":"600"}]', '[{"sNo": 1, "amount":"3000"}, {"sNo": 2, "amount":"700"}]']}
df = pd.DataFrame(data=d, index=[0, 1, 2])

df['payments'] = df['payments'].apply(ast.literal_eval)
df = df.explode('payments') 
out = pd.concat([df.drop(['payments'], axis=1), df['payments'].apply(pd.Series)], axis=1).reset_index(drop=True)

output:

  col1 sNo amount
0 0   1   1000
1 0   2   500
2 1   1   2000
3 1   2   600
4 2   1   3000
5 2   2   700
Sign up to request clarification or add additional context in comments.

2 Comments

ast.literal_eval is telling me malformed node or string: [{'val1': 2.5, 'val2': 0.0}]
if [{'val1': 2.5, 'val2': 0.0}] is the string representation of list, it should work.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.