Saving a pandas dataframe to separate jsons without NaNs

Question

I have a dataframe with some NaN values.

Here is a sample dataframe:

sample_df = pd.DataFrame([[1,np.nan,1],[2,2,np.nan], [np.nan, 3, 3], [4,4,4],[np.nan,np.nan,5], [6,np.nan,np.nan]])

It looks like:

     0    1    2
0  1.0  NaN  1.0
1  2.0  2.0  NaN
2  NaN  3.0  3.0
3  4.0  4.0  4.0
4  NaN  NaN  5.0
5  6.0  NaN  NaN

To get JSON, I then ran:

sample_df.to_json(orient = 'records')

Which gives:

'[{"0":1.0,"1":null,"2":1.0},{"0":2.0,"1":2.0,"2":null},{"0":null,"1":3.0,"2":3.0},{"0":4.0,"1":4.0,"2":4.0},{"0":null,"1":null,"2":5.0},{"0":6.0,"1":null,"2":null}]'

I want to save this dataframe to separate JSON files, with 2 rows in each file and none of the NaN values. Here is how I tried to do it:

df_dict = dict((n, sample_df.iloc[n:n+2, :]) for n in range(0, len(sample_df), 2))

for k, v in df_dict.items():
    print(k)
    print(v)
    for d in (v.to_dict('record')):
        for k,v in list(d.items()):
            if type(v)==float:
                if math.isnan(v):
                    del d[k]

json.dumps(df_dict)

Output I want:

'[{"0":1.0,"2":1.0},{"0":2.0,"1":2.0}]' -> in one .json file
'[{"1":3.0,"2":3.0},{"0":4.0,"1":4.0,"2":4.0}]' -> in second .json file
'[{"2":5.0},{"0":6.0}]' -> in third .json file

@cᴏʟᴅsᴘᴇᴇᴅ Added! Sorry for not giving enough detail.
– pr338, Commented Sep 13, 2017 at 23:59

cs95 · Accepted Answer · 2017-09-14 00:22:45Z

1

Use apply to drop the NaNs from each row, groupby to group the rows in pairs, and the grouped apply to convert each pair to JSON.

s = sample_df.apply(lambda x: x.dropna().to_dict(), 1)\
        .groupby(sample_df.index // 2)\
        .apply(lambda x: x.to_json(orient='records'))
s    

0            [{"0":1.0,"2":1.0},{"0":2.0,"1":2.0}]
1    [{"1":3.0,"2":3.0},{"0":4.0,"1":4.0,"2":4.0}]
2                            [{"2":5.0},{"0":6.0}]
dtype: object

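Finally, iterate over .values and save to separate JSON files. Each value is already a JSON string, so write it as-is; passing it through json.dump would encode it a second time and store a quoted, escaped string instead of an array.

for i, j_data in enumerate(s.values):
    # j_data is already serialized JSON text, so no json.dump is needed
    with open('File{}.json'.format(i + 1), 'w') as f:
        f.write(j_data)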
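For comparison, here is a minimal sketch of the same split done with plain Python lists, so that json.dump receives objects rather than text. The helper names rows_without_nan and write_chunks, and the temporary output directory, are assumptions made for illustration and are not part of the answer above.

```python
import json
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd


def rows_without_nan(df):
    """Return one dict per row, keeping only the non-NaN cells."""
    return [
        {str(col): val for col, val in row.dropna().items()}
        for _, row in df.iterrows()
    ]


def write_chunks(df, out_dir, rows_per_file=2):
    """Write consecutive groups of rows to File1.json, File2.json, ..."""
    records = rows_without_nan(df)
    paths = []
    for n, start in enumerate(range(0, len(records), rows_per_file), start=1):
        path = Path(out_dir) / "File{}.json".format(n)
        with open(path, "w") as f:
            # json.dump gets Python lists and dicts here, so the file
            # holds a JSON array rather than a quoted string.
            json.dump(records[start:start + rows_per_file], f)
        paths.append(path)
    return paths


sample_df = pd.DataFrame([[1, np.nan, 1], [2, 2, np.nan], [np.nan, 3, 3],
                          [4, 4, 4], [np.nan, np.nan, 5], [6, np.nan, np.nan]])

out_dir = tempfile.mkdtemp()  # assumed output location for this sketch
paths = write_chunks(sample_df, out_dir)
first = json.loads(paths[0].read_text())
```

Reading a file back with json.loads, as in the last line, is a quick way to check that it contains an array of row objects.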

answered Sep 14, 2017 at 0:22

4 Comments

pr338 Over a year ago

What if I changed the original dataframe's index to be a column with strings in the data and I wanted the same output? I get the error TypeError: cannot perform floordiv with this index type: <class 'pandas.core.indexes.base.Index'>.

cs95 Over a year ago

@pr338 Use np.arange(df.shape[0]) // 2
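
A rough sketch of how that suggestion might slot into the answer's chain when the index holds strings. The labels "a" to "f" and the name group_ids are made up for illustration; the grouping relies only on row position, not on the labels.

```python
import json

import numpy as np
import pandas as pd

sample_df = pd.DataFrame(
    [[1, np.nan, 1], [2, 2, np.nan], [np.nan, 3, 3],
     [4, 4, 4], [np.nan, np.nan, 5], [6, np.nan, np.nan]],
    index=list("abcdef"),  # hypothetical string labels
)

# Row positions 0..5 floor-divided by 2 give the keys 0, 0, 1, 1, 2, 2,
# whatever the index labels happen to be.
group_ids = np.arange(sample_df.shape[0]) // 2

s = (sample_df.apply(lambda x: x.dropna().to_dict(), axis=1)
              .groupby(group_ids)
              .apply(lambda x: x.to_json(orient='records')))

# Parse the first group's JSON text back into Python objects to inspect it.
first_group = json.loads(s.iloc[0])
```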

pr338 Over a year ago

Sorry, I wasn't clear. I meant the output with the index being a string like "indexhere" [{"fund.numeric.returnY3CategoryRank":0,"fund.... Going to edit original question with another example if this is still not clear.

cs95 Over a year ago

@pr338 Ah, sorry... things are getting jumbled up. Can you ask a new question?

G. Cohen · Answer · 2021-12-28 19:13:51Z

0

I suggest:

import json

with open("data.json", "w") as fpout:
    # One "index": {non-NaN columns} entry per row. Keys are quoted and
    # entries are comma-separated so the file is valid JSON.
    rows = [
        "\t" + json.dumps(str(idx)) + ": " + row.dropna().to_json(orient="index")
        for idx, row in sample_df.iterrows()
    ]
    fpout.write("{\n" + ",\n".join(rows) + "\n}\n")
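
A variant sketch of the same idea: build the index-keyed mapping as a Python dict and let the json module handle quoting and separators. The name by_index is an assumption for illustration, and index and column labels are converted to strings because JSON object keys must be strings.

```python
import json

import numpy as np
import pandas as pd

sample_df = pd.DataFrame([[1, np.nan, 1], [2, 2, np.nan], [np.nan, 3, 3],
                          [4, 4, 4], [np.nan, np.nan, 5], [6, np.nan, np.nan]])

# Map each index label to that row's non-NaN cells.
by_index = {
    str(idx): {str(col): val for col, val in row.dropna().items()}
    for idx, row in sample_df.iterrows()
}

# indent="\t" keeps one tab-indented level per nesting depth.
text = json.dumps(by_index, indent="\t")

# To save it, write `text` to a file opened in "w" mode, e.g. data.json.
```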

edited Dec 28, 2021 at 19:13

answered Dec 28, 2021 at 16:50
