I have a dataframe that looks like this:

  ID_0 ID_1  ID_2
0    a    b  0.05
1    a    b  0.10
2    a    b  0.19
3    a    c  0.25
4    a    c  0.40
5    a    c  0.65
6    a    c  0.71
7    d    c  0.95
8    d    c  1.00

I want to group by ID_0 and ID_1 and compute a normalized histogram of the ID_2 column for each group. So I do:

df.groupby(['ID_0', 'ID_1']).apply(lambda x: np.histogram(x['ID_2'], range=(0, 1), density=True)[0]).reset_index(name='ID_2')

However, what I would really like is for the 10 elements of each numpy array (one per bin, since np.histogram defaults to 10 bins) to be in separate columns of the dataframe.

How can I do this?

1 Answer

You can wrap each numpy array in a pandas Series. When the function passed to apply returns a Series, its elements are expanded into separate columns of the result:

import numpy as np
import pandas as pd

# Returning a Series from apply turns each histogram bin into its own column.
df.groupby(['ID_0', 'ID_1']).apply(lambda x: pd.Series(np.histogram(x['ID_2'], range=(0, 1), density=True)[0])).reset_index()
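For anyone who wants to try this end to end, here is a minimal self-contained sketch that rebuilds the dataframe from the question and applies the same idea. The helper name normalized_hist and the bin_0 ... bin_9 column labels are my own choices for illustration, not part of the original answer. Selecting [["ID_2"]] before apply is also an assumption on my part: it should keep the grouping columns out of the frame passed to the function, which newer pandas versions otherwise warn about.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID_0": ["a", "a", "a", "a", "a", "a", "a", "d", "d"],
    "ID_1": ["b", "b", "b", "c", "c", "c", "c", "c", "c"],
    "ID_2": [0.05, 0.10, 0.19, 0.25, 0.40, 0.65, 0.71, 0.95, 1.00],
})


def normalized_hist(group, bins=10, value_range=(0, 1)):
    """Return the density histogram of group['ID_2'] as a labelled Series."""
    # np.histogram returns (values, bin_edges); only the values are kept.
    density, _edges = np.histogram(
        group["ID_2"], bins=bins, range=value_range, density=True
    )
    # Hypothetical column labels: one per bin.
    labels = [f"bin_{i}" for i in range(len(density))]
    return pd.Series(density, index=labels)


# Each returned Series becomes one row; its index becomes the columns.
result = (
    df.groupby(["ID_0", "ID_1"])[["ID_2"]]
    .apply(normalized_hist)
    .reset_index()
)

print(result)
```

With density=True each row integrates to 1 over the range, so the bin values multiplied by the bin width (0.1 here) should sum to 1 for every group.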

1 Comment

That's clever. Thank you!
