Split into dataframes and plot for each in loop? [duplicate]

Asked 4 years, 3 months ago

Modified 4 years, 3 months ago

Viewed 174 times

This question already has answers here:

Pandas dataframe groupby plot (3 answers)

How to loop over grouped Pandas dataframe? (4 answers)

Closed 4 years ago.

The community reviewed whether to reopen this question 4 years ago and left it closed:

Duplicate: This question has been answered, is not unique, and doesn’t differentiate itself from another question.

I have a data frame with a few thousand rows. It looks something like this:

ID Amount Segment
1  23     A
2  43     B
3  65     A
4  23     A
5  86     C
6  54     B
7  432    B
8  987    A
9  43     C
10 46     C

First, I had to split the data by segment, which I did as follows:

df_A = df[(df['Segment'] == 'A')]
df_B = df[(df['Segment'] == 'B')]
df_C = df[(df['Segment'] == 'C')]

After that, I had to perform some operations on each subset, including groupby and other functions. For example:

df_A['days'] = (df_A['first'] - df_A['last']).dt.days
df_A_A = df_A[(df_A['days'] >= 0) & (df_A['days'] <= 30)]
A = df_A_A.groupby('days').user.nunique().reset_index()
A['user'] = A['user'].cumsum()

Here I create two further data frames for each subset (df_A_A and A above, and likewise for B and C), and the final one is what gets plotted.

Finally, I had to plot each segment:

plt.plot(A['x'], A['y'], color='red', label='A')
plt.plot(B['x'], B['y'], color='blue', label='B')
plt.plot(C['x'], C['y'], color='green', label='C')

Now the problem is that I may have any number of segments, so it would be easier to do all of this inside one loop. Is that possible? I want to write the code for only one segment and get the desired output for all segments. I tried to group by segment in a loop, but I am not sure how to adapt it so that it also creates the df_A_A and A data frames in the process.

I am trying this and getting an error:

for key, grp in df.groupby(['Segment']):




df_grp = df[(df['Segment'] == grp)]
df_grp['days'] = (df_grp['first'] - df_grp['last']).dt.days

df_grp_1 = df_grp[(df_grp['days'] >= 0) & (df_grp['days'] <= 30)]
grp = df_grp_1.groupby('days').user.nunique().reset_index()
grp['user'] = grp['user'].cumsum()


plt.plot(grp['days'], grp['conv'], color='key', label=key)
plt.legend()
plt.xlabel('days')
plt.ylabel('conv')
plt.show()

I am getting this error:

File "<ipython-input-6-7a4977f42a83>", line 10
    df_grp = df[(df['Segment'] == grp)]
         ^
IndentationError: expected an indented block

Thanks in advance!

edited Aug 24, 2021 at 17:12

asked Aug 24, 2021 at 3:11

Abhishek Singh


You can group by and then plot; see stackoverflow.com/questions/41494942/…
– Epsi95, Aug 24, 2021 at 3:14

@Epsi95 But I have other operations as well, not just the plot.
– Abhishek Singh, Aug 24, 2021 at 3:16

groupby is iterable, as demonstrated in the second answer: for key, grp in df.groupby('Segment'): The loop body would then be something like: plt.plot(grp['x'], grp['y'], label=key)
– Henry Ecker ♦, Aug 24, 2021 at 3:17

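As a small illustration of the comment above, the sketch below rebuilds the sample table from the question and checks that each group yielded by iterating over groupby matches the manually filtered subset. The name subsets is introduced here for illustration only. Note that in recent pandas versions, grouping by a one-element list such as ['Segment'] yields tuple keys like ('A',), so a plain column name is used instead.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "ID": range(1, 11),
    "Amount": [23, 43, 65, 23, 86, 54, 432, 987, 43, 46],
    "Segment": ["A", "B", "A", "A", "C", "B", "B", "A", "C", "C"],
})

# Iterating over a groupby yields (key, sub-frame) pairs, one per segment.
# 'subsets' is a hypothetical name; it replaces df_A, df_B, df_C.
subsets = {key: grp for key, grp in df.groupby("Segment")}

for key, grp in subsets.items():
    # Each group holds the same rows as the manual boolean filter.
    manual = df[df["Segment"] == key]
    assert grp.equals(manual)
    print(key, len(grp), grp["Amount"].sum())
```

Keeping the groups in a dictionary means the code does not need one named variable per segment, however many segments there are.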
@HenryEcker Yes, I understood this, but it does not solve my problem, since I am also doing some other operations. Maybe I will have to edit the question to include those details as well.
– Abhishek Singh, Aug 24, 2021 at 3:27

Yes. As your question currently reads, a loop over groupby addresses the issue. The loop produces a dataframe for each unique value in Segment, which is exactly the same as subsetting manually with an index selection. You can perform any and all dataframe operations needed on grp before plotting.
– Henry Ecker ♦, Aug 24, 2021 at 3:31
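One way this might look in practice is sketched below. It is an illustration under stated assumptions, not a confirmed answer: the question never shows the first, last, and user columns, so the data here is invented, and the helper cumulative_users and the output file name are hypothetical. The sketch also sidesteps three things in the attempted loop: the body must be indented under the for statement (the cause of the IndentationError), grp is already the subset so no second filter on Segment is needed, and color='key' passes the literal string 'key' rather than a valid color, so the color is left to matplotlib's default cycle.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical data: the 'first', 'last', and 'user' columns are not shown
# in the question, so these values are invented purely for illustration.
df = pd.DataFrame({
    "user": [1, 2, 3, 4, 5, 6, 7, 8],
    "Segment": ["A", "A", "A", "B", "B", "B", "C", "C"],
    "last": pd.to_datetime(["2021-01-01"] * 8),
    "first": pd.to_datetime([
        "2021-01-02", "2021-01-02", "2021-01-05", "2021-01-03",
        "2021-01-10", "2021-03-01", "2021-01-01", "2021-01-04",
    ]),
})


def cumulative_users(grp):
    """Mirror the question's per-segment steps for one group's rows."""
    grp = grp.copy()  # work on a copy to avoid SettingWithCopyWarning
    grp["days"] = (grp["first"] - grp["last"]).dt.days
    window = grp[grp["days"].between(0, 30)]  # inclusive on both ends
    out = window.groupby("days")["user"].nunique().reset_index()
    out["user"] = out["user"].cumsum()  # running total of per-day counts
    return out


results = {}  # one result frame per segment, keyed by segment name
fig, ax = plt.subplots()
for key, grp in df.groupby("Segment"):
    results[key] = cumulative_users(grp)
    ax.plot(results[key]["days"], results[key]["user"], label=key)

ax.set_xlabel("days")
ax.set_ylabel("cumulative users")
ax.legend()
# Hypothetical output file; call plt.show() instead in an interactive session.
fig.savefig("segments.png")
```

Putting the per-segment steps in one function means the logic is written once and applied to every segment, and results keeps each intermediate frame available after the loop.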
