I have the following dataframe with the columns "ID", "Month" and "Status". Status encodes churn: "Churn" = 1 and "Not Churn" = 2. I want to delete all rows for IDs that have already churned, keeping only the first churned row for each ID. For example:
Dataframe
ID Month Status
2310 201708 2
2310 201709 2
2310 201710 1
2310 201711 1
2310 201712 1
2310 201801 1
2311 201704 2
2311 201705 2
2311 201706 2
2311 201707 2
2311 201708 2
2311 201709 2
2311 201710 1
2311 201711 1
2311 201712 1
2312 201708 2
2312 201709 2
2312 201710 2
2312 201711 1
2312 201712 1
2312 201801 1
After deleting, I should have the following dataframe:
ID Month Status
2310 201708 2
2310 201709 2
2310 201710 1
2311 201704 2
2311 201705 2
2311 201706 2
2311 201707 2
2311 201708 2
2311 201709 2
2311 201710 1
2312 201708 2
2312 201709 2
2312 201710 2
2312 201711 1
I tried the following. First, find the minimum Month for each customer ID where Status is 1:
df1=df[df.Status==1].groupby('ID')['Month'].min()
Next, for each ID, I need to delete all rows with Status 1 whose Month is greater than that minimum.
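One possible way to finish this step is sketched below, reusing the per-ID minimum from the groupby above. It assumes Month is stored as an integer in YYYYMM form, so that numeric comparison matches chronological order. The names first_churn, cutoff and result are only illustrative.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID": [2310] * 6 + [2311] * 9 + [2312] * 6,
    "Month": [201708, 201709, 201710, 201711, 201712, 201801,
              201704, 201705, 201706, 201707, 201708, 201709,
              201710, 201711, 201712,
              201708, 201709, 201710, 201711, 201712, 201801],
    "Status": [2, 2, 1, 1, 1, 1,
               2, 2, 2, 2, 2, 2, 1, 1, 1,
               2, 2, 2, 1, 1, 1],
})

# First churn month per ID (the same groupby as in the attempt above)
first_churn = df[df.Status == 1].groupby("ID")["Month"].min()

# Align each row with its ID's first churn month.
# IDs that never churn get NaN here (assumption: those rows should all be kept).
cutoff = df["ID"].map(first_churn)

# Drop rows that have Status 1 and come after the first churn month.
# Comparisons against NaN are False, so never-churned IDs keep every row.
to_drop = (df["Status"] == 1) & (df["Month"] > cutoff)
result = df[~to_drop].reset_index(drop=True)

print(result)
```

On the sample data this keeps 3, 7 and 4 rows for IDs 2310, 2311 and 2312. Status 2 rows are never removed, even if one were to appear after a churn month.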
For 2311 with status 2, when it changes to 1 later on, shouldn't that get dropped?