Working on a subset of a dataframe with column condition

Question

From a dataframe df, I want to update the column Points for the three rows with the smallest values of another column, Time (that is, the top 3 rows after sorting Time in ascending order), such that

df['Points'] = df['Points'] * 1.3 for the first row (smallest Time)

df['Points'] = df['Points'] * 1.2 for the second row (second smallest Time)

df['Points'] = df['Points'] * 1.1 for the third row (third smallest Time)

with each result rounded to the nearest integer, and Points for all other rows remaining the same.

I have to do this separately for every unique value of a third column, Challenge. How can I do this?

So, I need PointsA instead of Points from below -

Challenge      Team              Time              Points   PointsA 
   A             1    2019-11-05 23:00:43.07589     200       260
   B             3    2019-11-05 22:10:55.07589     100       130
   A             5    2019-11-05 23:05:43.07589     200       240
   A             7    2019-11-05 23:07:33.07589     200       220
   B            10    2019-11-05 22:20:13.07589     100       120
   C             4    2019-11-06 00:05:22.07589      50        65
   A             4    2019-11-05 23:18:23.07589     200       200

I've tried something like -

for challenge in df['Challenge'].unique():
     df[df['Challenge'] == challenge].sort_values('Time', ascending=True).head(1)['Points'] *= 1.3

but that doesn't seem to work.

Create minimal working code with example data so we can run it.
– sammywemmy, Commented Jan 16, 2020 at 4:57

Andy L. · Accepted Answer · 2020-01-16 06:28:20Z


Try this. Use value_counts and items to get each challenge and its row count, then use that count to slice the multiplier list so that it matches the number of rows being assigned for that challenge.

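import pandas as pd

val = [1.3, 1.2, 1.1]  # multipliers for the 1st, 2nd and 3rd smallest Time
df.Time = pd.to_datetime(df.Time)
df['Points'] = df['Points'].astype(float)  # scaled values are floats
# i is the row count of each challenge; val[:i] handles challenges with fewer than 3 rows
for challenge, i in df['Challenge'].value_counts().items():
    df.loc[df[df['Challenge'] == challenge].nsmallest(3, 'Time').index, 'Points'] *= val[:i]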

Out[201]:
  Challenge  Team                       Time  Points  PointsA
0         A     1 2019-11-05 23:00:43.075890   260.0       260
1         B     3 2019-11-05 22:10:55.075890   130.0       130
2         A     5 2019-11-05 23:05:43.075890   240.0       240
3         A     7 2019-11-05 23:07:33.075890   220.0       220
4         B    10 2019-11-05 22:20:13.075890   120.0       120
5         C     4 2019-11-06 00:05:22.075890    65.0        65
6         A     4 2019-11-05 23:18:23.075890   200.0       200

Challenge 'C' has only one row, and it is still calculated correctly: its Points go from 50 to 65.
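For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the same nsmallest idea once per group. It is not part of the original answer: the names multipliers, points and fastest are made up for illustration, and it assumes the result should go into a new PointsA column, rounded to integers as the question asks.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Challenge": ["A", "B", "A", "A", "B", "C", "A"],
    "Team": [1, 3, 5, 7, 10, 4, 4],
    "Time": pd.to_datetime([
        "2019-11-05 23:00:43.07589",
        "2019-11-05 22:10:55.07589",
        "2019-11-05 23:05:43.07589",
        "2019-11-05 23:07:33.07589",
        "2019-11-05 22:20:13.07589",
        "2019-11-06 00:05:22.07589",
        "2019-11-05 23:18:23.07589",
    ]),
    "Points": [200, 100, 200, 200, 100, 50, 200],
})

multipliers = [1.3, 1.2, 1.1]  # hypothetical name: bonus for 1st, 2nd, 3rd smallest Time

# Work on a float copy so the scaled values fit, then round at the end.
points = df["Points"].astype(float)
for challenge, group in df.groupby("Challenge"):
    # Index labels of up to three rows with the smallest Time, in ascending Time order.
    fastest = group.nsmallest(3, "Time").index
    # Slice the multipliers so groups with fewer than 3 rows still line up.
    bonus = pd.Series(multipliers[:len(fastest)], index=fastest)
    points.loc[fastest] = points.loc[fastest] * bonus

df["PointsA"] = points.round().astype(int)
print(df)
```

Writing to a separate Series and assigning it back with df["PointsA"] avoids modifying a temporary copy, which is likely why the chained head(1)['Points'] *= 1.3 attempt in the question had no effect on df.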

edited Jan 16, 2020 at 6:28

answered Jan 16, 2020 at 5:13


4 Comments

fmarm Over a year ago

Smart solution, way more elegant than mine, but you only update points for the top 3 in total and not by challenge. Is there a way to use the same kind of technique with a groupby?

harry04 Over a year ago

I thought this was more elegant too, and tried using it with for challenge in df['challenge'].unique(): but it doesn't give the desired result.

harry04 Over a year ago

@AndyL. how do I modify this to include challenges having only one row?

Andy L. Over a year ago

@harry04: I changed it to use value_counts and items and to slice the multipliers for each challenge. It should work for challenges having fewer than 3 rows.

fmarm · Accepted Answer · 2020-01-16 05:29:49Z


Here is a way to do it

import pandas as pd
import numpy as np

# compute rank by challenge (1 = smallest Time; method='first' breaks ties by row order)
df['rank_in_challenge'] = df.groupby('Challenge')['Time'].rank(method='first', ascending=True).astype('int')
# apply change in points
conditions = [df['rank_in_challenge'] == 1, df['rank_in_challenge'] == 2, df['rank_in_challenge'] == 3]
choices = [1.3, 1.2, 1.1]
# multiplier is 1.0 outside the top 3; round to get integer points as asked in the question
df["PointsA"] = (np.select(conditions, choices, default=1.0) * df['Points']).round().astype(int)
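The same per-challenge ranking can also be written without np.select. The following is only a sketch of one possible variant, not the answer's own code: it uses sort_values with groupby(...).cumcount() to get each row's 0-based position within its challenge, then maps positions to multipliers. The names position and bonus are hypothetical, and the data is the sample from the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Challenge": ["A", "B", "A", "A", "B", "C", "A"],
    "Team": [1, 3, 5, 7, 10, 4, 4],
    "Time": pd.to_datetime([
        "2019-11-05 23:00:43.07589",
        "2019-11-05 22:10:55.07589",
        "2019-11-05 23:05:43.07589",
        "2019-11-05 23:07:33.07589",
        "2019-11-05 22:20:13.07589",
        "2019-11-06 00:05:22.07589",
        "2019-11-05 23:18:23.07589",
    ]),
    "Points": [200, 100, 200, 200, 100, 50, 200],
})

# 0-based position of each row within its challenge, ordered by ascending Time.
# The result keeps the original index labels, so it aligns back onto df.
position = df.sort_values("Time").groupby("Challenge").cumcount()

# Positions 0, 1, 2 get a bonus; every other position maps to NaN and falls back to 1.0.
bonus = position.map({0: 1.3, 1: 1.2, 2: 1.1}).fillna(1.0)

# Index alignment pairs each multiplier with the right row despite the sort.
df["PointsA"] = (df["Points"] * bonus).round().astype(int)
print(df[["Challenge", "Team", "Points", "PointsA"]])
```

Because everything is aligned by index label, no explicit loop over challenges is needed, and challenges with fewer than three rows simply use fewer of the multipliers.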

edited Jan 16, 2020 at 5:29

answered Jan 16, 2020 at 5:13
