2

I have a pandas dataframe on which I wish to apply my own custom rolling function as follows:

def testms(x, field):
    mu = np.sum(x[field])
    si = np.sum(x[field])/len(x[field])
    x['mu'] = mu
    x['si'] = si
    return x

df2 = pd.concat([pd.DataFrame({'A':[1,1,1,1,1,2,2,2,2,2]}),
      pd.DataFrame({'B':random_dates(pd.to_datetime('2015-01-01'), 
        pd.to_datetime('2018-01-01'), 10)}),
      pd.DataFrame({'C':np.random.rand(10)})],axis=1)

df2
   A                             B         C
0  1 2016-08-25 01:09:42.953011200  0.791725
1  1 2017-02-23 13:30:20.296310399  0.528895
2  1 2016-10-23 05:33:14.994806400  0.568045
3  1 2016-08-20 17:41:03.991027200  0.925597
4  1 2016-04-09 17:59:00.805200000  0.071036
5  2 2016-12-09 13:06:00.751737600  0.087129
6  2 2016-04-25 00:47:45.953232000  0.020218
7  2 2017-09-05 06:35:58.432531200  0.832620
8  2 2017-11-23 03:18:47.370528000  0.778157
9  2 2016-02-25 15:14:53.907532800  0.870012

tester = lambda x: testms(x, 'C')
df2.set_index('B').groupby('A')['C'].rolling('90D', min_periods=1).apply(tester).reset_index()

However when I apply the above code, I get the following error:

IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
1
  • 2
    I believe rolling().apply() only accept functions that returns scalars. Commented Nov 10, 2019 at 6:32

1 Answer 1

1

If use Rolling.apply it working differently like GroupBy.apply - it processing each columns separately and not possible return multiple columns, only scalars:

So in your solution are necessary 2 functions, where is not possible specify column, but column for processing is specify after groupby:

def testms1(x):
    mu = np.sum(x)
    return mu

def testms2(x):
    #same like mean
    #si = np.sum(x)/len(x)
    si = np.mean(x)
    return si


tester1 = lambda x: testms1(x)
tester2 = lambda x: testms2(x)
r = df2.set_index('B').groupby('A')['C'].rolling('90D', min_periods=1)
s1 = r.apply(tester1, raw=False).rename('mu')
s2 = r.apply(tester2, raw=False).rename('si')
df = pd.concat([s1, s2], axis=1).reset_index()
print (df)
   A                             B        mu        si
0  1 2016-08-25 01:09:42.953011200  0.791725  0.791725
1  1 2017-02-23 13:30:20.296310399  0.528895  0.528895
2  1 2016-10-23 05:33:14.994806400  1.096940  0.548470
3  1 2016-08-20 17:41:03.991027200  2.022537  0.674179
4  1 2016-04-09 17:59:00.805200000  2.093573  0.523393
5  2 2016-12-09 13:06:00.751737600  0.087129  0.087129
6  2 2016-04-25 00:47:45.953232000  0.107347  0.053673
7  2 2017-09-05 06:35:58.432531200  0.832620  0.832620
8  2 2017-11-23 03:18:47.370528000  1.610777  0.805389
9  2 2016-02-25 15:14:53.907532800  2.480789  0.826930

Alternative solution with Resampler.aggregate:

r = df2.set_index('B').groupby('A')['C'].rolling('90D', min_periods=1)
df1 = r.agg(['sum','mean']).rename(columns={'sum':'mu', 'mean':'si'}).reset_index()
print (df1)
   A                             B        mu        si
0  1 2016-08-25 01:09:42.953011200  0.791725  0.791725
1  1 2017-02-23 13:30:20.296310399  0.528895  0.528895
2  1 2016-10-23 05:33:14.994806400  1.096940  0.548470
3  1 2016-08-20 17:41:03.991027200  2.022537  0.674179
4  1 2016-04-09 17:59:00.805200000  2.093573  0.523393
5  2 2016-12-09 13:06:00.751737600  0.087129  0.087129
6  2 2016-04-25 00:47:45.953232000  0.107347  0.053673
7  2 2017-09-05 06:35:58.432531200  0.832620  0.832620
8  2 2017-11-23 03:18:47.370528000  1.610777  0.805389
9  2 2016-02-25 15:14:53.907532800  2.480789  0.826930
Sign up to request clarification or add additional context in comments.

2 Comments

Thanks for this. I am looking for a custom function, as in your first solution above. However, I would prefer if we could return multiple columns instead of creating separate functions for each return value. Is this possible?
@nrcjea001 - unfortunately not, because rolling return scalar only.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.