Filter Pandas DataFrame using another DataFrame

Question

I have a multi-index DataFrame with the first level as the group id and the second level as the element name. There are many more groups but only the first is shown below.

                   2000-01-04  2000-01-05 
Group Element                                     
1       A          -0.011374    0.035895 
        X          -0.006910    0.047714 
        C          -0.016609    0.038705 
        Y          -0.088110   -0.052775 
        H           0.000000    0.008082

I have another DataFrame with a single index level, the group id. Both DataFrames have the same columns, which are dates.

         2000-01-04  2000-01-05 
Group                                     
1        -0.060623   -0.025429 
2        -0.066765   -0.005318 
3        -0.034459   -0.011243 
4        -0.051813   -0.019521 
5        -0.064367    0.014810

I want to use the second DataFrame to filter the first one by checking if each element is smaller than the value of the group on that date to get something like this:

                   2000-01-04  2000-01-05 
Group Element                                     
1       A          False        False     
        X          False        False     
        C          False        False     
        Y          True         True
        H          False        False

Ultimately, I am only interested in the elements that were True and the dates on which they were True. A list of the elements that were True for each date would be great, which I've thought to do by turning the False values into NaN and then using dropna().

I know I can write a bunch of nested for loops to do this, but time is of crucial importance, and I can't think of a way to do this Pythonically using the pandas DataFrame structure itself. Any help would be greatly appreciated!

Andy Hayden · Accepted Answer · 2014-01-16 23:38:03Z

You could use a groupby apply for this:

In [11]: g = df1.groupby(level='Group')

In [12]: g.apply(lambda x: x <= df2.loc[x.name])
Out[12]: 
              2000-01-04 2000-01-05
Group Element                      
1     A            False      False
      X            False      False
      C            False      False
      Y             True       True
      H            False      False
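
For reference, here is a minimal self-contained sketch that rebuilds the two sample frames from the question (as df1 and df2, the names used in the answer) and produces the same kind of mask without apply, by looking up each row's group threshold with reindex. It is an alternative to the groupby approach above, not a replacement for it. It uses a strict < as the question asks, and the names thresholds, mask and hits are only illustrative. The last lines list the (group, element, date) combinations that are True:

```python
import numpy as np
import pandas as pd

dates = ["2000-01-04", "2000-01-05"]

# Sample data copied from the question (only group 1 is shown there).
index = pd.MultiIndex.from_product(
    [[1], ["A", "X", "C", "Y", "H"]], names=["Group", "Element"]
)
df1 = pd.DataFrame(
    [
        [-0.011374, 0.035895],
        [-0.006910, 0.047714],
        [-0.016609, 0.038705],
        [-0.088110, -0.052775],
        [0.000000, 0.008082],
    ],
    index=index,
    columns=dates,
)
df2 = pd.DataFrame(
    [
        [-0.060623, -0.025429],
        [-0.066765, -0.005318],
        [-0.034459, -0.011243],
        [-0.051813, -0.019521],
        [-0.064367, 0.014810],
    ],
    index=pd.Index([1, 2, 3, 4, 5], name="Group"),
    columns=dates,
)

# Repeat each group's threshold row once per element, then reuse df1's
# index so the two frames are identically labelled and can be compared.
thresholds = df2.reindex(df1.index.get_level_values("Group"))
thresholds.index = df1.index
mask = df1 < thresholds

# Collect (group, element, date) for every True cell, in row-major order.
rows, cols = np.nonzero(mask.to_numpy())
hits = [(*mask.index[r], mask.columns[c]) for r, c in zip(rows, cols)]
print(hits)
```

With the sample data, only element Y of group 1 falls below its group's threshold, on both dates.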

4 Comments

rmalhotra Over a year ago

Thank you so much! It works great. Just out of interest, the df2 values correspond to the mean minus the standard deviation of each group. I'm basically trying to find outliers. Is there a better way to do this than what I am doing now? Also, this only finds outliers below the threshold; I was planning on just creating another DataFrame for the upper limits. But is there a more elegant way?

Andy Hayden Over a year ago

@rmalhotra I think there might be, you have access to the group (as x) in the above lambda expression, so you could calculate it then...

rmalhotra Over a year ago

I got it to work for finding the lower outliers: df.groupby(level=0).apply(lambda x: x < (x.mean() - x.std() * 2)). But when I try df.groupby(level=0).apply(lambda x: "Below" if x < (x.mean() - x.std() * 2) else "False") I get a ValueError. Also, would it be possible to have multiple if statements to check for "above" outliers as well?

Andy Hayden Over a year ago

@rmalhotra I think you're better off creating a separate function (rather than putting it in a lambda); that will make it easier to test. My suspicion is that this is an array being converted to a boolean (which would correctly raise in 0.13). You could use .where instead, something like x.where(~(x < (x.mean() - x.std() * 2)), 'Below'); note that where keeps the values where the condition is True and replaces the rest, hence the negation. But I recommend using booleans or ints rather than strings. For example:

def f(x):
    mean = x.mean()
    std_2 = x.std() * 2
    return 1 * (x < mean - std_2) - 1 * (x > mean + std_2)
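
As a rough sketch of the integer-flag idea from this comment, the per-group mean and standard deviation can also be computed with groupby(...).transform, which returns frames aligned to the original index, so no apply is needed. This swaps apply for transform and is not the commenter's exact code. The data below is made up purely for illustration (ten hypothetical elements e0 to e9 in one group, since a group needs more than five members before any point can sit two sample standard deviations from the mean). The sign convention follows the comment: 1 marks a value below the lower limit, -1 a value above the upper limit, 0 everything else.

```python
import pandas as pd

# Hypothetical data: one group of ten elements with one extreme value per date.
index = pd.MultiIndex.from_product(
    [[1], [f"e{i}" for i in range(10)]], names=["Group", "Element"]
)
df = pd.DataFrame(
    {
        "2000-01-04": [0.0] * 9 + [-10.0],  # last element is far below the rest
        "2000-01-05": [0.0] * 9 + [10.0],   # last element is far above the rest
    },
    index=index,
)

# transform broadcasts each group's statistic back onto that group's rows.
grouped = df.groupby(level="Group")
mean = grouped.transform("mean")
std_2 = grouped.transform("std") * 2

# 1 for values below mean - 2*std, -1 for values above mean + 2*std, else 0.
flags = (df < mean - std_2).astype(int) - (df > mean + std_2).astype(int)
print(flags)
```

Keeping the flags numeric, as the comment recommends, makes it easy to filter afterwards, for example with flags != 0.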