Pandas data frame - filter rows by values from another data frame

Question

Assume I have two pandas data frames whose relevant columns are:

stimuli data frame:

   stimuli_id    rank         
0     23          0  
1     27          1 
2     62          2 
3     88          2 
4     99          1

Here 'stimuli_id' is a unique index, and 'rank' is an integer in the range [0, 2]. The relevant columns from the trials data frame are:

     stim1     stim2        
0     23         27
1     27         62   
2     62         99

Both stim1 and stim2 hold stimuli_id values from the stimuli data frame.

Now I want to filter all rows in the trials data frame where the rank of the second stimuli is greater. After filtering, the example above should look like this:

       stim1     stim2        
0       62         99

So in the end, only this trial has a stim1 rank greater than its stim2 rank; the rest do not, so they are filtered out.

I have tried the following:

trials.loc[stimuli.loc[stimuli["stimuli_id"] == trials["stim1"]].iloc[0]["rank"] > stimuli.loc[stimuli["stimuli_id"] == trials["stim2"]].iloc[0]["rank"]]

But it raised a ValueError:

ValueError: Can only compare identically-labeled Series objects

I have been searching for hours for any solution but found nothing helpful.

ALollz · Accepted Answer · 2021-02-04 19:41:41Z

3

Since 'stimuli_id' is a unique key for that DataFrame, build a Series of ranks indexed by 'stimuli_id', use it to map each stim column to its rank, and compare the results. (By "rank of the second stimuli is greater" I assume you mean a smaller number.)

s = stimuli.set_index('stimuli_id')['rank']

trials[trials['stim2'].map(s) < trials['stim1'].map(s)]
#   stim1  stim2
#2     62     99

By mapping each column, we are logically creating the mask from the following row-by-row comparison:

#rank2      rank1
#    1   <      0   # False
#    2   <      1   # False
#    1   <      2   # True
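
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The two frames are rebuilt from the tables in the question and contain only the columns shown there, which is an assumption about the real data; the names rank1, rank2 and filtered are introduced here for illustration only.

```python
import pandas as pd

# Example frames reconstructed from the question
# (assumption: only these columns are relevant).
stimuli = pd.DataFrame({
    "stimuli_id": [23, 27, 62, 88, 99],
    "rank": [0, 1, 2, 2, 1],
})
trials = pd.DataFrame({
    "stim1": [23, 27, 62],
    "stim2": [27, 62, 99],
})

# Lookup Series: the index is stimuli_id, the values are rank.
s = stimuli.set_index("stimuli_id")["rank"]

# Translate each stimulus column into the corresponding rank.
rank1 = trials["stim1"].map(s)
rank2 = trials["stim2"].map(s)

# Keep trials where the second stimulus has the numerically smaller rank.
filtered = trials[rank2 < rank1]
print(filtered)
```

The surviving row keeps its original label (2). If a result numbered from 0 is preferred, as in the question's expected output, filtered.reset_index(drop=True) should do it. Note also that an id missing from stimuli maps to NaN, and comparisons against NaN evaluate to False, so such trials would be dropped silently.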

edited Feb 4, 2021 at 19:41

answered Feb 4, 2021 at 19:34

ALollz

2 Comments

Alina Over a year ago

This is great, thank you. However, if I want to keep those rows where stim1 has higher rank, but also stim1's rank is 2, how can I do that with this approach? I have tried adding this condition to the trials filtering but it didn't work.

ALollz Over a year ago

@AlinaRya In that case you can add an additional condition. The mask would be (trials['stim2'].map(s) < trials['stim1'].map(s)) & trials['stim1'].map(s).eq(2), i.e. the first condition holds and the rank of the first stimulus is 2.
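
A short sketch of the combined mask from the comment above, runnable on its own. The frames are rebuilt from the question's tables, and the fourth trial (27, 23) is a hypothetical row added here only to show the extra condition rejecting something; it is not part of the original data.

```python
import pandas as pd

# Frames reconstructed from the question (assumed columns only).
stimuli = pd.DataFrame({
    "stimuli_id": [23, 27, 62, 88, 99],
    "rank": [0, 1, 2, 2, 1],
})
# The last trial (27, 23) is hypothetical, added for illustration.
trials = pd.DataFrame({
    "stim1": [23, 27, 62, 27],
    "stim2": [27, 62, 99, 23],
})

s = stimuli.set_index("stimuli_id")["rank"]
rank1 = trials["stim1"].map(s)
rank2 = trials["stim2"].map(s)

# Each condition needs its own parentheses (or a method such as .eq)
# because & binds more tightly than comparison operators in Python.
mask = (rank2 < rank1) & rank1.eq(2)

result = trials[mask]
print(result)
```

The hypothetical trial passes the first condition (rank 0 < rank 1) but fails the second because the rank of stim1 is 1, not 2, so only the (62, 99) trial remains. Forgetting the parentheses around the comparison is a common reason a combined condition "doesn't work".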
