Extract rows from pandas Dataframe based on column value

Question

How would I go about extracting rows where a column matches a specific value from a Dataframe created from an excel file?

Here are few rows from the Dataframe:

    Food            Men     Women
0   Total fruit     86.20   88.26
1   Apples, Total   89.01   89.66
2   Apples as fruit 89.18   90.42
3   Apple juice     88.78   88.42
4   Bananas         95.42   94.18
5   Berries         84.21   81.73
6   Grapes          88.79   88.13

and this is the code that I used to read the excel file, choosing the columns I needed and renaming them appropriately:

data1= pd.read_excel('USFoodCommodity.xls', sheetname='94-98 FAH', skiprows=76,skip_footer=142, parse_cols='A, H, K')
data1.columns = ['Food', 'Men', 'Women']

# Try 1: data1 = data1[data1['Food'].isin(['Total fruit']) == True] works
# Try 2: data1 = data1[data1['Food'].isin(['Apple, Total']) == True] doesn't work
# Try 3: data1 = data1.iloc[[1]] returns Apples, Total but not appropriate to use integer index
# Try 4: data1[data1['Food'] == 'Berries'] doesn't work

So far based on the answers such as this, this, or here I have only been able to return the first index where Food = "Total fruit". When I try the other methods above I only get the column names such as:

Food    Men Women

I am new to pandas and can't see where I am going wrong. Why can I extract the first row where Food == Total fruit but not anything else?

jezrael · Accepted Answer · 2017-02-09 06:59:12Z

2

For me it works nice, maybe problem with some whitespaces - remove them by strip:

print (data1.Food.tolist())
['Total fruit', 'Apples, Total ', 'Apples as fruit', 
'Apple juice', 'Bananas', ' Berries', 'Grapes']

data1['Food'] = data1['Food'].str.strip()

print (data1.Food.tolist())
['Total fruit', 'Apples, Total', 'Apples as fruit', 
'Apple juice', 'Bananas', 'Berries', 'Grapes']

data2 = data1[data1['Food'].isin(['Total fruit'])]
print (data2)
          Food   Men  Women
0  Total fruit  86.2  88.26

data3 = data1[data1['Food'].isin(['Apples, Total'])]
print (data3)
            Food    Men  Women
1  Apples, Total  89.01  89.66

data3 = data1[data1['Food'].isin(['Berries'])]
print (data3)
      Food    Men  Women
5  Berries  84.21  81.73

answered Feb 9, 2017 at 6:59

jezrael

868k103 gold badges1.4k silver badges1.3k bronze badges

Sign up to request clarification or add additional context in comments.

Comments

Arjun · Accepted Answer · 2017-02-09 06:16:30Z

0

Use this code

data1= pd.read_excel('USFoodCommodity.xls', sheetname='94-98 FAH', skiprows=76,skip_footer=142, parse_cols='A, H, K')
list_of_strings_to_match = ['Total fruit', 'Berries', 'Grape']
for index, row in data1.iterrows():
   if row['Food'] in list_of_strings_to_match:
      print row

answered Feb 9, 2017 at 6:16

Arjun

3253 silver badges12 bronze badges

1 Comment

dreamin Over a year ago

No row results for Berries or Grapes

Arpan Srivastava · Accepted Answer · 2020-05-19 10:03:59Z

0

This question might be old, but here is an easier and intuitive way.

Note: This solution only works on pandas >= 0.13.

You can now select columns from a dataframe using the .query() method.

It's as simple as:

df.query('column == value') # The comparison operator can be anything.

For example, in your case, you could query like so:

data1.query('Food == "Total Fruit"')

or

data1.query('Food == Berries')

To access a variable, use @.

fruit = "berries"
data1.query('Food == @fruit')

You can even club multiple conditions using &.

data1.query('condition1 == value1 & condition2 == value2')

Hope it helped.

answered May 19, 2020 at 10:03

Arpan Srivastava

3795 silver badges12 bronze badges

Collectives™ on Stack Overflow

Extract rows from pandas Dataframe based on column value

3 Answers 3

Comments

1 Comment

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

Comments

1 Comment

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related