When I run this:

import pandas as pd

data = {
    'id': ['earn', 'earn', 'lose', 'earn'],
    'game': ['darts', 'balloons', 'balloons', 'darts'],
}

df = pd.DataFrame(data)
print(df)
print(df.loc[[1],['id']] == 'earn')

The output is:
     id      game
0  earn     darts
1  earn  balloons
2  lose  balloons
3  earn     darts
     id
1  True

But when I try to run this loop:

for i in range(len(df)):
    if (df.loc[[i],['id']] == 'earn'):
        print('yes')
    else:
        print('no')

I get the error 'ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().' I am not sure what the problem is. Any help or advice is appreciated; I am just starting.

I expected the loop to print 'yes', but I only got the ValueError message. When I run the condition by itself, the output is 'True', so I'm not sure what is wrong.

  • "But, when I run the condition by itself, the output is 'True'": show us that code. Commented Dec 3, 2022 at 1:49
  • Will len(df) return a number, or is it df.size that's needed? That is what caught my eye first. Commented Dec 3, 2022 at 1:57
  • I haven't used dataframes at all, so maybe some subtlety is escaping me, but those two expressions look identical to me. I don't understand why one would work but not the other. Is df the same in both those examples? Commented Dec 3, 2022 at 1:58
  • Please post a working example - a small, simple dataframe, likely a few rows and an "id" column - the code and the output you get. Commented Dec 3, 2022 at 2:04

2 Answers

for i,row in df.iterrows():
    if row.id == "earn":
        print("yes")
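The loop above prints only for matching rows. Here is a rough sketch of a variant that also covers the 'no' case from the question. It assumes the same df as in the question and uses itertuples, which yields each row as a namedtuple; the results list is just an illustrative name so the outcome can be inspected afterwards.

```python
import pandas as pd

df = pd.DataFrame({
    "id": ["earn", "earn", "lose", "earn"],
    "game": ["darts", "balloons", "balloons", "darts"],
})

results = []  # illustrative name, not from the original answer
for row in df.itertuples(index=False):
    # row.id is a plain Python string here, so == gives a plain bool
    answer = "yes" if row.id == "earn" else "no"
    results.append(answer)
    print(answer)
```

Because each row field is a scalar, the comparison is an ordinary bool and the if/else works without any ambiguity.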

It's complicated. pandas is geared towards operating on entire groups of data, not individual cells. df.loc may return a new DataFrame, a Series, or a single value, depending on how it's indexed. Those in turn produce DataFrame, Series, or scalar results for the == comparison.

If the indexers are both lists, you get a new DataFrame, and the result of the comparison is also a DataFrame:

>>> foo = df.loc[[1], ['id']]
>>> type(foo)
<class 'pandas.core.frame.DataFrame'>
>>> foo
     id
1  earn
>>> foo == "earn"
     id
1  True

If one indexer is a scalar and the other is a list, you get a new Series:

>>> foo = df.loc[[1], 'id']
>>> type(foo)
<class 'pandas.core.series.Series'>
>>> foo
1    earn
Name: id, dtype: object
>>> foo == 'earn'
1    True
Name: id, dtype: bool

If both indexers are scalars, you get a single cell's value:

>>> foo = df.loc[1, 'id']
>>> type(foo)
<class 'str'>
>>> foo
'earn'
>>> foo == 'earn'
True

That last one is the one you want. The first two produce containers whose truth value is ambiguous (you need to decide whether any or all of the values must be True).

for i in range(len(df)):
    if (df.loc[i,'id'] == 'earn'):
        print('yes')
    else:
        print('no')
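If you do want to keep list-style indexing, the comparison has to be collapsed to a single bool before it goes into an if. This is a minimal sketch of that, assuming the same df as in the question; the variable names are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "id": ["earn", "earn", "lose", "earn"],
    "game": ["darts", "balloons", "balloons", "darts"],
})

# Two list indexers: the comparison is a one-cell DataFrame, not a bool.
cmp_frame = df.loc[[1], ["id"]] == "earn"

# Collapse it explicitly: .all() / .any() reduce per column,
# and the second call reduces across the columns.
frame_all = bool(cmp_frame.all().all())
frame_any = bool(cmp_frame.any().any())

# One list indexer: the comparison is a one-element Series.
cmp_series = df.loc[[1], "id"] == "earn"
series_item = bool(cmp_series.item())  # .item() raises if there is not exactly one element

print(frame_all, frame_any, series_item)
```

With a single cell, any and all agree, so the choice only matters once the selection covers more than one value.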

Or maybe not. Depending on what you intend to do next, you can create a Series of boolean values for all of the rows at once:

>>> earn = df['id'] == 'earn'
>>> earn
0     True
1     True
2    False
3     True
Name: id, dtype: bool

Now you can continue to make calculations on the DataFrame as a whole.
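As a rough sketch of what such whole-DataFrame calculations might look like, the boolean Series can be used to label, filter, or count rows without a loop. This assumes the same df as in the question; the 'result' column name is made up for the example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "id": ["earn", "earn", "lose", "earn"],
    "game": ["darts", "balloons", "balloons", "darts"],
})

earn = df["id"] == "earn"

# Label every row in one step ("result" is a hypothetical column name).
df["result"] = np.where(earn, "yes", "no")

# Keep only the matching rows, or count them (True counts as 1).
earn_rows = df[earn]
earn_count = int(earn.sum())

print(df)
print(earn_count)
```

The 'result' column holds the same yes/no answers the loop would print, one per row.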

2 Comments

THANK YOU! I had to re-read your comment 5 times before I understood it (because I am such a beginner, not because you wrote poorly) haha. But your explanation was super helpful. Thank you so much for writing all of that out!
@PrincessPeach pandas is that way. It's a heavy lift, but once you know the ins and outs, you can get a lot done.
