Python: 'ValueError: can only convert an array of size 1 to a Python scalar' when looping over rows in pd.DataFrame

Question

I would like to loop over the rows of a DataFrame, in my case to calculate strength ratings for a number of sports teams.

The DataFrame columns 'home_elo' and 'away_elo' contain the pre-match strength ratings (ELO scores) of the teams involved. Each team has two strength ratings at any point in time, one for home games and one for away games. After a match, the value returned by update_elo(a,b,c) is written to the row of the team's next home or next away match.

The respective code snippet looks as follows:

for index in df.index:

    counter = counter + 1
    # Calculation of post-match ELO scores for home and away teams
    if df.at[index,'updated'] == 2: # Update next match ELO scores if not yet updated but pre-match ELO scores available

        try:
            all_home_fixtures = df.date_rank[df['localteam_id'] == df.at[index,'localteam_id']]
            next_home_fixture = all_home_fixtures[all_home_fixtures > df.at[index,'date_rank']].min()
            next_home_index = df[(df['date_rank'] == next_home_fixture) & (df['localteam_id'] == df.at[index,'localteam_id'])].index.item()
        except ValueError:
            print('ERROR 1 at' + str(index))
            df.at[index,'updated'] = 4

        try:
            all_away_fixtures = df.date_rank[df['visitorteam_id'] == df.at[index,'visitorteam_id']]
            next_away_fixture = all_away_fixtures[all_away_fixtures > df.at[index,'date_rank']].min()
            next_away_index = df[(df['date_rank'] == next_away_fixture) & (df['visitorteam_id'] == df.at[index,'visitorteam_id'])].index.item()
        except ValueError:
            print('ERROR 2 at' + str(index))
            df.at[index,'updated'] = 4

        # print('Current: ' + str(df.at[index,'fixture_id']) + '; Followed by: ' + str(next_home_fixture))
        # print('Current date rank: ' + str(df.at[index,'date']) + ' ' + str(df.at[index,'date_rank']) + '; Next home date rank: ' + str(df.at[next_home_index,'date_rank']) + '; Next away date rank: ' + str(df.at[next_away_index,'date_rank']))

        df.at[next_home_index, 'home_elo'] = update_elo(df.at[index,'home_elo'],df.at[index,'away_elo'],df.at[index,'actual_score'])
        df.at[next_away_index, 'away_elo'] = update_elo(df.at[index,'away_elo'],df.at[index,'home_elo'],1 - df.at[index,'actual_score']) # Swap function inputs for away team


        df.at[next_home_index, 'updated'] = df.at[next_home_index, 'updated'] + 1
        df.at[next_away_index, 'updated'] = df.at[next_away_index, 'updated'] + 1

        df.at[index,'updated'] = 3

The code works fine for the first couple of rows. I then, however, encounter errors, always for the same rows, even though I cannot see how the rows would differ from others.

If I do not handle the ValueError as shown above, I receive the error message ValueError: can only convert an array of size 1 to a Python scalar for the first time after about 250 rows.
If I do handle the ValueError as shown above, I capture four such errors, two for each of the error-handling blocks (the code works fine otherwise), but the code stops updating any further strength ratings after about 18% of all rows, without throwing any error message.

I would very much appreciate it if you could help me (a) understand what causes the errors and (b) work out how to handle them.

Since this is my first post on StackOverflow, I am not yet fully aware of the common posting practices of the forum. Please let me know if there is anything I can improve about my post.

Thank you very much!

I'm guessing it's the first line in the try...except block. Have you checked that all of your indexes are unique (which seems to be an assumption of your code)? Some of those df.at[index... calls may be returning multiple values if you have an index that is repeated. Try running df.index.nunique() == df.index.shape[0]
– sundance, Commented Jul 24, 2018 at 14:29

Wei Chen · Accepted Answer · 2020-04-07 02:30:23Z

18

FYI,

You will get a similar error if you call .item() on a NumPy array that does not contain exactly one element.

In that case you can use .tolist() instead, which returns the elements as a Python list.
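A minimal sketch of that behaviour, using two made-up arrays (the names single and pair are only for illustration):

```python
import numpy as np

single = np.array([42])   # exactly one element
pair = np.array([1, 2])   # two elements

# .item() works only when the array holds exactly one element.
scalar = single.item()
print(scalar)

# With any other size, .item() raises ValueError.
try:
    pair.item()
    raised = False
except ValueError as exc:
    raised = True
    print(exc)

# .tolist() converts every element, whatever the size of the array.
values = pair.tolist()
print(values)
```

Note that .tolist() returns a list rather than a scalar, so it avoids the error but does not tell you which single value you wanted; that still has to be decided by the calling code.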

sundance · Accepted Answer · 2018-07-24 16:06:34Z

6

pd.Series.item and pd.Index.item require exactly one element in order to return a scalar. If:

df[(df['date_rank'] == next_home_fixture) & (df['localteam_id'] == df.at[index,'localteam_id'])]

is a DataFrame with zero rows, then .index.item() will raise a ValueError.
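One way such an empty selection can arise, sketched here on a made-up three-row DataFrame: if the current row is a team's last home match, there is no later home fixture, .min() of the empty Series is NaN, and no date_rank compares equal to NaN. Whether this is what happens in the asker's data is an assumption; the column names follow the question, the values are invented.

```python
import pandas as pd

# Toy data (hypothetical): team 1 plays at home on date ranks 1 and 2.
df = pd.DataFrame({"localteam_id": [1, 1, 2], "date_rank": [1, 2, 3]})

index = 1  # team 1's last home match in this toy data
team = df.at[index, "localteam_id"]

all_home_fixtures = df.date_rank[df["localteam_id"] == team]
later = all_home_fixtures[all_home_fixtures > df.at[index, "date_rank"]]
next_home_fixture = later.min()  # NaN, because `later` is empty

# Nothing compares equal to NaN, so the selection has zero rows.
matches = df[(df["date_rank"] == next_home_fixture) & (df["localteam_id"] == team)]
print(len(matches))

# .index.item() on an empty index raises ValueError.
try:
    matches.index.item()
    raised = False
except ValueError:
    raised = True
print(raised)
```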

edited Jul 24, 2018 at 16:06

answered Jul 24, 2018 at 14:32


4 Comments

parno Over a year ago

Thank you very much for your reply! If I am not mistaken, that shouldn't be the problem, though. The line you referenced returns (and is supposed to return) a Series containing all home game dates of the respective team. The subsequently calculated next_home_fixture then contains only one element, which is the game date of the next home game.

parno Over a year ago

df.index.nunique() == df.index.shape[0] returns True so that does not seem to be the problem

sundance Over a year ago

What line is the error on? How many items does df[(df['date_rank'] == next_home_fixture) & (df['localteam_id'] == df.at[index,'localteam_id'])].index return for the iterations where there's an error?

parno Over a year ago

Thank you, good call! The error is on the line

next_home_index = df[(df['date_rank'] == next_home_fixture) & (df['localteam_id'] == df.at[index,'localteam_id'])].index.item()

due to df[(df['date_rank'] == next_home_fixture) & (df['localteam_id'] == df.at[index,'localteam_id'])].index returning zero elements despite expecting one

grantr · Accepted Answer · 2023-10-02 19:15:50Z

0

This will also happen if there is more than one item in the Series/DataFrame whose .index you call .item() on: https://pandas.pydata.org/docs/reference/api/pandas.Index.item.html
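Since .index.item() fails for both zero and multiple matches, one option is to check the size of the selection explicitly. The helper below is a sketch, not part of pandas; its name single_index_or_none and the toy data are made up for illustration.

```python
import pandas as pd

def single_index_or_none(selection):
    """Return the only index label of `selection`, or None if it is empty.

    Hypothetical helper: raises ValueError when more than one row matches.
    """
    if len(selection) == 0:
        return None
    if len(selection) > 1:
        raise ValueError(f"expected at most one row, got {len(selection)}")
    return selection.index[0]

# Toy data (hypothetical): date rank 2 appears twice.
df = pd.DataFrame({"localteam_id": [1, 1, 2], "date_rank": [1, 2, 2]})

one = single_index_or_none(df[df["date_rank"] == 1])    # exactly one match
none = single_index_or_none(df[df["date_rank"] == 99])  # no match
print(one, none)

try:
    single_index_or_none(df[df["date_rank"] == 2])       # two matches
    raised = False
except ValueError:
    raised = True
print(raised)
```

In a loop like the one in the question, a None result could be used to skip the update for that row (for example with continue), so that the code does not carry on with a next_home_index left over from an earlier iteration.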
