16

I would like to loop over the rows of a DataFrame, in my case to calculate strength ratings for a number of sports teams.

The DataFrame columns 'home_elo' and 'away_elo' contain the pre-match strength ratings (ELO scores) of the teams involved. Each team has two strength ratings at any point in time, one for home games and one for away games. After a match, the value returned by update_elo(a,b,c) is written to the row of that team's next home or away match.

The respective code snippet looks as follows:

for index in df.index:

    counter = counter + 1
    # Calculation of post-match ELO scores for home and away teams
    if df.at[index,'updated'] == 2: # Update next match ELO scores if not yet updated but pre-match ELO scores available

        try:
            all_home_fixtures = df.date_rank[df['localteam_id'] == df.at[index,'localteam_id']]
            next_home_fixture = all_home_fixtures[all_home_fixtures > df.at[index,'date_rank']].min()
            next_home_index = df[(df['date_rank'] == next_home_fixture) & (df['localteam_id'] == df.at[index,'localteam_id'])].index.item()
        except ValueError:
            print('ERROR 1 at' + str(index))
            df.at[index,'updated'] = 4

        try:
            all_away_fixtures = df.date_rank[df['visitorteam_id'] == df.at[index,'visitorteam_id']]
            next_away_fixture = all_away_fixtures[all_away_fixtures > df.at[index,'date_rank']].min()
            next_away_index = df[(df['date_rank'] == next_away_fixture) & (df['visitorteam_id'] == df.at[index,'visitorteam_id'])].index.item()
        except ValueError:
            print('ERROR 2 at' + str(index))
            df.at[index,'updated'] = 4

        # print('Current: ' + str(df.at[index,'fixture_id']) + '; Followed by: ' + str(next_home_fixture))
        # print('Current date rank: ' + str(df.at[index,'date']) + ' ' + str(df.at[index,'date_rank']) + '; Next home date rank: ' + str(df.at[next_home_index,'date_rank']) + '; Next away date rank: ' + str(df.at[next_away_index,'date_rank']))

        df.at[next_home_index, 'home_elo'] = update_elo(df.at[index,'home_elo'],df.at[index,'away_elo'],df.at[index,'actual_score'])
        df.at[next_away_index, 'away_elo'] = update_elo(df.at[index,'away_elo'],df.at[index,'home_elo'],1 - df.at[index,'actual_score']) # Swap function inputs for away team


        df.at[next_home_index, 'updated'] = df.at[next_home_index, 'updated'] + 1
        df.at[next_away_index, 'updated'] = df.at[next_away_index, 'updated'] + 1

        df.at[index,'updated'] = 3

The code works fine for the first couple of rows. I then, however, encounter errors, always for the same rows, even though I cannot see how the rows would differ from others.

  1. If I do not handle the ValueError as shown above, I receive the error message "ValueError: can only convert an array of size 1 to a Python scalar", first after about 250 rows.
  2. If I do handle the ValueError as shown above, I capture four such errors, two for each of the error-handling blocks (the code works fine otherwise), but the code stops updating any further strength ratings after about 18% of all rows, without throwing any error message.

I would very much appreciate it if you could help me (a) understand what causes the errors and (b) work out how to handle them.

Since this is my first post on StackOverflow, I am not yet fully aware of the common posting practices of the forum. Please let me know if there is anything I can improve about my post.

Thank you very much!

2
  • Which line specifically causes the error? Commented Jul 24, 2018 at 14:25
  • I'm guessing it's the first line in the try...except block. Have you checked that all of your indexes are unique (which seems to be an assumption of your code)? Some of those df.at[index... calls may be returning multiple values if you have an index that is repeated. Try running df.index.nunique() == df.index.shape[0] Commented Jul 24, 2018 at 14:29

3 Answers

18

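FYI,

You will get a similar error if you call .item() on a NumPy array that does not contain exactly one element.

In that case you can use .tolist() instead, which returns a (possibly empty) list rather than a single scalar.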
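A minimal sketch of the difference between the two calls. The helper name item_or_none is made up for this illustration and is not part of NumPy:

```python
import numpy as np


def item_or_none(arr):
    """Return the single element of arr, or None if it does not hold exactly one.

    Hypothetical helper for illustration only.
    """
    arr = np.asarray(arr)
    return arr.item() if arr.size == 1 else None


one = np.array([42])
many = np.array([1, 2, 3])
empty = np.array([])

print(item_or_none(one))    # the scalar held by the one-element array
print(item_or_none(many))   # None, where many.item() would raise ValueError
print(many.tolist())        # .tolist() copes with any size
print(empty.tolist())       # including an empty array
```

Whether a list or a None sentinel is the better fallback depends on what the calling code expects to do with the result.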

6

.item() requires exactly one element in order to return a scalar; here it is called on the index of the filtered DataFrame. If:

df[(df['date_rank'] == next_home_fixture) & (df['localteam_id'] == df.at[index,'localteam_id'])]

is a DataFrame with zero rows, then the .index.item() will throw a ValueError.
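One plausible way to end up with zero rows is a team's last home match: there is no later fixture, so .min() on the empty selection returns NaN, and no date_rank compares equal to NaN. The sketch below reproduces that on a made-up three-row frame and shows one way to guard against it. The function next_home_index and the sample data are assumptions for illustration, not the asker's actual data or code:

```python
import pandas as pd

# Hypothetical miniature of the question's frame: team 1 has home games at ranks 1 and 3.
df = pd.DataFrame({"localteam_id": [1, 2, 1], "date_rank": [1, 2, 3]})

# Reproduce the failure for row 2, team 1's last home game.
home_ranks = df.date_rank[df["localteam_id"] == 1]
next_rank = home_ranks[home_ranks > 3].min()          # empty selection, so NaN
matches = df[(df["date_rank"] == next_rank) & (df["localteam_id"] == 1)]
try:
    matches.index.item()
except ValueError as exc:
    error_message = str(exc)                          # raised by .item() on an empty Index


def next_home_index(frame, index):
    """Return the label of the team's next home match, or None if there is none."""
    team = frame.at[index, "localteam_id"]
    later = frame[(frame["localteam_id"] == team)
                  & (frame["date_rank"] > frame.at[index, "date_rank"])]
    return None if later.empty else later["date_rank"].idxmin()


print(next_home_index(df, 0))   # label of team 1's second home game
print(next_home_index(df, 2))   # None: no later home game exists
```

Returning None lets the loop skip the update for that side explicitly, instead of continuing with an index left over from an earlier iteration after the except block.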

4 Comments

  • Thank you very much for your reply! If I am not mistaken, that shouldn't be the problem, though. The line you referenced returns (and is supposed to return) a Series containing all home game dates of the respective team. The subsequently calculated next_home_fixture then contains only one element, which is the game date of the next home game.
  • df.index.nunique() == df.index.shape[0] returns True, so that does not seem to be the problem.
  • What line is the error on? How many items does df[(df['date_rank'] == next_home_fixture) & (df['localteam_id'] == df.at[index,'localteam_id'])].index return for the iterations where there's an error?
  • Thank you, good call! The error is on the line next_home_index = df[(df['date_rank'] == next_home_fixture) & (df['localteam_id'] == df.at[index,'localteam_id'])].index.item() due to df[(df['date_rank'] == next_home_fixture) & (df['localteam_id'] == df.at[index,'localteam_id'])].index returning zero elements despite expecting one.

0

This will happen if the Series/DataFrame whose .index you call .item() on has more than one element (or none at all): https://pandas.pydata.org/docs/reference/api/pandas.Index.item.html
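As a rough sketch, checking the number of matching rows before extracting the label makes both failure modes (no match and several matches) easier to tell apart. The helper single_label and the sample frame are invented for this example:

```python
import pandas as pd


def single_label(frame):
    """Return the only index label of frame, or raise with the actual row count.

    Hypothetical helper for illustration only.
    """
    n = len(frame.index)
    if n != 1:
        raise ValueError(f"expected exactly 1 matching row, found {n}")
    return frame.index[0]


# Made-up data: date_rank 2 occurs twice, date_rank 9 never.
df = pd.DataFrame({"date_rank": [1, 2, 2]}, index=[10, 11, 12])

print(single_label(df[df["date_rank"] == 1]))   # exactly one match

for rank in (2, 9):
    try:
        single_label(df[df["date_rank"] == rank])
    except ValueError as exc:
        print(rank, "->", exc)
```

The message then reports how many rows matched, which the generic "can only convert an array of size 1" error does not.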
