When I search for my problem, all I keep finding is how to group the same values together, but I want to get the values for each team and add them together.

E.g. this is my DataFrame:

0   Liverpool        4        
1   West Ham         0
2   Bournemouth      1
3   Burnley          3
4   Crystal Palace   0
5   Liverpool        6
6   West Ham         2
7   Bournemouth      8
8   Burnley          1
9   Crystal Palace   4

All the examples I see online just group them together, e.g.:

0   Liverpool        4 
1   Liverpool        5       
2   West Ham         2
3   West Ham         1
4   Crystal Palace   4
5   Crystal Palace   1

But what I am after is the total for each team, ordered from high to low:

0   Liverpool        9 
1   Crystal Palace   5
2   West Ham         3

enter image description here

  • df.groupby('club')['goals'].sum() Commented Dec 1, 2021 at 12:56
  • How are you getting the numbers you want in the right-hand column? I see in the main df that Liverpool has 6 and 4, which sum to 10; Crystal Palace has 4 and 0, which sum to 4; and West Ham has 2 and 0, which sum to 2. All are different from your desired output. Commented Dec 1, 2021 at 12:56
  • what @Drecker said + .sort_values(ascending=False) Commented Dec 1, 2021 at 13:00
  • @Drecker I did that, but it seemed to put the goals from all 19 games for a given team together without adding them. Added image above. Commented Dec 1, 2021 at 13:10
  • What is the output of match_data1.dtypes? From your updated output, it looks like your 'FTHG' column isn't numerical but holds strings, and so won't sum. Commented Dec 1, 2021 at 13:14

1 Answer

From the output you get after grouping and summing, it is almost certain that the FTHG column holds strings. Calling sum() on strings concatenates them, so you end up with one concatenated string per team rather than a numeric total. Try the following:

match_data1["FTHG"] = match_data1["FTHG"].astype(int)
match_data1.groupby("HomeTeam")["FTHG"].sum().sort_values(ascending=False)
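To see why the conversion matters, here is a minimal sketch using the numbers from the question. The DataFrame `df` and the choice to store the goals as strings are assumptions made for illustration; only the column names `HomeTeam` and `FTHG` come from the question.

```python
import pandas as pd

# Hypothetical sample mirroring the question's data, with goals stored as
# strings (object dtype), as can happen when a CSV column holds a stray
# non-numeric entry.
df = pd.DataFrame({
    "HomeTeam": ["Liverpool", "West Ham", "Bournemouth", "Burnley", "Crystal Palace"] * 2,
    "FTHG": pd.Series(["4", "0", "1", "3", "0", "6", "2", "8", "1", "4"], dtype=object),
})

# Check the column types first: a non-numeric dtype for FTHG is the warning sign.
print(df.dtypes)

# Summing strings concatenates them instead of adding them.
as_text = df.groupby("HomeTeam")["FTHG"].sum()
print(as_text)

# Convert to integers, then group, sum, and sort from high to low.
df["FTHG"] = df["FTHG"].astype(int)
totals = df.groupby("HomeTeam")["FTHG"].sum().sort_values(ascending=False)
print(totals)
```

With string values, Liverpool's two results are joined into the text "46"; after the conversion they add up to 10.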

EDIT: After @Emi OB's comment. If column "FTHG" may contain missing values, convert to float instead and fill the NaNs before summing (or ignore them afterwards). You can also use the nansum approach, which is discussed here.

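match_data1["FTHG"] = match_data1["FTHG"].astype(float)
# Fill missing values before grouping, so that sum() still returns one total per team
match_data1["FTHG"] = match_data1["FTHG"].fillna(0.0)
match_data1.groupby("HomeTeam")["FTHG"].sum().sort_values(ascending=False)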
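As noted in the comments on this answer, a text value such as 'NP' cannot be converted with astype, whether to int or float. One possible way to handle that, sketched here under the assumption that such entries should count as 0 goals, is pd.to_numeric with errors="coerce". The sample rows are hypothetical.

```python
import pandas as pd

# Hypothetical data: one score recorded as "NP", as mentioned in the comments.
match_data1 = pd.DataFrame({
    "HomeTeam": ["Liverpool", "Liverpool", "West Ham", "West Ham"],
    "FTHG": pd.Series(["4", "NP", "2", "1"], dtype=object),
})

# errors="coerce" turns entries that cannot be parsed (such as "NP") into NaN
# instead of raising an error.
goals = pd.to_numeric(match_data1["FTHG"], errors="coerce")

# Assumption: treat unparseable scores as 0 goals, then store as integers.
match_data1["FTHG"] = goals.fillna(0).astype(int)

totals = match_data1.groupby("HomeTeam")["FTHG"].sum().sort_values(ascending=False)
print(totals)
```

Whether 0 is the right replacement depends on what 'NP' means in the data; dropping those rows before summing may be more appropriate.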

5 Comments

Looking at the output in the question for Liverpool, there appears to be an 'NP' value in the FTHG column, which can't be converted to int, so I would guess these values need to be dealt with first.
Yes, you are correct, float would be a better conversion. I updated the answer accordingly.
I don't think float will convert NP either; not really sure how that would be handled. Looking at the comments on the question, it appears OP has used this to fix their problem anyway, so maybe the NP isn't an issue.
@EmiOB Yeah, I fixed the NP. I know it's not the best way to do it, but it was only on 1 row, so I went into the CSV file and changed it to 0.
@AndroidDev I updated the answer in case the column is nullable.
