Python (pandas) - sum multiple columns based on one column

Question

I have a dataframe of Covid-19 deaths by country. Countries are identified in the Country column. Sub-national classification is based on the Province column.

I want to generate a dataframe which sums every column except the first 2 (which are geographical data), grouped by the value in the Country column. In short, for each date, I want to collapse the observations for all provinces of a country so that I get a single number per country.

Right now, I am able to do that for a single date:

import pandas as pd

url = 'https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv'
raw = pd.read_csv(url)
del raw['Lat']
del raw['Long']
raw.rename({'Country/Region': 'Country', 'Province/State': 'Province'}, axis=1, inplace=True)

raw2 = raw.groupby('Country')['6/29/20'].sum()

How can I achieve this for all dates?

If you just need a list of the countries, you can use raw.Country.unique()

– r-beginners, Commented Jul 1, 2020 at 4:12

Quang Hoang · Accepted Answer · 2020-07-01 04:12:43Z

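You can use iloc to select the date columns by position, then group them by the Country column:

raw2 = raw.iloc[:, 2:].groupby(raw.Country).sum()

Since Lat and Long have already been deleted in your code, the date columns start at position 2. Use 4 instead if those two columns are still present.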
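To check the idea without downloading the CSV, here is a minimal sketch on a toy frame. The values are made up, and it assumes Lat and Long have already been deleted as in the question, so only Province and Country come before the date columns. It also shows a label-based variant (dropping Province by name), which is my own suggestion and does not depend on column positions.

```python
import pandas as pd

# Toy stand-in for the deaths table after Lat/Long are removed and the
# columns renamed; the numbers are invented for illustration only.
raw = pd.DataFrame({
    'Province': ['Hubei', 'Beijing', None, 'Ontario', 'Quebec'],
    'Country': ['China', 'China', 'Italy', 'Canada', 'Canada'],
    '6/28/20': [10, 1, 7, 3, 4],
    '6/29/20': [11, 2, 8, 3, 5],
})

# Positional selection: skip the two geographic columns, then sum the
# remaining date columns within each country.
by_position = raw.iloc[:, 2:].groupby(raw['Country']).sum()

# Label-based alternative: drop Province by name, so the result does not
# depend on how many columns precede the dates.
by_label = raw.drop(columns='Province').groupby('Country').sum()

print(by_label)
```

Both variants should give one row per country with one column per date. The label-based one may be a little safer if the column layout of the source file ever changes.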
