I have a list of identical dataframes and I am trying to sum one column in each dataframe in the list. My thought was something like total = [df['A'].sum for df in dfs], but this returns a list of length len(dfs) containing only the bound method rather than the values. My desired output is a list of the column sums, one per dataframe. What is the fastest way to achieve this? I have to repeat this sum thousands of times per list, on thousands of different lists.

1 Answer

Perhaps you are missing the () after sum:

 total = [df['A'].sum() for df in dfs]

You want to call the method sum, not just reference it.
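To make the difference concrete, here is a minimal sketch. The two small dataframes are made up for illustration and are not from the question; only the column name 'A' and the list name dfs are.

```python
import pandas as pd

# Illustrative data only: two small dataframes with the same columns.
dfs = [
    pd.DataFrame({"A": [1, 2, 3]}),
    pd.DataFrame({"A": [10, 20, 30]}),
]

# Without parentheses: each element is the bound method itself, not a number.
methods = [df["A"].sum for df in dfs]

# With parentheses: the method is called and each element is a column sum.
totals = [df["A"].sum() for df in dfs]

print(callable(methods[0]))  # the first list holds callables
print(totals)                # the second list holds the sums
```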

Python's built-in sum is pretty quick (see "Python built-in sum function vs. for loop performance"), and I assume that pandas sum should be comparable (see "Difference between sum, 'sum' and np.sum *under the hood* (Python / Pandas / Numpy)").
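If the per-dataframe call overhead turns out to matter, one alternative you could try is stacking the column into a single NumPy array and summing along one axis. This is only a sketch under assumptions that go beyond the question: every dataframe has the same number of rows, column 'A' is numeric, and it contains no NaN values (pandas sum skips NaN by default, while NumPy sum does not). Whether it is actually faster depends on the sizes involved, so it is worth timing both versions on your own data.

```python
import numpy as np
import pandas as pd

# Illustrative data only: three dataframes with identical shape and columns.
dfs = [
    pd.DataFrame({"A": np.arange(5) + i, "B": np.zeros(5)})
    for i in range(3)
]

# Stack column 'A' from every dataframe into one 2-D array
# (one row per dataframe), then sum across each row in a single call.
stacked = np.stack([df["A"].to_numpy() for df in dfs])
totals_np = stacked.sum(axis=1)

# The list-comprehension version from the answer, for comparison.
totals_list = [df["A"].sum() for df in dfs]

print(totals_np)
print(totals_list)
```

Building the stacked array has a cost of its own, so this tends to pay off mainly when the same stacked data is reused for many sums.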
