R: sum all columns based on one value

Question

I have a data frame:

ID   Val1    Val2    Val3  
A    1        1       1
A    1        1       1
A    1        1       1
B    0        0       1

I want to sum every column within each unique ID value, like this:

ID   Val1    Val2    Val3     
A     3       3       3
B     0       0       1

I tried:

df %>% group_by(ID) %>% summarise_all(funs(sum()))

Does anyone have advice about what I'm doing wrong? I would prefer a dplyr approach, if possible.

The question is not merely a typographical issue; it highlights a potentially confusing aspect of the R language: a function's name can appear both with and without parentheses. In many cases the difference has no effect (both count and count() will work as part of a dplyr chain). But in certain cases, the presence of parentheses makes the difference between "call the function" and "pass the function as an argument", as explained in the answer below.
– jdobres, Commented Jan 16, 2018 at 15:36
Why load a package? aggregate(. ~ ID, df, sum) does it. Far simpler.
– Rui Barradas, Commented Jan 16, 2018 at 15:40
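
The aggregate() call from the comment above can be tried without loading any packages. A minimal sketch, assuming the data frame from the question is rebuilt by hand (the construction of df and the name result are illustrative, not from the original post):

```r
# Rebuild the example data from the question (assumed construction).
df <- data.frame(
  ID   = c("A", "A", "A", "B"),
  Val1 = c(1, 1, 1, 0),
  Val2 = c(1, 1, 1, 0),
  Val3 = c(1, 1, 1, 1)
)

# "." on the left of the formula stands for every column not named on
# the right; sum is passed as a function object, without parentheses.
result <- aggregate(. ~ ID, data = df, FUN = sum)
print(result)
```

The result has one row per ID, with each Val column replaced by its group total.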

jdobres · Accepted Answer · 2018-01-16 15:09:53Z

3

You need to remove the parentheses after sum, i.e., your code should read:

df %>% group_by(ID) %>% summarise_all(funs(sum))

Typing sum() in this case calls the function, whereas passing just the name of the function hands it to summarise_all to be used there. It's the difference between saying "use this function here and now" and "pass the function, as a parameter, to some other function". Similarly, typing ?sum brings up the documentation for the function, but ?sum() is invalid.
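
The distinction can be seen in base R alone, without dplyr. A small sketch (the variable names and the list vals are illustrative assumptions, not part of the original answer):

```r
# sum without parentheses is the function object itself.
is_fun <- is.function(sum)                 # TRUE

# sum() with parentheses calls the function immediately, here with no
# arguments, so what comes back is a number rather than a function.
called <- sum()
is_fun_after_call <- is.function(called)   # FALSE

# Functions that apply another function expect the object, not a call.
vals <- list(Val1 = c(1, 1, 1), Val2 = c(1, 1, 1), Val3 = c(1, 1, 1))
totals <- sapply(vals, sum)
print(totals)
```

On dplyr 1.0.0 and later, where funs() is deprecated, the same summary is usually written summarise(across(everything(), sum)), again passing sum without parentheses.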

edited Jan 16, 2018 at 15:09

answered Jan 16, 2018 at 13:30

jdobres


1 Comment

elia Over a year ago

Ah, I was so close. This is the answer that I was looking for! TY.

J Louro · Answer · 2018-01-16 15:46:54Z

0

Edited*:

I don't know a solution using dplyr, but I do know another solution using plyr, if you are interested.

You have:

   df=data.frame(id=c("A","A","A","B"), Val1=c(1,1,1,0), Val2=c(1,1,1,0),Val3=c(1,1,1,1))

> df
  id Val1 Val2 Val3
1  A    1    1    1
2  A    1    1    1
3  A    1    1    1
4  B    0    0    1

Using the plyr library:

> library(plyr)

> ddply(df,.(id),summarize,Val1=sum(Val1),Val2=sum(Val2),Val3=sum(Val3))

Output:

  id Val1 Val2 Val3
1  A    3    3    3
2  B    0    0    1
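
The ddply() call above names each column by hand. As a possible alternative, plyr's numcolwise() helper wraps a function so that it is applied to every numeric column, which avoids listing them. A sketch, assuming plyr is installed and reusing the data frame from this answer (result is an illustrative name):

```r
library(plyr)

# Same data as in the answer above.
df <- data.frame(
  id   = c("A", "A", "A", "B"),
  Val1 = c(1, 1, 1, 0),
  Val2 = c(1, 1, 1, 0),
  Val3 = c(1, 1, 1, 1)
)

# numcolwise(sum) returns a new function that sums each numeric column
# of the piece it receives; ddply calls it once per value of id.
result <- ddply(df, .(id), numcolwise(sum))
print(result)
```

Note that sum is again passed by name, without parentheses, so that numcolwise() receives the function itself.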

edited Jan 16, 2018 at 15:46

answered Jan 16, 2018 at 13:32

J Louro


1 Comment

talat Over a year ago

No problem, it's generally fine and accepted to post answers using different methods. I just found the wording a little surprising. I'll delete my comment
