How do I make summarize(count_by_) in dplyr only count instances if there is an exact match?

I am trying to make a pivot table from a large .tsv data set in R and export it back to Excel.

I tried using the dplyr functions:

summary <- df %>%
  group_by(Run, Prot) %>%
  summarize(count_by_Id = n()) %>%
  as.data.frame()

This almost works, but rows with e.g. "P61981;P62258" and "P62258" in the Prot column are counted together. How do I make R only group rows that have exactly the same string in the Prot column? In the example above, the summary I am creating should contain two separate rows, one for "P61981;P62258" and one for "P62258".
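For reference, here is a minimal sketch of the behavior I expect, run on a small made-up data frame rather than my real file. The toy values, the name `run_prot_counts`, and the `.groups = "drop"` argument are illustrative assumptions; the explicit `dplyr::` prefixes are there only to rule out the functions being masked by another package.

```r
library(dplyr)

# Hypothetical toy data standing in for the real .tsv (not the actual file)
df <- data.frame(
  Run  = c("A", "A", "A", "B"),
  Prot = c("P61981;P62258", "P62258", "P62258", "P62258"),
  stringsAsFactors = FALSE
)

# group_by() groups on the exact values of Run and Prot, so
# "P61981;P62258" and "P62258" fall into different groups.
# The dplyr:: prefixes make sure the dplyr versions are the ones called.
run_prot_counts <- df %>%
  dplyr::group_by(Run, Prot) %>%
  dplyr::summarise(count_by_Id = dplyr::n(), .groups = "drop") %>%
  as.data.frame()

print(run_prot_counts)
```

On this toy data the result has three rows: within run "A", "P61981;P62258" is counted once and "P62258" twice, and run "B" has one "P62258" row.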

edited Mar 10, 2023 at 3:08

jrcalabrese


asked Mar 10, 2023 at 2:57

RvS

Can you provide dput(df)?

– jrcalabrese, Mar 10, 2023 at 3:08

Based on your code, rows with different strings won't be counted as the same group. Try calling dplyr::summarise explicitly, in case summarise is being masked by plyr::summarise.

– akrun, Mar 10, 2023 at 3:12

Does this answer your question? Why does summarize or mutate not work with group_by when I load `plyr` after `dplyr`?

– Meisam, Mar 10, 2023 at 9:33

Thank you all for the feedback; I am new to R and can definitely use some help. @jrcalabrese: see below. There are more rows with 0's above what I copied, and more lines below with other variables. I can only copy part of the output because of the character limit: 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)), row.names = c(NA, -305177L), spec = structure(list(cols = list(File.Name = structure(list(), class = c("collector_character", "collector")), Run = structure(list(), class = c("collector_character", "collector")), Prot = structure(list(), class = c("collector_character",

– RvS, Mar 10, 2023 at 16:37

@akrun: I tried this, but it did not make any difference.

– RvS, Mar 10, 2023 at 16:40
