I am trying to make a pivot table of a large .tsv data set in R and export it back to Excel.

I tried using the dplyr functions:

summary <- df %>%
  group_by(Run, Prot) %>%
  summarize(count_by_Id = n()) %>%
  as.data.frame()

This almost works, but rows with e.g. "P61981;P62258" and "P62258" in the Prot column are counted together. How do I make R summarize only rows that have exactly the same string in the Prot column, so that the example above produces two separate rows (one for "P61981;P62258" and one for "P62258") in the summary I am creating?

  • Can you provide dput(df)? Commented Mar 10, 2023 at 3:08
  • Based on your code, they won't be counted as the same if they are different strings. Make sure to use dplyr::summarise, in case summarise is being masked by plyr::summarise. Commented Mar 10, 2023 at 3:12
  • Does this answer your question? Why does summarize or mutate not work with group_by when I load `plyr` after `dplyr`? Commented Mar 10, 2023 at 9:33
  • Thank you all for the feedback; I am new to R and can definitely use some help. @jrcalabrese: see below. There are more rows of 0's above what I copied, and more lines below with other variables. I can only copy a few because of the character limit: 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)), row.names = c(NA, -305177L), spec = structure(list(cols = list(File.Name = structure(list(), class = c("collector_character", "collector")), Run = structure(list(), class = c("collector_character", "collector")), Prot = structure(list(), class = c("collector_character", Commented Mar 10, 2023 at 16:37
  • @akrun: I tried this, but it did not make any difference. Commented Mar 10, 2023 at 16:40
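
The masking explanation suggested in the comments can be checked on a toy table. This is a minimal sketch, not a confirmed diagnosis: the data frame below is made up (only the column names Run and Prot and the two example strings come from the question), and it assumes dplyr 1.0.0 or later is installed. Calling the verbs with an explicit dplyr:: prefix rules out masking by plyr; if the real data still merges the two strings after that, the cause lies elsewhere.

```r
# Toy data: a hypothetical stand-in for the real table.
# Only the Run and Prot columns matter for the grouping.
df <- data.frame(
  Run  = c("run1", "run1", "run1", "run2"),
  Prot = c("P61981;P62258", "P62258", "P62258", "P62258"),
  stringsAsFactors = FALSE
)

# Explicit dplyr:: prefixes, so a later library(plyr) cannot mask these calls.
grouped <- dplyr::group_by(df, Run, Prot)
summary_df <- as.data.frame(
  dplyr::summarise(grouped, count_by_Id = dplyr::n(), .groups = "drop")
)

print(summary_df)
```

With the prefixes in place, "P61981;P62258" and "P62258" land in separate rows, because group_by compares the full string in Prot rather than any part of it.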
