
I have a column of comma-separated strings of varying length. I want to split each row of this column into separate columns, fill the missing values with "NA", and then count how often each string occurs. Here is a sample:

M <- data.frame(name = c("A", "B", "C"), mapped = c("X1, X3, X4", "X2, X4", "X2,X3, X4"))
  name     mapped
1    A X1, X3, X4
2    B     X2, X4
3    C  X2,X3, X4

I want to get the resulting data-frame like:

df <- data.frame(name = c("A","B", "C"), V1 = c("X1","NA", "NA"), V2 = c("NA", "X2","X2"), V3 = c("X3","NA", "X3"), V4 = c("X4","X4", "X4"))

  name V1 V2 V3 V4
1    A X1 NA X3 X4
2    B NA X2 NA X4
3    C NA X2 X3 X4

Then I want to count the occurrences of X1, X2, X3 and X4 in each column of the new data frame.

Thank you!

2 Answers

4

You could use separate_rows and pivot_wider:

library(tidyverse)

res <- M %>% 
  separate_rows(mapped) %>% 
  pivot_wider(names_from = mapped, values_from = mapped) %>% 
  relocate(order(colnames(.)))
res

# A tibble: 3 x 5
  name  X1    X2    X3    X4   
  <chr> <chr> <chr> <chr> <chr>
1 A     X1    NA    X3    X4   
2 B     NA    X2    NA    X4   
3 C     NA    X2    X3    X4   

Then, to count the non-missing values per column, apply colSums to the reshaped result (stored in res above) rather than to the original M:

colSums(!is.na(res[, -1]))
# X1 X2 X3 X4 
#  1  2  2  3

0

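Split on the comma together with any whitespace that follows it (otherwise " X3" and "X3" are counted as different values), unlist, then count:

table(unlist(strsplit(M$mapped, ",\\s*")))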
# X1 X2 X3 X4 
#  1  2  2  3 
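
If the wide layout from the question is also needed without extra packages, something along these lines might work in base R. This is only a sketch under the assumption that every value should get its own column; the names parts, lv, wide, res and counts are illustrative and do not come from the answers above.

```r
M <- data.frame(name = c("A", "B", "C"),
                mapped = c("X1, X3, X4", "X2, X4", "X2,X3, X4"),
                stringsAsFactors = FALSE)

# Split each row on a comma followed by optional whitespace
parts <- strsplit(M$mapped, ",\\s*")

# The sorted distinct values become the column names
lv <- sort(unique(unlist(parts)))

# For each row, keep a value where it is present and NA where it is absent
wide <- t(sapply(parts, function(p) ifelse(lv %in% p, lv, NA_character_)))
colnames(wide) <- lv

res <- data.frame(name = M$name, wide, stringsAsFactors = FALSE)

# Number of non-missing entries per value column
counts <- colSums(!is.na(res[, -1]))
```

Here the missing entries are real NA values rather than the string "NA", which is what lets is.na() do the counting.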
