Replacing strings in dataframe with column name in R

Question

I want to replace strings in an R data frame. Each row is a production order, each column is a resource / production step, and each cell holds the time at which that step was used. For this particular analysis the time values are not needed; instead, I want the column name in place of each timestamp.

The data looks something like this:

df_current <- data.frame(
    Prod.order = seq(123, 127),
    B100 = c("01:00:00", "02:00:00", "03:00:00", "04:00:00", "05:00:00"),
    `B100 (2)` = c(NA, NA, "06:00:00", "07:00:00", NA),
    D200 = c("02:00:00", NA, NA, NA, "06:00:00"),
    D300 = c(NA, NA, "04:00:00", "05:00:00", "07:00:00"),
    check.names = FALSE)

And I want it to look like this. (I also want to remove the NAs, but that isn't a problem.)

df_desired <- data.frame(
    Prod.order = seq(123, 127),
    B100 = c("B100", "B100", "B100", "B100", "B100"),
    `B100 (2)` = c("", "", "B100 (2)", "B100 (2)", ""),
    D200 = c("D200", "", "", "", "D200"),
    D300 = c("", "", "D300", "D300", "D300"),
    check.names = FALSE)

It seems so simple, but I haven't been able to figure it out. P.S. It would be amazing if the solution fits in a dplyr pipeline ;)

Thanks :D

You should add check.names = FALSE to your reproducible data; otherwise, the names of the variables in the data frame are checked to ensure that they are syntactically valid.
– Darren Tsai, Commented Aug 26, 2022 at 3:01
Hi, if any of the answers solved your question, you could consider accepting the one you prefer by clicking the check mark. Thanks!
– Darren Tsai, Commented Aug 27, 2022 at 5:39

Darren Tsai · Accepted Answer · 2022-08-26 02:59:48Z

3

A base R solution that uses names(df_current)[col(df_current)] to repeat each column name once per cell:

df_current[-1] <- ifelse(is.na(df_current), '', names(df_current)[col(df_current)])[, -1]
df_current

#   Prod.order B100 B100 (2) D200 D300
# 1        123 B100          D200
# 2        124 B100
# 3        125 B100 B100 (2)      D300
# 4        126 B100 B100 (2)      D300
# 5        127 B100          D200 D300
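
To see why the one-liner works, it can help to take it apart. The following is only a sketch of the intermediate steps, using the sample data from the question; the names idx, cell_names and filled are made up for illustration and are not part of the answer.

```r
# Same sample data as in the question.
df_current <- data.frame(
  Prod.order = seq(123, 127),
  B100 = c("01:00:00", "02:00:00", "03:00:00", "04:00:00", "05:00:00"),
  `B100 (2)` = c(NA, NA, "06:00:00", "07:00:00", NA),
  D200 = c("02:00:00", NA, NA, NA, "06:00:00"),
  D300 = c(NA, NA, "04:00:00", "05:00:00", "07:00:00"),
  check.names = FALSE)

# col() returns an integer matrix in which every cell holds its column index.
idx <- col(df_current)

# Indexing the names with that matrix yields one column name per cell,
# as a plain vector in column-major order.
cell_names <- names(df_current)[idx]

# is.na() on a data frame returns a logical matrix, and ifelse() keeps the
# shape of its test argument, so the result is a 5 x 5 character matrix.
filled <- ifelse(is.na(df_current), "", cell_names)

# Prod.order has no NAs, so its cells all become "Prod.order";
# that is why the answer drops the first column with [, -1].
filled[, "D200"]
# "D200" "" "" "" "D200"
```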

or with t():

df_current[-1] <- t(ifelse(t(is.na(df_current)), '', names(df_current)))[, -1]
df_current
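
The two transposes matter because ifelse() recycles a short vector down the columns of the test matrix. The sample data happen to be 5 x 5, so leaving out t() would not fail; it would just place the names incorrectly. A small sketch on a 2 x 3 matrix makes this visible; the objects nms, is_missing, wrong and right are hypothetical and exist only for this illustration.

```r
nms <- c("a", "b", "c")

# A 2 x 3 logical matrix standing in for is.na(df); TRUE marks a missing cell.
is_missing <- matrix(c(FALSE, TRUE, FALSE, FALSE, TRUE, FALSE), nrow = 2,
                     dimnames = list(NULL, nms))

# Without transposing, the names are recycled down each column,
# so cells receive names that belong to other columns.
wrong <- ifelse(is_missing, "", nms)

# After t(), each original column is a row, so recycling down the columns
# of the transposed matrix lines every name up with its own column.
# The outer t() restores the original orientation.
right <- t(ifelse(t(is_missing), "", nms))

wrong[, "b"]   # "c" "a"
right[, "b"]   # "b" "b"
```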

Onyambu · Accepted Answer · 2022-08-26 02:46:00Z

2

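library(tidyverse)

# For every character column, write '' where the value is NA and otherwise
# the name of the column currently being processed (cur_column()).
df_current %>%
  mutate(across(where(is.character), ~ifelse(is.na(.x), '', cur_column())))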

  Prod.order B100 B100 (2) D200 D300
1        123 B100          D200     
2        124 B100                   
3        125 B100 B100 (2)      D300
4        126 B100 B100 (2)      D300
5        127 B100          D200 D300

# Alternative: reshape to long format, replace the values, and reshape back.
df_current %>%
  pivot_longer(where(is.character)) %>%
  mutate(value = ifelse(is.na(value), '', name)) %>%
  pivot_wider()

# A tibble: 5 x 5
  Prod.order B100  `B100 (2)` D200   D300  
       <int> <chr> <chr>      <chr>  <chr> 
1        123 B100  ""         "D200" ""    
2        124 B100  ""         ""     ""    
3        125 B100  "B100 (2)" ""     "D300"
4        126 B100  "B100 (2)" ""     "D300"
5        127 B100  ""         "D200" "D300"
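
As a quick sanity check, the across() version can be compared against df_desired from the question. This is a sketch that assumes dplyr is installed and that both data frames are built with check.names = FALSE, so the column name B100 (2) is kept as written; the name result is only illustrative.

```r
library(dplyr)

# Input and target data frames, copied from the question.
df_current <- data.frame(
  Prod.order = seq(123, 127),
  B100 = c("01:00:00", "02:00:00", "03:00:00", "04:00:00", "05:00:00"),
  `B100 (2)` = c(NA, NA, "06:00:00", "07:00:00", NA),
  D200 = c("02:00:00", NA, NA, NA, "06:00:00"),
  D300 = c(NA, NA, "04:00:00", "05:00:00", "07:00:00"),
  check.names = FALSE)

df_desired <- data.frame(
  Prod.order = seq(123, 127),
  B100 = c("B100", "B100", "B100", "B100", "B100"),
  `B100 (2)` = c("", "", "B100 (2)", "B100 (2)", ""),
  D200 = c("D200", "", "", "", "D200"),
  D300 = c("", "", "D300", "D300", "D300"),
  check.names = FALSE)

# Replace each non-missing value with its column name and each NA with ''.
result <- df_current %>%
  mutate(across(where(is.character), ~ ifelse(is.na(.x), "", cur_column())))

# all.equal() compares values, column names and column types.
isTRUE(all.equal(result, df_desired))
```

where(is.character) leaves the integer Prod.order column untouched, which is why no explicit exclusion of that column is needed here.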

answered Aug 26, 2022 at 2:40

Roel Over a year ago

df_current %>% mutate(across(where(is.character), ~ifelse(is.na(.x), '', cur_column()))) worked perfectly!
