A grouped/longitudinal data frame takes the form

id <- c(1,1,2,2,2,3,3,3)
x <- c(0,0,1,1,1,1,1,1)
t1 <- c(1,0,1,1,0,0,1,0)
t2 <- c(2.1,1.5,1.7,2.0,2.0,2.8,2.1,2.1)
t3 <- c(21,18,23,26,27,25,31,22)

df <- data.frame(id, x, t1, t2, t3)

I assume column x is time-invariant, while t1, t2, and t3 are time-varying. For t1 and t2, I want to replace every value within each id by that id's first-row value, while keeping the values of x and t3 unchanged. The expected output is

  id x t1  t2 t3
1  1 0  1 2.1 21
2  1 0  1 2.1 18
3  2 1  1 1.7 23
4  2 1  1 1.7 26
5  2 1  1 1.7 27
6  3 1  0 2.8 25
7  3 1  0 2.8 31
8  3 1  0 2.8 22

To do this, I keep only the first row per id using the dplyr package, then add a column with the number of times each id occurs (ntimes), and finally repeat each row ntimes times. My code is below:

library(dplyr)

df <- df %>%
      group_by(id) %>%
      filter(row_number() == 1) %>%
      ungroup()
df$ntimes <- c(2,3,3)
df <- as.data.frame(lapply(df, rep, df$ntimes))
df <- subset(df, select = -c(ntimes)) 

This alters the t3 values. I also find hard-coding ntimes inconvenient, as I am dealing with a large data frame with many rows and columns. Is there a handy way to get the expected output? Help is greatly appreciated!

1 Answer

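Using dplyr::across and dplyr::first, you can overwrite t1 and t2 with their first value within each id (the .by argument requires dplyr 1.1.0 or later):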

library(dplyr, warn.conflicts = FALSE)

df |>
  mutate(across(c(t1, t2), first), .by = id)
#>   id x t1  t2 t3
#> 1  1 0  1 2.1 21
#> 2  1 0  1 2.1 18
#> 3  2 1  1 1.7 23
#> 4  2 1  1 1.7 26
#> 5  2 1  1 1.7 27
#> 6  3 1  0 2.8 25
#> 7  3 1  0 2.8 31
#> 8  3 1  0 2.8 22
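
If dplyr is not available, or is older than 1.1.0 and so lacks .by, a base R sketch with ave() should give the same result. The names cols and out are only illustrative and do not come from the question; the data frame is rebuilt here so the snippet runs on its own.

```r
# Rebuild the example data from the question
df <- data.frame(
  id = c(1, 1, 2, 2, 2, 3, 3, 3),
  x  = c(0, 0, 1, 1, 1, 1, 1, 1),
  t1 = c(1, 0, 1, 1, 0, 0, 1, 0),
  t2 = c(2.1, 1.5, 1.7, 2.0, 2.0, 2.8, 2.1, 2.1),
  t3 = c(21, 18, 23, 26, 27, 25, 31, 22)
)

# Columns to overwrite with their first value per id (illustrative name)
cols <- c("t1", "t2")

out <- df
out[cols] <- lapply(df[cols], function(v) {
  # Within each id group, repeat the first value for every row of the group
  ave(v, df$id, FUN = function(z) rep(z[1], length(z)))
})

out
```

Because only the columns listed in cols are reassigned, x and t3 are left as they were, and no per-id row count such as ntimes has to be written by hand.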