
I have the following dataframe:

df <- structure(list(country = c("US", "US", "US", "UK", "UK", "UK", 
"UK"), date = c("2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04", 
"2020-01-05", "2020-01-06", "2020-01-07"), y = 1:7, treatment = c(0, 
1, 0, 0, 0, 1, 0)), class = "data.frame", row.names = c(NA, -7L
))

Here is the df:

  country date       y treatment
  US      2020-01-01 1    0
  US      2020-01-02 2    1
  US      2020-01-03 3    0
  UK      2020-01-01 4    0
  UK      2020-01-02 5    0
  UK      2020-01-03 6    1
  UK      2020-01-04 7    0

I need to create a variable that reflects relative time before and after treatment, with zero on the date of treatment. In this case it should be equal to

relative_time = c(-1,0,1,-2,-1,0,1)

How can I create such a variable within each country group?

4 Comments

  • The code for df is not the same as the data frame displayed. Please fix the question. Commented Jul 2, 2020 at 15:07
  • Exactly. If the country is missing, this is not possible. You need the country group. Commented Jul 2, 2020 at 15:14
  • Sorry, took the freedom to edit the post and make the reproducible example right. Commented Jul 2, 2020 at 15:20
  • One addition: this example implicitly assumes that there is only one treatment per country. Readers might want to know that when adapting this example. Commented Jul 2, 2020 at 15:29
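
Following up on the last comment: if a country can have several treated dates, or none at all, one option is to measure time relative to the first treatment. This is only a minimal sketch, assuming dplyr is available. The data frame `df_multi`, the column `first_treatment`, and the FR rows are made up for illustration and are not part of the question.

```r
library(dplyr)

# Hypothetical data: US is treated twice, FR is never treated
df_multi <- data.frame(
  country   = c("US", "US", "US", "US", "FR", "FR"),
  date      = as.Date(c("2020-01-01", "2020-01-02", "2020-01-03",
                        "2020-01-04", "2020-01-01", "2020-01-02")),
  treatment = c(0, 1, 0, 1, 0, 0)
)

result_multi <- df_multi %>%
  group_by(country) %>%
  mutate(
    # First treated date in the group, or NA if the group is never treated
    first_treatment = if (any(treatment == 1)) min(date[treatment == 1]) else as.Date(NA),
    # Days since (or until) the first treatment, as a plain number
    relative_time   = as.numeric(date - first_treatment)
  ) %>%
  ungroup()

result_multi
```

Whether the first treatment is the right reference point depends on the analysis; the nearest or the most recent treatment would be equally easy to swap in.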

2 Answers

0

Also dplyr, but a bit shorter:

library(dplyr)      # provides %>%, mutate() and group_by()
library(lubridate)  # provides ymd()

df %>%
    mutate(date = ymd(date)) %>%
    group_by(country) %>%
    mutate(time_to_treatment = date - date[treatment == 1])

outputs

  country date           y treatment time_to_treatment
  <chr>   <date>     <int>     <dbl> <drtn>           
1 US      2020-01-01     1         0 -1 days          
2 US      2020-01-02     2         1  0 days          
3 US      2020-01-03     3         0  1 days          
4 UK      2020-01-04     4         0 -2 days          
5 UK      2020-01-05     5         0 -1 days          
6 UK      2020-01-06     6         1  0 days          
7 UK      2020-01-07     7         0  1 days   
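
The column above is a difftime ("-1 days"), while the question asks for plain numbers such as -1, 0, 1. A hedged variant that converts the result with `as.numeric()` is sketched below. It uses base `as.Date()` instead of `ymd()`, which is a substitution on my part, and the column name `relative_time` is taken from the question.

```r
library(dplyr)

# Data as given by the structure() call in the question
df <- data.frame(
  country   = c("US", "US", "US", "UK", "UK", "UK", "UK"),
  date      = c("2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04",
                "2020-01-05", "2020-01-06", "2020-01-07"),
  y         = 1:7,
  treatment = c(0, 1, 0, 0, 0, 1, 0)
)

result <- df %>%
  mutate(date = as.Date(date)) %>%
  group_by(country) %>%
  # Difference to the treated date, converted from difftime to numeric days.
  # Assumes exactly one treated row per country.
  mutate(relative_time = as.numeric(date - date[treatment == 1], units = "days")) %>%
  ungroup()

result$relative_time
# [1] -1  0  1 -2 -1  0  1
```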

2 Comments

lubridate needs to be loaded in order to use the function ymd
true. Will add it.
-1

The provided data is different from the displayed data, so the displayed data was used.

library(tidyverse)

df = read.table( text = 'country date       y treatment
                US      2020-01-01 1    0
                US      2020-01-02 2    1
                US      2020-01-03 3    0
                UK      2020-01-01 4    0
                UK      2020-01-02 5    0
                UK      2020-01-03 6    1
                UK      2020-01-04 7    0', header = TRUE)

# Number the rows within each country, find the row number of the treated row,
# and take the difference. Assumes rows are already ordered by date within country.
df <- df %>%
  group_by(country) %>%
  mutate(rank_row = row_number()) %>%
  ungroup() %>%
  mutate(treatment_row_number = if_else(treatment == 1, rank_row, 0L)) %>%
  group_by(country) %>%
  mutate(treatment_row_number = max(treatment_row_number)) %>%
  ungroup() %>%
  mutate(relative_time = rank_row - treatment_row_number) %>%
  select(-rank_row, -treatment_row_number)
 
df


# A tibble: 7 x 5
  country date           y treatment relative_time
  <fct>   <fct>      <int>     <int>         <int>
1 US      2020-01-01     1         0            -1
2 US      2020-01-02     2         1             0
3 US      2020-01-03     3         0             1
4 UK      2020-01-01     4         0            -2
5 UK      2020-01-02     5         0            -1
6 UK      2020-01-03     6         1             0
7 UK      2020-01-04     7         0             1

1 Comment

Um, this is not reliable: the data frame must be sorted correctly in advance, otherwise you will get wrong results. And if you don't convert the date column into a date, you cannot even sort by it. You might want to fix this in your answer.
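
One way to address the sorting concern is to convert the dates and order the rows before counting them. The sketch below is an illustration only, not the answerer's code. It reuses the displayed data in a deliberately shuffled order, and the name `df_shuffled` is hypothetical.

```r
library(dplyr)

# Displayed data from the question, with rows shuffled on purpose
df_shuffled <- data.frame(
  country   = c("UK", "US", "UK", "US", "UK", "US", "UK"),
  date      = c("2020-01-03", "2020-01-02", "2020-01-01", "2020-01-03",
                "2020-01-04", "2020-01-01", "2020-01-02"),
  y         = c(6L, 2L, 4L, 3L, 7L, 1L, 5L),
  treatment = c(1, 1, 0, 0, 0, 0, 0)
)

result_sorted <- df_shuffled %>%
  mutate(date = as.Date(date)) %>%
  arrange(country, date) %>%   # order rows by date within each country
  group_by(country) %>%
  # Row position minus the position of the (first) treated row
  mutate(relative_time = row_number() - which(treatment == 1)[1]) %>%
  ungroup()

result_sorted$relative_time
# [1] -2 -1  0  1 -1  0  1
```

Note that this counts rows rather than calendar days, so the two only agree when there is one row per day with no gaps. After `arrange()` the UK rows come first, which is why the order differs from the output in the answer.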
