I have the following data frame with three columns: FIRM, YEAR and DUMMY (0 or 1). For every FIRM, I want to scan all years and identify the first case where a DUMMY value of 1 is repeated in consecutive rows. Then I want to create a new column (NEW_COL in the table) that contains 0 for the years in that run of 1s, -1, -2, -3, ... for the years before it, and 1, 2, 3, ... for the years after it.

----------------------------------
| FIRM | YEAR | DUMMY | NEW_COL |
----------------------------------
|  A   | 2006 |   0   |    0    |
|  A   | 2007 |   1   |    0    |
|  A   | 2008 |   0   |    0    |
|  B   | 2006 |   0   |    0    |
|  B   | 2007 |   0   |   -1    |
|  B   | 2008 |   1   |    0    |
|  B   | 2009 |   1   |    0    |
|  B   | 2010 |   0   |    1    |
|  B   | 2011 |   0   |    2    |
|  B   | 2012 |   1   |    3    |
|  B   | 2013 |   1   |    4    |
----------------------------------

1 Answer

Here is a data.table solution.

Based on your description, I think the year 2006 for firm B should be -2 rather than 0.

library(data.table)

dt <- fread(' FIRM  YEAR  DUMMY NEW_COL
 A  2006  0  0 
 A  2007  1  0 
 A  2008  0  0 
 B  2006  0  0 
 B  2007  0  -1 
 B  2008  1  0 
 B  2009  1  0 
 B  2010  0  1 
 B  2011  0  2 
 B  2012  1  3 
 B  2013  1  4 ')


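# Split each firm's rows into runs of identical consecutive DUMMY values.
# flag: TRUE for rows in a run of 1s longer than one row; grp: run id.
dt[, c("flag", "grp") := .((.N > 1) & (DUMMY == 1), .GRP),
   by = .(FIRM, rleid(DUMMY))]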
dt
#>     FIRM YEAR DUMMY NEW_COL  flag grp
#>  1:    A 2006     0       0 FALSE   1
#>  2:    A 2007     1       0 FALSE   2
#>  3:    A 2008     0       0 FALSE   3
#>  4:    B 2006     0       0 FALSE   4
#>  5:    B 2007     0      -1 FALSE   4
#>  6:    B 2008     1       0  TRUE   5
#>  7:    B 2009     1       0  TRUE   5
#>  8:    B 2010     0       1 FALSE   6
#>  9:    B 2011     0       2 FALSE   6
#> 10:    B 2012     1       3  TRUE   7
#> 11:    B 2013     1       4  TRUE   7

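# Within each firm, set the first flagged run to 0 and mark any later
# flagged runs with a placeholder (99) that the next step overwrites.
dt[flag == TRUE, result := fifelse(grp == min(grp), 0, 99), by = .(FIRM)]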
dt
#>     FIRM YEAR DUMMY NEW_COL  flag grp result
#>  1:    A 2006     0       0 FALSE   1     NA
#>  2:    A 2007     1       0 FALSE   2     NA
#>  3:    A 2008     0       0 FALSE   3     NA
#>  4:    B 2006     0       0 FALSE   4     NA
#>  5:    B 2007     0      -1 FALSE   4     NA
#>  6:    B 2008     1       0  TRUE   5      0
#>  7:    B 2009     1       0  TRUE   5      0
#>  8:    B 2010     0       1 FALSE   6     NA
#>  9:    B 2011     0       2 FALSE   6     NA
#> 10:    B 2012     1       3  TRUE   7     99
#> 11:    B 2013     1       4  TRUE   7     99



# Per firm: count years backwards from the start of the 0 run and forwards
# from its end; firms without a qualifying run get 0 throughout.
dt[, result := lapply(.SD, function(x) {
  if (any(!is.na(x))) {
    position_0_head <- head(which(x == 0), 1)
    position_0_tail <- tail(which(x == 0), 1)
    x[1:position_0_head] <- YEAR[1:position_0_head] - YEAR[position_0_head]
    x[position_0_tail:length(x)] <- YEAR[position_0_tail:length(x)] - YEAR[position_0_tail]
  } else {
    x <- 0
  }
  x
}), .SDcols = "result", by = .(FIRM)]

dt[,.SD,.SDcols = !c("flag","grp")]
#>     FIRM YEAR DUMMY NEW_COL result
#>  1:    A 2006     0       0      0
#>  2:    A 2007     1       0      0
#>  3:    A 2008     0       0      0
#>  4:    B 2006     0       0     -2
#>  5:    B 2007     0      -1     -1
#>  6:    B 2008     1       0      0
#>  7:    B 2009     1       0      0
#>  8:    B 2010     0       1      1
#>  9:    B 2011     0       2      2
#> 10:    B 2012     1       3      3
#> 11:    B 2013     1       4      4

Created on 2020-04-25 by the reprex package (v0.3.0)
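
As a possible alternative, the same logic can be sketched as a single helper built on base R's rle(), applied once per firm. This is only a sketch and not part of the reprex above: the function name event_time and the column name result2 are made up for illustration, and it assumes the rows are sorted by YEAR within each FIRM.

```r
library(data.table)

# Hypothetical helper (not from the answer above): years relative to the
# first run of two or more consecutive 1s in `dummy`.
event_time <- function(year, dummy) {
  runs   <- rle(dummy)
  ends   <- cumsum(runs$lengths)      # last row index of each run
  starts <- ends - runs$lengths + 1L  # first row index of each run
  hit    <- which(runs$values == 1 & runs$lengths > 1)
  out    <- rep(0, length(year))
  if (length(hit) == 0L) return(out)  # no qualifying run: all 0
  first  <- starts[hit[1]]
  last   <- ends[hit[1]]
  before <- seq_along(year) < first
  after  <- seq_along(year) > last
  out[before] <- year[before] - year[first]  # -1, -2, ... before the run
  out[after]  <- year[after] - year[last]    # 1, 2, ... after the run
  out
}

dt2 <- data.table(
  FIRM  = c(rep("A", 3), rep("B", 8)),
  YEAR  = c(2006:2008, 2006:2013),
  DUMMY = c(0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1)
)
setorder(dt2, FIRM, YEAR)  # the helper assumes chronological order per firm
dt2[, result2 := event_time(YEAR, DUMMY), by = FIRM]
dt2$result2
#> [1]  0  0  0 -2 -1  0  0  1  2  3  4
```

Because the offsets are computed from YEAR rather than from row positions, gaps in the years show up as gaps in the counter, which matches how the answer above behaves.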

1 Comment

Hi Frank, Thanks for working on this. Appreciate it.
