
I have a dataset like this:

library(dplyr)
set.seed(505)
tempdf1 <- 
  data.frame(
    y = rnorm(400),
    x1 = rnorm(400),
    x2 = rnorm(400),
    x3 = sample(1:5, 40, replace = TRUE),
    out = rep(1:2, each = 200),
    imp = rep(1:4, each = 20)
  )

There are two different outcomes as indicated in the column out, {1,2}.

I want to fit the model separately for these two outcomes.

So I use lapply like this.

tempdf2 <- split(tempdf1, tempdf1$out)

lapply(tempdf2, function(df) {
  df %>%
    group_by(imp) %>%
    do(model = lm(y ~ x1 + x2, data = .)) %>%
    as.list() %>%
    .[[-1]]
})

I have a column x3, a categorical variable with 5 values, {1,2,3,4,5}. Now I want to fit the same model as above (2 different outcomes, 2 separate models, using lapply(tempdf2, function(df) { df %>% group_by(imp) %>%....), but on subsets of the data that each leave out one value of x3.

When out = 1

    Case 1) Exclude x3 value 5 and fit the model where x3 values are {1,2,3,4}: out = 1

        lapply(tempdf2,
          function(df) {
            df %>%
              group_by(imp) %>%....

    Case 2) Exclude x3 value 4 and fit the model where x3 values are {1,2,3,5}: out = 1

        lapply(tempdf2,
          function(df) {
            df %>%
              group_by(imp) %>%....

    Case 3) Exclude x3 value 3 and fit the model where x3 values are {1,2,4,5}: out = 1

        lapply(tempdf2,
          function(df) {
            df %>%
              group_by(imp) %>%....

    Case 4) Exclude x3 value 2 and fit the model where x3 values are {1,3,4,5}: out = 1

        lapply(tempdf2,
          function(df) {
            df %>%
              group_by(imp) %>%....

    Case 5) Exclude x3 value 1 and fit the model where x3 values are {2,3,4,5}: out = 1

        lapply(tempdf2,
          function(df) {
            df %>%
              group_by(imp) %>%....

Now repeat the above process where out=2

This is where I am struggling and I need help.

I don't know how to add a second, inner lapply loop that fits the above model only on subsets of x3, excluding the rows with one level of x3 at a time.

I would also like to know which model corresponds to which excluded level of x3 and which out value {1,2}, so that I can keep track of the results. Thanks in advance for any help.

1 Answer

Instead of nesting multiple loops and groupings, build a data frame with every combination of out, imp and x3, then filter the original data frame for each combination and fit the model.

library(dplyr)

# Find the distinct values of out, imp and x3
out <- unique(tempdf1$out)
imp <- unique(tempdf1$imp)
x3 <- sort(unique(tempdf1$x3))

# Build a data frame with every combination
# (its x3 column holds the case number, 1 to 5)
models <- expand.grid(out, imp, x3)
names(models) <- c("out", "imp", "x3")

# Loop over the combinations: case k drops the rows where x3 == 6 - k,
# so case 1 excludes x3 = 5, case 2 excludes x3 = 4, and so on
output <- lapply(1:nrow(models), function(i) {
  tempdf <- tempdf1 %>%
    filter(x3 != (6 - models$x3[i]), imp == models$imp[i], out == models$out[i])
  lm(y ~ x1 + x2, data = tempdf)
})

# Name each list element after its conditions
names(output) <- paste("Out=", models$out, "Imp=", models$imp, "Case:", models$x3)
output

Notice that the name of each list element records the model's conditions:

$`Out= 1 Imp= 1 Case: 1`

Call:
lm(formula = y ~ x1 + x2, data = tempdf)

Coefficients:
(Intercept)           x1           x2  
   -0.24430     -0.36825      0.01901  


$`Out= 2 Imp= 1 Case: 1`

Call:
lm(formula = y ~ x1 + x2, data = tempdf)

Coefficients:
(Intercept)           x1           x2  
   0.290079    -0.271359    -0.008404  
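If you would rather keep the nested lapply structure the question asks about, something along these lines might work. This is only a sketch in base R: it rebuilds tempdf1 from the question so that it runs on its own, and the names nested, x3_levels and the "excl_x3_" labels are made up for illustration. Each list level is named, so the outcome, the excluded x3 value and the imputation can all be read off the path to a model.

```r
# Data from the question, rebuilt so the sketch is self-contained
set.seed(505)
tempdf1 <- data.frame(
  y   = rnorm(400),
  x1  = rnorm(400),
  x2  = rnorm(400),
  x3  = sample(1:5, 40, replace = TRUE),
  out = rep(1:2, each = 200),
  imp = rep(1:4, each = 20)
)

# Levels of x3 to leave out, one at a time (illustrative name)
x3_levels <- sort(unique(tempdf1$x3))

# Outer loop: one list element per outcome, named "1" and "2" by split()
nested <- lapply(split(tempdf1, tempdf1$out), function(df_out) {
  # Middle loop: one list element per excluded x3 level
  by_excl <- lapply(x3_levels, function(excl) {
    kept <- df_out[df_out$x3 != excl, ]
    # Inner loop: one model per imputation, named by the imp value
    lapply(split(kept, kept$imp), function(d) lm(y ~ x1 + x2, data = d))
  })
  names(by_excl) <- paste0("excl_x3_", x3_levels)
  by_excl
})

# Model for out = 1, rows with x3 == 5 excluded, imp = 1
nested[["1"]][["excl_x3_5"]][["1"]]
```

Here the labels state the excluded value directly, so "excl_x3_5" corresponds to Case 1 in the question. The flat expand.grid version above is shorter; the nested version mainly helps if you want to process all models for one outcome or one excluded level together.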