I have a set of survey design data, one file per quarter/year, saved in RDS format on my disk. The data look like this:

Year  Quarter  Age
2010     1     27
2010     1     32 
2010     1     34
...

I'm using svymean(~Age, design = data20101, na.rm = TRUE) to estimate the mean of the Age variable for each year/quarter file. I would like to do this more efficiently: run the function over all the files and save the results in a single data frame.

The output I'm looking for is a data frame like this:

Year  Quarter  Mean_Age
2010     1       31.1
2010     1       32.4 
2010     1       30.9
2010     1       34.5
2010     2       36.3
2010     2       31.2
2010     2       30.8
2010     2       35.6
...

Regards,

1 Answer

lapply() together with the dplyr package should do the work. Here is an example on plain data frames.

library(dplyr)

df1 <- data.frame(Year = rep(2010, 6),
                  Quarter = c(1, 1, 1, 2, 2, 2),
                  Age = c(27, 32, 34, 30, 28, 21))

df2 <- data.frame(Year = rep(2010, 6),
                  Quarter = c(1, 1, 1, 2, 2, 2),
                  Age = c(23, 19, 31, 41, 26, 23))

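# Collect the data frames in a list so they can be processed in one pass
df.list <- list(df1, df2)

# For each data frame, compute the mean age per Year/Quarter
mean.list <- lapply(df.list, function(x){
  x %>%
    group_by(Year, Quarter) %>%
    summarize(Mean_Age = mean(Age, na.rm = TRUE))
})

# Stack the per-file summaries into a single data frame
mean.df <- do.call(rbind, mean.list)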

mean.df

The result will be:

# A tibble: 4 x 3
# Groups:   Year [1]
   Year Quarter Mean_Age
  <dbl>   <dbl>    <dbl>
1  2010       1     31  
2  2010       2     26.3
3  2010       1     24.3
4  2010       2     30 
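
The example above uses plain data frames and an unweighted mean(). Since the question is about survey design objects and svymean(), the same lapply pattern can be sketched for that case. This is only a sketch under assumptions: the folder name, the file names, the weight column w, and the make_design() helper are all hypothetical and exist only so the code runs on its own. It also assumes each design object carries Year and Quarter in its variables, with one year/quarter per file.

```r
library(survey)

# Hypothetical setup: write two tiny design objects to a temporary folder.
# In practice the .rds files would already be on disk.
dir <- file.path(tempdir(), "designs")
dir.create(dir, showWarnings = FALSE)

make_design <- function(year, quarter, age) {
  d <- data.frame(Year = year, Quarter = quarter, Age = age, w = 1)
  svydesign(ids = ~1, weights = ~w, data = d)
}
saveRDS(make_design(2010, 1, c(27, 32, 34)), file.path(dir, "data20101.rds"))
saveRDS(make_design(2010, 2, c(30, 28, 21)), file.path(dir, "data20102.rds"))

# List every RDS file in the folder (sorted alphabetically by list.files)
files <- list.files(dir, pattern = "\\.rds$", full.names = TRUE,
                    ignore.case = TRUE)

# Read each design, estimate the mean age, and return a one-row data frame
rows <- lapply(files, function(f) {
  design <- readRDS(f)
  est <- svymean(~Age, design = design, na.rm = TRUE)
  data.frame(
    Year     = design$variables$Year[1],
    Quarter  = design$variables$Quarter[1],
    Mean_Age = as.numeric(coef(est)),  # point estimate
    SE       = as.numeric(SE(est))     # its standard error
  )
})

# Stack the per-file rows into a single data frame
result <- do.call(rbind, rows)
result
```

Reading one file at a time inside lapply keeps only a single design object in memory at once, which may matter if the quarterly files are large.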