I am trying to run a for loop that randomly subsamples a dataset using the sample_n command. I also want to name each new subsampled data frame "df1", "df2", "df3", and so on, where the number corresponds to i in the for loop. I know the way I wrote this code is wrong, and I understand why I am getting the error. How can I combine "df" and i inside the loop so that the result is named df1, df2, etc.? Happy to clarify if needed. Thanks!

for (i in 1:9){ print(get(paste("df", i, sep=""))) = sub %>% group_by(dietAandB) %>% sample_n(1) }

Error in print(get(paste("df", i, sep = ""))) = sub %>% group_by(dietAandB) %>% : target of assignment expands to non-language object

2 Answers

Instead of get, which only retrieves an existing object, you could use assign, which creates an object under a name built at run time.

Using some fake example data:

library(dplyr, warn.conflicts = FALSE)

sub <- data.frame(
  dietAandB = LETTERS[1:2]
)

for (i in 1:2) {
  assign(paste0("df", i), sub %>% group_by(dietAandB) %>% sample_n(1) %>% ungroup())
}
df1
#> # A tibble: 2 × 1
#>   dietAandB
#>   <chr>    
#> 1 A        
#> 2 B
df2
#> # A tibble: 2 × 1
#>   dietAandB
#>   <chr>    
#> 1 A        
#> 2 B
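For context on why the original loop fails and assign() does not, here is a minimal base-R sketch. The toy sub data frame and the plain sample() call are assumptions for illustration, not the asker's data or sampling scheme. It also shows mget(), which collects the separately named objects back into one list if they are needed together later.

```r
# Hypothetical toy data standing in for the asker's `sub`.
sub <- data.frame(dietAandB = c("A", "A", "B", "B"), value = 1:4)

# get() only looks up an existing object, so a get() call cannot be the
# target of an assignment. tryCatch() records that the attempt errors.
failed <- tryCatch({
  get(paste0("df", 1)) <- sub
  FALSE
}, error = function(e) TRUE)

# assign() takes the name as a string and creates the object.
for (i in 1:3) {
  assign(paste0("df", i), sub[sample(nrow(sub), 2), ])
}

# mget() gathers the objects by name into a named list.
dfs <- mget(paste0("df", 1:3))
names(dfs)
```

Having to reach for mget() afterwards is a hint that a list would have been the simpler container from the start.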

But the more R-ish way to do this is to store the data frames in a list instead of creating separate objects:

df <- list()
for (i in 1:2) {
  df[[i]] <- sub %>% group_by(dietAandB) %>% sample_n(1) %>% ungroup()
}

df
#> [[1]]
#> # A tibble: 2 × 1
#>   dietAandB
#>   <chr>    
#> 1 A        
#> 2 B        
#> 
#> [[2]]
#> # A tibble: 2 × 1
#>   dietAandB
#>   <chr>    
#> 1 A        
#> 2 B

Or, more concisely, use lapply instead of a for loop:

df <- lapply(1:2, function(i) sub %>% group_by(dietAandB) %>% sample_n(1) %>% ungroup())

df
#> [[1]]
#> # A tibble: 2 × 1
#>   dietAandB
#>   <chr>    
#> 1 A        
#> 2 B        
#> 
#> [[2]]
#> # A tibble: 2 × 1
#>   dietAandB
#>   <chr>    
#> 1 A        
#> 2 B
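If the df1, df2, ... names from the question still matter, the list elements can carry them. The following is a sketch under assumptions: the toy sub data, the seed, and the n_samples, dfs and combined names are all made up for illustration. lapply() calls the function once per index and returns the results as a list; names<- then labels each element, and dplyr's bind_rows() with .id can stack them with a column recording which sample each row came from.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical toy data: two diet groups with three rows each.
sub <- data.frame(
  dietAandB = rep(c("A", "B"), each = 3),
  value = 1:6
)

set.seed(42)  # arbitrary seed, only so the draws can be repeated

n_samples <- 3
# One grouped draw per index; the results come back as a list.
dfs <- lapply(seq_len(n_samples), function(i) {
  sub %>%
    group_by(dietAandB) %>%
    sample_n(1) %>%
    ungroup()
})
names(dfs) <- paste0("df", seq_len(n_samples))

# Access one subsample by name.
dfs$df2

# Stack all subsamples; the "sample" column holds df1, df2, df3.
combined <- bind_rows(dfs, .id = "sample")
```

dfs[["df2"]] works the same way as dfs$df2 and is handy when the name itself is built with paste0().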
2 Comments

Thank you, thank you, thank you. This was beautiful. I really need to learn lapply and similar functions, so this was very helpful. I still don't understand exactly what it is doing, but I will read the documentation. Quick question: is ungroup unnecessary since sub stays unchanged, or am I wrong?
It isn't strictly necessary, and sub itself is unchanged either way. But without ungroup() each sampled data frame is still grouped by dietAandB, so in general I would call ungroup() as a good habit and to avoid running into issues later on.

It depends on the sample size, which is missing from your question. As an example, I used the mtcars dataset (32 rows) and drew three subsamples of size 20 from it:

library(dplyr)
for (i in 1:3) {
  assign(paste0("df", i), sample_n(mtcars, 20))
}
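As a side note, sample_n() has been superseded by slice_sample() since dplyr 1.0.0, and random draws are only repeatable if a seed is set first. A hedged sketch of the same mtcars example in that style follows; the draw() helper and the seed value are hypothetical, chosen just for illustration.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical helper: three subsamples of 20 rows each for a given seed.
draw <- function(seed) {
  set.seed(seed)
  dfs <- lapply(1:3, function(i) slice_sample(mtcars, n = 20))
  names(dfs) <- paste0("df", 1:3)
  dfs
}

first  <- draw(123)
second <- draw(123)

# Same seed, same subsamples.
identical(first, second)
```

On a grouped data frame, slice_sample() draws within each group, so it can replace sample_n() in the grouped examples as well.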
