
I'm running RStudio in Windows 7. I have written a master script that generates 57 new R scripts, each with commands to run a function based on two parameters:

vector1 <- c(1:19)
vector2 <- c(1:3)

First, the master script uses two for-loops (one using the index 'abc' for vector1, one using the index 'def' for vector2) to generate each of the 57 scripts in my working directory, which follow this filename convention:

run_inference_<<vector1[abc]>>_<<vector2[def]>>.R

That part runs successfully - each of the 57 scripts is generated with the correct commands inside. My working directory now contains files run_inference_1_1.R, run_inference_1_2.R, etc.
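For context, the generation step looks roughly like this (a simplified sketch; run_inference() is just a placeholder for the actual function, and the real commands written into each script are omitted):

for (abc in seq_along(vector1)) {
  for (def in seq_along(vector2)) {
    # placeholder body: the real scripts contain the actual inference commands
    script_lines <- sprintf("run_inference(%d, %d)  # placeholder call",
                            vector1[abc], vector2[def])
    writeLines(script_lines,
               paste0("run_inference_", vector1[abc], "_", vector2[def], ".R"))
  }
}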

The final thing I want to do is run all 57 scripts from the master script, and run them simultaneously. I've tried the following inside the for-loop:

system(paste0("Rscript run_inference_", abc, "_", def, ".R"), wait = FALSE)

This does not work. However, if I open one of the 57 generated scripts and run it manually, I get the desired result from that script. This tells me the issue is in the system() command I've written.

None of the 57 scripts is computationally intensive (yet), and the test I want to run now should take about 2 minutes on my PC. How can I edit my system() command to execute all 57 scripts simultaneously, please?

3 Comments
  • sapply(paste0("Rscript run_inference_", abc, "_", def, ".R"), system, wait = F) No comment on whether generating and running 57 scripts is a good idea for your problem or not. Commented Sep 22, 2017 at 17:55
  • 1
    I do echo Vlo's unspoken concern that there is probably a better way to do this. Why do you want to write out the scripts instead of doing something like having a function that takes the required input and just running that function with the 57 different inputs you desire? Commented Sep 22, 2017 at 17:56
  • I have a well-resourced PC that should be able to handle very intensive jobs, and colleagues more familiar with parallel computing have told me this task should be doable given what's available. We shall see! Commented Sep 22, 2017 at 18:24

1 Answer


You don't do this by calling system once with a big script, unless the program you're running knows how to parallelise the script itself. You do this by calling system multiple times from different R processes.

# build the full vector of 57 commands (19 x 3 parameter combinations)
params  <- expand.grid(abc = vector1, def = vector2)
scripts <- paste0("Rscript run_inference_", params$abc, "_", params$def, ".R")

# make lots of R processes, assuming the script to be called won't eat CPU
cl <- parallel::makeCluster(30)

parallel::parLapply(cl, scripts, function(script) system(script))
parallel::stopCluster(cl)
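
If the scripts really are as light as described, a lighter-weight alternative (a sketch along the lines of the sapply suggestion in the comments, not part of the answer code above) is to skip the cluster and fire off all 57 Rscript processes asynchronously from the master session, letting the operating system schedule them:

# launch every command without waiting for it to finish;
# assumes Rscript is on the PATH (on Windows you may need the full path to Rscript.exe)
invisible(lapply(scripts, system, wait = FALSE))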

1 Comment

Thank you. I'll experiment with this and see how it goes. Running big jobs like this is new to me, so this is a great 'baptism of fire', as it were.
