
Consider the following code:

#pragma omp parallel
for (int run = 0; run < 10; run++)
{
  std::vector<int> out;
  #pragma omp for
  for (int i = 0; i < 1'000'000; i++)
  {
    ...
  }
}

The intent is to spawn the OpenMP threads only once, before the outer iterations (which are supposed to run sequentially), and then schedule the inner iterations* multiple times on the same existing threads.

However, the outer loop is not marked #pragma omp single. Should I assume that every thread runs the same outer code at the same time, thereby a) concurrently modifying the run variable at ill-advised times, and b) creating a possible conflict in the instantiation of out? Or are these variables implicitly private, not shared between threads, because they are declared inside the parallel region?

How should this execute according to the OpenMP specification, and what do the various implementations do in practice?

* My actual code is a bit more complex than this; in practice I intend to put the inner for loop in an orphaned function that is called by an outer function which handles setting up the parallel execution.

  • There is an OpenMP specification, but that's not an official "standard". Perhaps it's just your wording, but the question sounds as if you are looking for an answer in the C++ standard (which does not cover OpenMP).
  • The run and out variables are thread-private in your case, so there are no conflicts. Live demo: godbolt.org/z/MPocj1acq. The outer loop is run by each thread individually, and the inner loop is run in parallel by all the threads, which seems to be what you want. However, if it logically makes sense, I wouldn't be afraid of putting omp parallel for just before the inner loop. On modern systems, creation of threads is fast, and OpenMP runtimes are able to reuse threads under the hood instead of creating them repeatedly.
  • @463035818_is_not_an_ai I was indeed thinking of the OpenMP specification and not the C++ standard itself; poor wording on my part. Edited the question!
  • @DanielLangr In your demo, the first thread actually finishes both iterations before the second thread gets a chance to run, so in practice each thread just runs the exact same outer code as fast as it can, independently, until it encounters the omp for? How does each thread coordinate to determine which batch of inner iterations it should run, particularly when the schedule is not static?
  • Aren't you confusing something again with your wording? You say that #pragma omp parallel does not mark the outer iterations, but it actually does; it does not mark the inner iterations. #pragma omp parallel defines the parallel region, and #pragma omp for distributes the iterations of your inner loop between the threads. Your use case looks correct and should not present any issues.
