OpenMP with nested loops

Figure 5.26 on page 150 of the book shows an efficient way of parallelizing the inner loop. I don't understand why this approach does not create nested threads. With the #pragma omp parallel enclosing the entire loop, I would expect that each thread would run its own nested loop and when #pragma omp for is encountered, each thread would further split into multiple threads. Essentially, wouldn't we end up calculating the entire nested loop as many times as the number of threads? I would appreciate any clarification.


Re: OpenMP with nested loops

Hi John,

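Worksharing directives such as #pragma omp for do not create threads: they divide the loop iterations among the threads that already exist in the enclosing parallel region. So with #pragma omp parallel around the whole nest and #pragma omp for on the inner loop, every thread does run the outer loop itself, but each inner-loop iteration is executed by only one thread. The loop nest is therefore computed once in total, not once per thread.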
(Note that #pragma omp parallel for does create threads, because it is shorthand for #pragma omp parallel immediately followed by #pragma omp for. This is a common source of confusion.)

