There is an application with two levels of parallel regions. For small problem sizes, users are supposed to use only the first level, since the second level doesn't scale as well as the first.
Only experienced users who solve very large problems may turn on the second level.
Currently, for small problems, users set OMP_NUM_THREADS=AA and run the application. Since OMP_NESTED is false by default, they are not penalized by the second-level parallel regions.
For large problems, they set OMP_NUM_THREADS=AA,BB (BB is usually small, up to 4) and OMP_NESTED=TRUE to use both levels.
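To make the setup concrete, here is a minimal sketch of a two-level structure like the one described. The function name `count_inner_executions` is made up for illustration, and the real application's regions obviously do more work. It only counts how many threads reach the inner body.

```c
/* Runs an outer parallel region containing a nested inner region and
 * returns how many times the inner body executed, summed over all threads.
 * Team sizes are not set here; they come from the environment
 * (OMP_NUM_THREADS and the nesting settings). */
int count_inner_executions(void)
{
    int executions = 0;   /* shared by both levels */

    #pragma omp parallel          /* first level */
    {
        #pragma omp parallel      /* second level: more than one thread only if nesting is active */
        {
            #pragma omp atomic
            executions++;
        }
    }
    return executions;
}
```

Built with OpenMP enabled (for example `gcc -fopenmp`), one would expect this to return AA when OMP_NUM_THREADS=AA and nesting is off, because each outer thread runs the inner region with a team of one. With OMP_NUM_THREADS=AA,BB and OMP_NESTED=TRUE it should return AA*BB, subject to the runtime's thread limits.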
In 5.0, OMP_NESTED is deprecated. If someone keeps using the old setting for small problems, what is the expected behaviour of the application? Users are typically unaware of what the specification says.
If it behaves like the current OMP_NESTED=TRUE setting, users will also get AA threads on the second level, AA*AA threads in total, and performance will be terrible.
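The arithmetic behind that worry can be sketched as a small helper. The name `total_threads` and its parameters are hypothetical. It assumes that nesting levels beyond the end of the OMP_NUM_THREADS list reuse the last listed value, which is how a single AA ends up applied to both levels.

```c
/* Upper bound on the number of threads running the innermost region when
 * `active_levels` nested levels are active. `per_level` holds the
 * OMP_NUM_THREADS list with `listed` entries; levels past the end of the
 * list reuse the last entry. Returns 1 if the list is empty. */
long total_threads(const int *per_level, int listed, int active_levels)
{
    if (listed <= 0)
        return 1;

    long total = 1;
    for (int level = 0; level < active_levels; level++) {
        int idx = level < listed ? level : listed - 1;
        total *= per_level[idx];
    }
    return total;
}
```

For example, with AA=16 and one active level the bound is 16. With two active levels and only AA listed it becomes 16*16 = 256, whereas the list 16,4 gives 16*4 = 64.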
I noticed in the spec that OMP_MAX_ACTIVE_LEVELS offers some control, but its default is the maximum possible value, so it doesn't help in the case above.
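If the application source can be changed, one possible workaround is to cap the number of active levels from inside the program with the standard `omp_set_max_active_levels` routine, rather than relying on users to set OMP_MAX_ACTIVE_LEVELS. This is only a sketch; `restrict_to_first_level` is a made-up name, and a real application would skip the call when the user has asked for both levels.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Caps nesting at one active parallel level and returns the resulting
 * limit. When compiled without OpenMP there is nothing to cap, so the
 * function simply returns 1. */
int restrict_to_first_level(void)
{
#ifdef _OPENMP
    omp_set_max_active_levels(1);
    return omp_get_max_active_levels();
#else
    return 1;
#endif
}
```

Called once at startup, before the first parallel region, this should keep the second level serialized regardless of the default for OMP_MAX_ACTIVE_LEVELS.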
So I would like to know the reason for deprecating OMP_NESTED and what the alternative is. This needs to be articulated in the deprecation section.
On the other hand, I feel the current behaviour of OMP_NESTED=true with OMP_NUM_THREADS=AA yielding AA*AA threads is illogical.
I would prefer to activate as many parallel levels as there are numbers listed in OMP_NUM_THREADS. OMP_NUM_THREADS=AA uses just one level even if the binary supports two levels. OMP_NUM_THREADS=AA,BB enables both levels.
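The rule proposed here, one active level per number listed, can be sketched as a small parser. This illustrates the proposal only; it is not what any runtime is specified to do, and `count_listed_levels` is a hypothetical helper.

```c
#include <stddef.h>

/* Counts the comma-separated entries in an OMP_NUM_THREADS-style string.
 * Returns 0 for NULL or an empty string. The entries themselves are not
 * validated. */
int count_listed_levels(const char *value)
{
    if (value == NULL || *value == '\0')
        return 0;

    int levels = 1;
    for (const char *p = value; *p != '\0'; p++) {
        if (*p == ',')
            levels++;
    }
    return levels;
}
```

An application could pass the result for `getenv("OMP_NUM_THREADS")` to `omp_set_max_active_levels` at startup, treating 0 as 1, so that "AA" yields one active level and "AA,BB" yields two.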
As we proceed with retiring OMP_NESTED (deprecated in 5.0, with removal planned for OpenMP 6.0), we will replace the feature with other means.
For instance, OMP_NUM_THREADS already accepts a list of thread counts for the different levels. Without also setting OMP_NESTED=true, however, you would still get no real nested threading with OpenMP 4.5 and OpenMP 5.0 implementations. This caused a lot of confusion in the past, which is why we decided to simplify OpenMP usage.
Having said that, with OpenMP 6.0 something like OMP_NUM_THREADS=AA,BB will be sufficient to tell the OpenMP implementation that the user wants nesting enabled. We are currently working out the exact conditions that should signal nesting to the implementation. A future, post-5.0 technical report will contain more information about these conditions.
Does that answer your question?
My concern is the case where OMP_NUM_THREADS=AA is set and the code supports two levels of parallelism.
What is the behaviour once OMP_NESTED is removed (I mean in 6.0)?
If it is the current OMP_NESTED=true behaviour, I will get AA*AA threads at the second level, which is a disaster.
I would expect the runtime to fully respect OMP_NUM_THREADS:
AA activates only one level.
AA,BB activates 2 levels.
If this is the case, deprecating OMP_NESTED is a good step forward.