Section 2.9.2, target Construct:
"The effect of an access to a threadprivate variable in a target region is unspecified."
This appears to me to be overly defensive. If nothing else, threadprivate variables should be accessible in parallel regions inside target regions.
I would like to argue that the "master" thread executing at the outermost level of the target region should have its own set of threadprivate variables, in the same way as any other thread. If a new data environment is created (e.g. if the if() clause evaluates to true), this master thread gets a new private set of threadprivate variables. If the region is executed on the host, it uses the same threadprivate data as the thread that encountered the region.
Parallel regions on the target would then behave the same way as "regular" parallel regions. The encountering thread becomes the master thread of the created team.
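As a minimal sketch of the "regular" host behaviour referred to here (host only, no TARGET construct involved): the thread that encounters the parallel construct becomes the master of the team and keeps using its own threadprivate copy. The names counter and master_shares_copy are made up for illustration.

```c
#include <assert.h>

/* One copy of counter per thread; the name is illustrative only. */
static int counter = 0;
#pragma omp threadprivate(counter)

/* Returns the encountering thread's copy of counter after the region. */
int master_shares_copy(void)
{
    int seen_by_master = -1;

    counter = 1;                      /* written by the encountering thread */

    #pragma omp parallel shared(seen_by_master)
    {
        #pragma omp master
        {
            /* The master is the encountering thread, so it reads 1 here. */
            seen_by_master = counter;
            counter = 42;
        }
        /* The other threads have their own copies, untouched here. */
    }

    assert(seen_by_master == 1);
    return counter;                   /* the master's write is still visible */
}
```

Calling master_shares_copy() from the initial thread should give 42, whether or not OpenMP is enabled at compile time. The proposal is that a parallel region inside a TARGET region would follow the same pattern.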
TARGET UPDATE should also be allowed for threadprivate variables and would operate between the threadprivate storage of the thread that created the TARGET data environment (either explicitly via TARGET DATA or implicitly with a bare TARGET) and the thread in the TARGET region encountering the TARGET UPDATE.
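For comparison, here is a rough sketch of how TARGET UPDATE moves an ordinary (non-threadprivate) variable today. It assumes an OpenMP 4.0 compiler and places the update on the host side, inside a TARGET DATA region. The function name update_roundtrip is hypothetical. The suggestion above is that a threadprivate variable could be handled the same way.

```c
#include <assert.h>

/* Increments x inside a target region and fetches the result explicitly. */
int update_roundtrip(int start)
{
    int x = start;

    /* x is copied to the device data environment on entry only. */
    #pragma omp target data map(to: x)
    {
        /* x is already present, so this map does not copy it again. */
        #pragma omp target map(tofrom: x)
        {
            x += 1;
        }

        /* Bring the value from the device copy back to the host copy. */
        #pragma omp target update from(x)
    }

    assert(x == start + 1);
    return x;
}
```

If the region falls back to the host, or the pragmas are ignored, the host and "device" copies are the same storage and the result is unchanged.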
One issue, I guess, is the initial state of the threadprivate data on the target and its association with the thread that encountered the TARGET construct. I would argue that any threadprivate data accessed in the TARGET region should by default behave as if the variable appeared in a MAP(ALLOC:---) clause. The variables are accessible, but uninitialized. Different behaviour can be achieved by an explicit MAP for the variable.
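To show what ALLOC mapping means in practice, here is a small sketch using an ordinary array rather than a threadprivate variable. The function name sum_of_doubles and the buffer name scratch are made up for illustration. Storage mapped with ALLOC exists on the target but starts out uninitialized, so the region writes each element before reading it.

```c
#include <assert.h>
#include <stdlib.h>

/* Doubles each input element in a scratch buffer and returns the sum. */
int sum_of_doubles(const int *in, int n)
{
    int *scratch;
    int sum = 0;

    assert(n > 0);
    scratch = malloc((size_t)n * sizeof *scratch);
    if (scratch == NULL)
        abort();

    /* in is copied to the target, scratch is only allocated there,
       and sum is copied both ways. */
    #pragma omp target map(to: in[0:n]) map(alloc: scratch[0:n]) map(tofrom: sum)
    {
        for (int i = 0; i < n; ++i) {
            scratch[i] = 2 * in[i];   /* write before any read */
            sum += scratch[i];
        }
    }

    free(scratch);
    return sum;
}
```

With the input {1, 2, 3, 4} this should return 20. Under the proposal, a threadprivate variable touched in a TARGET region would start out like scratch does here unless an explicit MAP says otherwise.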
This can be written in several ways, but one possible wording is:
"Code in a TARGET region is executed as if there is a master thread operating in the data environment of the TARGET region. Data operated on in this region is the data on the target, either because the target actually executes the code or because the run-time propagates the effect on the data to the target. If a parallel construct or similar is encountered in the target region, the effective master thread becomes thread 0 of the created team, in the same way as when a host thread encounters a parallel construct.
"If a new data environment is created, the master thread of the TARGET region has its own set of threadprivate data that is separate from that of the thread that encountered the TARGET construct. The initial state of these variables is as if they were listed in a MAP clause with a map type of ALLOC."
The OpenMP 4 standard mentions restrictions on threadprivate data in several places, so I guess that there is something special about these variables that I have not thought about. I think that my suggestions make TARGET environments easier to use, as they become similar to "ordinary" environments. I do not think this imposes so many restrictions that it becomes hard to implement on accelerators. Also, I don't think this is an incompatible change to the current standard.