Accelerators and the target Constructs

Forum for the public review of the OpenMP 4.0 API Release Candidates. (Read Only)

Post by PGK

I have a few thoughts and questions regarding the very interesting inclusion of accelerators in the second OpenMP 4.0 Public Release Candidate (RC2). First of all, I was surprised that the word "accelerator" does not appear anywhere in the document. I searched for it because of the title of the earlier TR1, though I see that the body of TR1 also avoids the term. I would nevertheless have appreciated a flagged entry point for such a major new addition to OpenMP. I feel that a schematic diagram of a prototypical accelerator might also be needed, and examples would be extremely valuable.

I understand OpenMP 4.0 must target a broad range of accelerators, but I should first say that I will most likely be working with (and thinking here of) GPUs.

The following C++ code shows a loop parallelised across a default accelerator device, say a GPU. I've kept the parallel and for constructs separate here for clarity (though I wonder whether a new combined construct could help reduce the verbosity).

#pragma omp target map(a[:4096])
#pragma omp teams
#pragma omp distribute
#pragma omp parallel
#pragma omp for
for (int i = 0; i < 4096; ++i)

My first question: Is the order of the parallel and distribute constructs significant? Is the position of the target construct significant?
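To make the nesting question concrete, here is a minimal sketch of the same loop written with a single combined directive. It assumes the combined `target teams distribute parallel for` form (which, as far as I can tell, appears in the final 4.0 specification rather than in RC2) and a hypothetical function name and loop body chosen purely for illustration.

```cpp
// Sketch only: fill_squares and its loop body are illustrative assumptions.
// The pragma is ignored when OpenMP is disabled, so the loop still runs on the host.
void fill_squares(int* a, int n) {
  // One combined directive: offload (target), create a league of teams (teams),
  // split iterations across teams (distribute), then across threads (parallel for).
  #pragma omp target teams distribute parallel for map(from: a[0:n])
  for (int i = 0; i < n; ++i) {
    a[i] = i * i;
  }
}
```

Written this way, the order target, teams, distribute, parallel for is fixed by the directive name itself, which sidesteps the ordering question for the common case.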

If I don't use the "declare target" directive on a function which I then attempt to use within the scope of a target construct, should I receive an error, or might the implementation silently fall back to a host implementation?
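For reference, this is the kind of usage the question is about: a function placed between `declare target` and `end declare target`, then called from a target region. It is a sketch under the assumption that the delimited form of the directive is used; `square` and `squares_on_device` are hypothetical names.

```cpp
// Ask the implementation to also compile square() for the device.
#pragma omp declare target
int square(int x) { return x * x; }
#pragma omp end declare target

// Hypothetical caller: invokes square() from inside a target region.
void squares_on_device(int* out, int n) {
  #pragma omp target map(from: out[0:n])
  for (int i = 0; i < n; ++i) {
    out[i] = square(i);
  }
}
```

The open question is what should happen if the two `declare target` lines are removed while the call inside the target region stays.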

Following on from the last question: if, say, a target device is not supported by the implementation, is there a way to tell that my target region did not (or will not) run on the accelerator?
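One possible way to check this at run time is sketched below. It assumes the runtime routines `omp_is_initial_device()` and `omp_get_num_devices()`, which I believe are part of the final OpenMP 4.0 API; the wrapper function names are hypothetical, and the guards make the code fall back to "ran on host" when OpenMP 4.0 is not enabled.

```cpp
#if defined(_OPENMP) && _OPENMP >= 201307
#include <omp.h>
#endif

// Returns true when the target region executed on the host (the initial device).
bool target_region_ran_on_host() {
  int on_host = 1;  // assumption: host, unless the region reports otherwise
#if defined(_OPENMP) && _OPENMP >= 201307
  #pragma omp target map(from: on_host)
  {
    on_host = omp_is_initial_device();
  }
#endif
  return on_host != 0;
}

// Number of target devices the runtime reports (0 when OpenMP is disabled).
int visible_device_count() {
#if defined(_OPENMP) && _OPENMP >= 201307
  return omp_get_num_devices();
#else
  return 0;
#endif
}
```

This only answers the "did not run" half of the question after the fact; whether a region will be offloaded before it is launched probably remains implementation-specific.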

On page 46 of RC2, it is stated for C/C++: "When the size of the array dimension is not known, the length must be specified explicitly." Would a C++11 std::array also be compatible here?
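Until that is clarified, a workaround that should be uncontroversial is to map the underlying storage through a raw pointer with an explicit length. The sketch below assumes this pointer-based array section is acceptable; `increment_all` is a hypothetical helper, and whether the `std::array` object itself may appear in a map clause is exactly the open question.

```cpp
#include <array>
#include <cstddef>

// Map the contiguous storage behind a std::array via its data() pointer.
template <std::size_t N>
void increment_all(std::array<int, N>& values) {
  int* p = values.data();  // raw pointer, so the length must be given explicitly
  #pragma omp target map(tofrom: p[0:N])
  for (std::size_t i = 0; i < N; ++i) {
    p[i] += 1;
  }
}
```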

If I omit any use of the map clause, what will be the default map-type for variables referenced within a target region?
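In the meantime, the default can be avoided altogether by spelling out a map-type for every variable the region touches. A minimal sketch, with a hypothetical `scale` function:

```cpp
// Every variable used in the region has an explicit map-type,
// so nothing depends on the implicit default.
void scale(const float* in, float* out, int n, float factor) {
  #pragma omp target map(to: in[0:n], factor, n) map(from: out[0:n])
  for (int i = 0; i < n; ++i) {
    out[i] = factor * in[i];  // in is only read, out is only written
  }
}
```

Here `to` copies data to the device on entry only and `from` copies it back on exit only, which also avoids the extra transfers a two-way default would imply.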

In Section 2.9.3 (target update), there are references to "to" and "from" clauses. I can't find any reference to these clauses elsewhere. Shouldn't it be the map clause that is used here?
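My reading is that these clauses act on data that is already mapped by an enclosing `target data` region, refreshing one side from the other without creating a new mapping. The sketch below illustrates that reading; it is an assumption about the intended semantics, and `refine` is a hypothetical function.

```cpp
// Sketch: synchronise host and device copies in the middle of a target data region.
// Returns the host's view of a[0] after the first device step. Assumes n > 0.
float refine(float* a, int n) {
  float snapshot = 0.0f;
  #pragma omp target data map(tofrom: a[0:n])
  {
    // a is already present on the device, so this map should not copy again.
    #pragma omp target map(tofrom: a[0:n])
    for (int i = 0; i < n; ++i) a[i] += 1.0f;

    // Device -> host: make the intermediate result visible to the host.
    #pragma omp target update from(a[0:n])
    snapshot = a[0];
    a[0] = 10.0f;  // host-side modification

    // Host -> device: push the modified value back before the next step.
    #pragma omp target update to(a[0:n])

    #pragma omp target map(tofrom: a[0:n])
    for (int i = 0; i < n; ++i) a[i] *= 2.0f;
  }
  return snapshot;
}
```

If that reading is right, `map` would not be a substitute here, since a map clause on an already-mapped variable would not trigger a transfer.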

Best Regards,