Final Comment Draft for the OpenMP 4.1 Specification -- 2.8.2 wrote:
19 * linear clause, see Section 188.8.131.52 on page 204
Final Comment Draft for the OpenMP 4.1 Specification -- 184.108.40.206 wrote:
25 The linear clause declares one or more list items to be private to a SIMD lane and to have a
26 linear relationship with respect to the iteration space of a loop.
11 A list item that appears in a linear clause is subject to the private clause semantics described
12 in Section 220.127.116.11 on page 189 except as noted. In addition, if the val or uval modifier is used,
13 the value of the new list item on each iteration of the associated loop(s) corresponds to the value of
14 the original list item before entering the construct plus the logical number of the iteration times
19 The value corresponding to the sequentially last iteration of the associated loops is assigned to the original list
However, the latter is hard to apply to the former because:
- It is unclear what private (18.104.22.168) means in the context of the omp declare simd construct: 22.214.171.124 makes no provision for this case;
- There is no 'associated loop' for the omp declare simd construct, and so no 'logical number of iterations' in this case;
- Since the clause is applied to formal routine parameters, it is unclear where privatization and initialization happen (on the caller or the callee side) and what 'original list item value' means (there is no value associated with a formal parameter).
I would suggest adding a description of the linear clause to 2.8.2, or at least clarifying the applicability of 126.96.36.199 to 2.8.2. I also suggest limiting the applicability of val/uval/ref to the omp declare simd construct, for the reasons given in my two previous notes on this matter. Below are some ideas and discussion on what the linear rules for 2.8.2 might look like.
1. For non-reference parameters, it seems reasonable to assume that the set of linear values for a simd chunk (the set of values of the new list items as described in 188.8.131.52) is passed from the caller through the parameter marked linear. So the rule might look like:
"A parameter marked as linear gets the value of the logically first lane at entry to the function plus the logical number of the corresponding lane times the linear step."
The last-value rule is not applicable in this case: by-value parameters are not visible outside the function.
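The by-value rule can be modeled in scalar C++. This is only a minimal sketch under stated assumptions: the chunk width `kLanes` and the names `lane_value`, `scale_at`, and `call_chunk` are hypothetical, and the pragma is honored only by an OpenMP-enabled compiler (it is ignored otherwise).

```cpp
#include <array>
#include <cassert>

constexpr int kLanes = 4;  // hypothetical simd chunk width, chosen for illustration

// Proposed rule: lane k of a by-value linear parameter sees first + k * step.
constexpr int lane_value(int first, int lane, int step) {
    return first + lane * step;
}

// Hypothetical SIMD-enabled routine: `i` is linear with step 1, `data` is uniform.
#pragma omp declare simd uniform(data) linear(i : 1)
int scale_at(const int* data, int i) {
    return 2 * data[i];
}

// Scalar model of one simd chunk: each lane calls the routine with the value
// that the proposed rule assigns to it. Nothing is written back to the caller,
// which matches the remark that the last-value rule does not apply here.
std::array<int, kLanes> call_chunk(const int* data, int first) {
    std::array<int, kLanes> out{};
    for (int lane = 0; lane < kLanes; ++lane) {
        out[lane] = scale_at(data, lane_value(first, lane, 1));
    }
    return out;
}
```

Calling `call_chunk(data, 1)` evaluates `scale_at` for the indices 1, 2, 3, and 4, one per modeled lane.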
2. Linear(val)/linear(uval) reference parameters.
According to the specification, we have an incoming reference which refers to the linear value. It seems reasonable to assume that the set of references to linear values for a simd chunk (the set of references to new list items as described in 184.108.40.206) is passed from the caller through the parameter marked linear(val). In the case of linear(uval), presumably the address of the new list item for the logically first lane is passed. In both cases the same rule as above looks applicable:
"A parameter marked as linear(val)/linear(uval) gets the value of the logically first lane at entry to the function plus the logical number of the corresponding lane times the linear step."
A reference to the first lane is enough to apply this rule, thus linear(uval) seems a reasonable optimization.
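The difference between what is passed for linear(val) and for linear(uval) can be sketched as follows. This is a scalar model and not compiler output: `kLanes`, `read_val`, and `read_uval` are hypothetical names, and pointers stand in for the per-lane references.

```cpp
#include <array>
#include <cassert>

constexpr int kLanes = 4;  // hypothetical simd chunk width

// linear(val) model: the caller passes one reference per lane
// (represented here as pointers), and each lane reads through its own.
std::array<int, kLanes> read_val(const std::array<const int*, kLanes>& lane_refs) {
    std::array<int, kLanes> seen{};
    for (int lane = 0; lane < kLanes; ++lane) {
        seen[lane] = *lane_refs[lane];
    }
    return seen;
}

// linear(uval) model: only the reference for the logically first lane is
// passed; the other lane values are derived as first + lane * step.
std::array<int, kLanes> read_uval(const int& first_lane, int step) {
    std::array<int, kLanes> seen{};
    for (int lane = 0; lane < kLanes; ++lane) {
        seen[lane] = first_lane + lane * step;
    }
    return seen;
}
```

As long as the caller's new list items really are linear, both models observe the same per-lane values, which is why passing a single reference looks sufficient.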
There is also the value-out case for by-reference parameters. Since the incoming reference already represents a set of linear values, the original last-value rule is inapplicable: the entire set of values is updated. Here the difference between linear(val) and linear(uval) comes into play, because in the latter case we don't have all the references needed for a natural update. Something like this should describe the desired behavior:
"If val is specified, the corresponding locations may be naturally updated in each lane if the linearity rule above is preserved. If uval is specified, then the value corresponding to the logically first lane is updated, and the other lanes get the value of the logically first lane plus the logical number of the lane times the linear step."
Note: Such updates have limited applicability because they must stay consistent with the linearity rule that the caller loop applies to the referenced value. This means that the updated value cannot be used on the same logical iteration after the function call (if the update is an increment of the original value) or on the logically same iteration before the update happens (if the update is an initialization for a logical iteration via reassignment from a different linear value). Otherwise the listed uses will observe values inconsistent with 220.127.116.11.
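A scalar model of the two update behaviors may help here. It is a sketch under assumptions: `kLanes`, `update_val`, and `update_uval` are hypothetical names, and the update is a uniform increment, chosen because it preserves linearity.

```cpp
#include <array>
#include <cassert>

constexpr int kLanes = 4;  // hypothetical simd chunk width

// val model: the callee holds a reference for every lane (pointers here),
// so each location is updated directly. Adding the same delta to every
// lane keeps the values linear with the original step.
void update_val(const std::array<int*, kLanes>& lane_refs, int delta) {
    for (int* location : lane_refs) {
        *location += delta;
    }
}

// uval model: only the first lane's reference is available. That location
// is updated, and the remaining lane values are re-derived from it as
// first + lane * step.
std::array<int, kLanes> update_uval(int& first_lane, int delta, int step) {
    first_lane += delta;
    std::array<int, kLanes> lanes{};
    for (int lane = 0; lane < kLanes; ++lane) {
        lanes[lane] = first_lane + lane * step;
    }
    return lanes;
}
```

In this model both variants end with the same per-lane values; they differ only in which memory locations the callee actually writes.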
According to the specification (as far as I understand it), the value of the underlying reference itself is linear. Note that the referenced type should be an integral or pointer type in C++. The wording for linear(ref) is the following:
Final Comment Draft for the OpenMP 4.1 Specification -- 18.104.22.168 wrote:
15 If the ref modifier is used, the value of the new list item on each iteration of the
16 associated loop(s) corresponds to the value of the variable resulting from applying the linear-step
17 times the logical number of the iteration as a subscript to the original list item
It is totally unclear how a subscript may be applied to a reference to an integral type (think int&), and applying it to a reference to a pointer type will produce unintended results, as far as I understand the intent. The desired behavior (at least in C++) is, to my understanding, the following:
"If the ref modifier is used, the value of the new list item on each iteration of the associated loop(s) corresponds to the value of the access by the reference formed as the reference from the original list item incremented by the linear-step times the logical number of the iteration."
For omp declare simd, the specification is done for by-reference parameters (or dummy arguments in Fortran). The underlying references of these parameters are linear. I would suggest expressing this explicitly:
"If the ref modifier is used, each lane gets its own address of the value for the lane, to be read/updated as usual (without any clause). In addition, address values are assumed to be formed as follows: the address of the value for the logically first lane at entry to the function plus the logical number of the lane times the linear step. The plus here is an address arithmetic operation and is subject to value type adjustments. Usual language-specific dereferencing is applied to access/update values through this set of addresses."
It is worth noting additionally that, since references to new list items are always passed for a by-ref linear parameter, it would be nice to allow linear(ref) and linear(val) to be applied simultaneously to the same parameter, for compilers that allocate the new list items for the simd lanes of a linear clause back-to-back.
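The address rule suggested for linear(ref) can be illustrated with a short C++ sketch. The names `lane_ref`, `increment`, and `increment_all` are hypothetical. The pragmas take effect only with an OpenMP-enabled compiler, and the scalar behavior is the same either way.

```cpp
#include <cassert>
#include <cstddef>

// Suggested address rule: lane k reaches its value at
// (address of the first lane's value) + k * step, using ordinary
// element-wise pointer arithmetic, followed by a normal dereference.
inline int& lane_ref(int& first_lane, std::ptrdiff_t lane, std::ptrdiff_t step) {
    return *(&first_lane + lane * step);
}

// Hypothetical SIMD-enabled routine: the reference `x` itself is linear,
// so consecutive lanes refer to consecutive int objects.
#pragma omp declare simd linear(ref(x) : 1)
void increment(int& x) {
    x += 1;
}

// Caller loop: a[i] yields references whose addresses advance by one
// element per iteration, which is consistent with linear(ref(x) : 1).
void increment_all(int* a, int n) {
    #pragma omp simd
    for (int i = 0; i < n; ++i) {
        increment(a[i]);
    }
}
```

Here `lane_ref(a[0], 3, 1)` refers to the same object as `a[3]`, which is the "address of the first lane plus lane number times step" reading of the rule.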