## Using Reduction in Fortran

General OpenMP discussion

### Using Reduction in Fortran

Hi, all
Recently, I tried to compare the speed using 'Reduction' in my simple OpenMP program to the one using 'workshare' & 'FORALL'. Three 3000*3000 matrice were assigned as B(i,j)=REAL(i), C(i,j)=REAL(i+j) and A(i,j)=REAL(i-j). Finally, all the A(i,j)*B(i,j)*C(i,j) were summed up . Program 1 used the Reduction while program 2 did with the intrinsic function SUM().
However, the results were different!! Idon't know why.... Would someone please tell me? Thanks a lot.

Program 1
Code: Select all
`    !\$OMP parallel        !\$OMP DO            DO j=1,N            !\$OMP parallel                !\$OMP do reduction(+:sumA)                    DO i=1,N                        B(i,j)=real(i)                        C(i,j)=real(i+j)                        A(i,j)=real(i-j)                        sumA=sumA+A(i,J)*B(i,j)*C(i,j)                    end do                !\$OMP end do            !\$OMP end parallel            end do        !\$OMP end do     !\$OMP end parallel      sumA=sumA/real(N*N)   `

The summation was 2250749749.91690
Program 2
Code: Select all
`    !\$OMP parallel        !\$OMP workshare            forall(i=1:N, j=1:N) B(i,j)=real(i)        !\$OMP end workshare nowait                !\$OMP workshare                   forall(i=1:N, j=1:N) C(i,j)=real(i+j)        !\$OMP end workshare        !\$OMP workshare                   forall(i=1:N, j=1:N) A(i,j)=real(i-j)        !\$OMP end workshare  !\$OMP end parallel      sumA=Sum(A*B*C)/real(N*N)`

The summation was 2250749750.08717
cyFeng

### Re: Using Reduction in Fortran

The reduction clause will add the variables together in an indeterminate order, thus resulting in different roundoff characteristics.
lfm

Posts: 135
Joined: Sun Oct 21, 2007 4:58 pm
Location: OpenMP ARB

### Re: Using Reduction in Fortran

lfm wrote:The reduction clause will add the variables together in an indeterminate order, thus resulting in different roundoff characteristics.

However, when the term A(i,j)*B(i,j) + C(i,j) were summed up, I always got the same result "753000.916666667". I am so confused...
cyFeng

### Re: Using Reduction in Fortran

Actually the problem is that your first program has an error. There is a race on SumA. If you change your program to something like:
Code: Select all
`!\$omp parallel    !\$omp do reduction(+:sumA)    Do j=1,N      !\$omp parallel        !\$omp do reduction(+:sumA)`

then you should see the expected results be the "same". You still might see a difference based on the precision of the variables and roundoff. However, I used double precision and it gave the exact same values for both program 1 and 2 with your value of N.
ejd

Posts: 1025
Joined: Wed Jan 16, 2008 7:21 am

### Re: Using Reduction in Fortran

ejd wrote:Actually the problem is that your first program has an error. There is a race on SumA. If you change your program to something like:
Code: Select all
`!\$omp parallel    !\$omp do reduction(+:sumA)    Do j=1,N      !\$omp parallel        !\$omp do reduction(+:sumA)`

then you should see the expected results be the "same". You still might see a difference based on the precision of the variables and roundoff. However, I used double precision and it gave the exact same values for both program 1 and 2 with your value of N.

Actually, I used the double precision for the matrice A, B, C and variable sumA. And I had tried that way you did above. It didn't work for my computer (Core 2 duo, 2.66 GHz).
If the program 1 I used was wrong, I can't explain why the both programs got the same results when the summation was done over A(i,j)*B(i,j)+C(i,j)...
cyFeng

### Re: Using Reduction in Fortran

> And I had tried that way you did above. It didn't work for my computer (Core 2 duo, 2.66 GHz).

I don't know what you mean when you say that it didn't work. How did it fail?

> If the program 1 I used was wrong, I can't explain why the both programs got the same results when the summation was done over A(i,j)*B(i,j)+C(i,j)...

I don't know what you mean by this.

What I do know is that in program 1, each thread at the outer level is sharing sumA. When a thread encounters the nested parallel region it creates a new team and each thread has a private copy of sumA it uses for the reduction. At the end of the inner parallel region, the reduction is finished by (in this case) adding the private values together and storing this value back into the shared sumA (thus giving a race condition on sumA at the outer level). The joys of parallel is that it might indeed work - depending on the timing.
ejd

Posts: 1025
Joined: Wed Jan 16, 2008 7:21 am

### Re: Using Reduction in Fortran

ejd wrote:> And I had tried that way you did above. It didn't work for my computer (Core 2 duo, 2.66 GHz).
I don't know what you mean when you say that it didn't work. How did it fail?.

I still got two different results on my computer (2250749749.91690 and 2250749750.08717) When I modified the Program 1 by adding the reduction(+:sumA) on the outer loop.

ejd wrote:> If the program 1 I used was wrong, I can't explain why the both programs got the same results when the summation was done over A(i,j)*B(i,j)+C(i,j)...
I don't know what you mean by this.

What I do know is that in program 1, each thread at the outer level is sharing sumA. When a thread encounters the nested parallel region it creates a new team and each thread has a private copy of sumA it uses for the reduction. At the end of the inner parallel region, the reduction is finished by (in this case) adding the private values together and storing this value back into the shared sumA (thus giving a race condition on sumA at the outer level). The joys of parallel is that it might indeed work - depending on the timing.

Sorry, let me make it clearly. I could get the same results from Program 1 and Program 2 when I didn't modify the Program 1 as your suggestion, but changed "sumA= sumA + A(i,j)*B(i,j) * C(i,j) " to "sumA=sumA + A(i,j)*B(i,j) + C(i,j)". Set N=3000, the result was 753000.91666667 for both programs. I wonder why the operation 'multiplication' and 'addition' could affect my results significantly.
cyFeng

### Re: Using Reduction in Fortran

In program 1, if you don't fix the race condition, then you can get differing results with a small value of N. For example, here is the output with N=40 and OMP_NESTED set to true:

Program 1:
sumA: 8806800.0
sumA/real(N*N): 5504.25

Program 2:
Sum(A*B*C): 8741200.0
sumA/real(N*N): 5463.25
By fixing the race condition in Program 1, then the two results are the same.

As for the numbers not being the same when N=3000, it is a roundoff problem. I incorrectly posted yesterday when I said that I could get the same numbers using your value of N. I can get the same values if I use double precision for N up to around 2400. After that, I had to switch to real*16 to get the same values. Using real*16 I get:

Program 1:
sumA: 2.025674774925E+16
sumA/real(N*N): 2.2507497499166666666666666666666665E+9

Program 2:
Sum(A*B*C): 2.025674774925E+16
sumA/real(N*N): 2.2507497499166666666666666666666665E+9

Which makes sense, since you only expect the following (in general):

Code: Select all
`Language type       Significant Digitsreal*4                     6-9real*8                    15-17real*16                   33-36`
ejd

Posts: 1025
Joined: Wed Jan 16, 2008 7:21 am

### Re: Using Reduction in Fortran

After using real*16, I could get the same result for both programs.