
### How to parallelize this fortran code by openmp

Posted: Fri Jan 18, 2008 11:02 pm
Dear all,

I have a Fortran code that calculates particle motion. It always contains many big loops. Someone suggested that I use OpenMP to speed it up. The following is the pseudocode.

```fortran
do i=1,ni
   a(i)=c1
   c2=0
   ks=lfirst(i)
   ke=last(i)
   do k=ks,ke
      c2=c2+f(k)
      c3=f1+f2
      ................
      c4=f+f3
   enddo
   f(i)=c2+c3
enddo
```

As can be seen, the iterations of loop i are independent of each other, but those of loop k are not. Most of the calculation happens in loop k, which is short (up to 8 iterations). Loop i is always large (up to 200,000 iterations). I only want to use OpenMP for loop i. Any hints would be greatly appreciated.

Cheers,
Gonski

### Re: How to parallelize this fortran code by openmp

Posted: Sun Jan 20, 2008 7:30 am
I will start with a simple question about "f". Is it a scalar or an array? You use it in two lines:

```fortran
c2 = c2 + f(k)
c4 = f + f3
```

and I am not sure if it is a mistake or if c4 is an array.

### Re: How to parallelize this fortran code by openmp

Posted: Sun Jan 20, 2008 10:11 am
```fortran
!$omp parallel do
do i = 1,ni
   ...
enddo
```

Is that all you want?

### Re: How to parallelize this fortran code by openmp

Posted: Sun Jan 20, 2008 12:03 pm
I think he wants more than that. He wants to deal with all the dependencies. For example, variables ke and ks need to be private so he is going to have to add a private clause.

```fortran
!$omp parallel do private(ke,ks)
do i = 1, ni
   ...
enddo
```

Then there are the others that need to be handled - c2, c3, and c4. That is why I started with a simple question about "f", so I could understand more about "f" and "c4".
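A minimal free-form sketch of where this is heading, assuming `f` is an array and that the elided part of the loop body has no other side effects. The names `fin` and `fout`, the array sizes, and the index ranges are made up for illustration. Writing the result into a separate `fout` instead of back into `f` is also an assumption: it sidesteps the open question of whether one iteration reads an `f(k)` that another iteration writes.

```fortran
program outer_loop_sketch
   implicit none
   integer, parameter :: ni = 1000, nk = 8
   integer :: i, k, ks, ke
   integer :: lfirst(ni), last(ni)
   real :: fin(ni), fout(ni), c2

   ! Hypothetical setup: each i sums up to nk entries starting at i.
   do i = 1, ni
      fin(i) = 1.0
      lfirst(i) = i
      last(i) = min(i + nk - 1, ni)
   end do

   ! Only the outer loop is parallel. Scalars assigned inside the loop
   ! are private; the arrays are shared, and every iteration writes a
   ! distinct element fout(i).
   !$omp parallel do default(shared) private(i, k, ks, ke, c2)
   do i = 1, ni
      c2 = 0.0
      ks = lfirst(i)
      ke = last(i)
      do k = ks, ke
         c2 = c2 + fin(k)
      end do
      fout(i) = c2
   end do
   !$omp end parallel do

   ! With fin = 1.0 everywhere, fout(i) is the number of terms summed.
   do i = 1, ni
      if (abs(fout(i) - real(last(i) - lfirst(i) + 1)) > 1.0e-6) error stop 'mismatch'
   end do
   print *, 'ok'
end program outer_loop_sketch
```

Built with OpenMP enabled (for example `gfortran -fopenmp`), the serial check at the end should pass for any number of threads. Whether the real code can be treated this way depends on the answer to the question about `f`.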

### Re: How to parallelize this fortran code by openmp

Posted: Sun Jan 20, 2008 2:37 pm
Thank you very much.
Over the past two days I have read some documents on OpenMP. Based on what I learned, I have restructured my code in the style shown below. All the calculations in a big loop are now put into a subroutine like "pr" below. I tested it on Linux and it works. Interestingly, the following code gives the same results in sequential and parallel runs. Do you think it is OK? Another point is that my code contains situations like k below. I want k to accumulate values from different threads when a preset condition is met, so k is treated as DEFAULT(SHARED) here. Is this correct?
Cheers,
Gonski

```fortran
      Program test

      common/nn/k
      open(1,file='test.txt')
      n=10
      k=1

c.....OpenMP : Start parallel loop section
c$OMP PARALLEL DO DEFAULT(SHARED), PRIVATE(i,
c$OMP* ),
c$OMP* SCHEDULE(RUNTIME)

      do i=1,n
        call pr(i)
      enddo

      close(1)

      end

      subroutine pr(i)
      common/nn/k
      write(1,*)''
      write(1,*)'1,i=,k=',i,k
      if(mod(k,2)==0)k=k+1
      write(1,*)'2,i=,k=',i,k

      return
      end
```

### Re: How to parallelize this fortran code by openmp

Posted: Sun Jan 20, 2008 4:36 pm
The code is not doing anything interesting. You will note that k is never being incremented. If it were, you would have a race condition where multiple threads could write to k at the same time. You could do something like:
```fortran
      Program test
      open(1,file='test.txt')
      n=10
      k=1
!.....OpenMP : Start parallel loop section
!$OMP PARALLEL DO DEFAULT(SHARED), PRIVATE(i),
!$OMP* SCHEDULE(runtime) reduction(+:k)
      do i=1,n
        call pr(i, k)
      enddo
      close(1)
      end

      subroutine pr(i, k)
      write(1,*)''
      write(1,*)'1,i=,k=',i,k
      if(mod(i,2)==0)k=k+1
      write(1,*)'2,i=,k=',i,k
      return
      end
```

### Re: How to parallelize this fortran code by openmp

Posted: Sun Jan 20, 2008 4:57 pm
Great. It works. Thank you very much, ejd!
I now have the following situation. Could you tell me which method is correct?

Method 1

```fortran
      Program test
      real a(100), b(100)
      n=100
!.....OpenMP : Start parallel loop section
!$OMP PARALLEL DO DEFAULT(SHARED), PRIVATE(i,a,b),
!$OMP* SCHEDULE(runtime)
      do i=1,n
        a(i)=i+1
        b(i)=i
      enddo

      end
```

-----------------------------------------------------------------------------------
Method 2

```fortran
      Program test
      real a(100), b(100)
      common/eva/a,b
      n=100
!.....OpenMP : Start parallel loop section
!$OMP PARALLEL DO DEFAULT(SHARED), PRIVATE(i),
!$OMP* SCHEDULE(runtime)
      do i=1,n
        call evaluate(i)
      enddo

      end

      subroutine evaluate(i)
      real a(100), b(100)
      common/eva/a,b
      a(i)=i+1
      b(i)=i
      return
      end
```

### Re: How to parallelize this fortran code by openmp

Posted: Mon Jan 21, 2008 7:49 am
Since these are only partial programs, it is hard to say what is "correct". What I can say is that they do very different things.

In Method 1, you have privatized the arrays "a" and "b". That means that each thread gets its own copy and that only the portion of the array that thread is working on will be changed. The private copies will not persist past the end of the parallel region.

In Method 2, the arrays "a" and "b" are shared, so there is only one copy of them shared by all of the threads and the values will persist after the parallel region.

These two Methods would do the same thing if you didn't put "a" and "b" in the private clause in Method 1. Hopefully that answers your question.
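As a sketch of the shared-array variant described above (Method 1 with "a" and "b" left out of the private clause), the following free-form program fills both arrays in parallel and then checks them serially. The program name and the final check are additions for illustration only.

```fortran
program shared_arrays_sketch
   implicit none
   integer, parameter :: n = 100
   integer :: i
   real :: a(n), b(n)

   ! a and b are shared: all threads write into the same two arrays,
   ! but each iteration touches only its own element i, so no two
   ! threads ever write the same location.
   !$omp parallel do default(shared) private(i)
   do i = 1, n
      a(i) = i + 1
      b(i) = i
   end do
   !$omp end parallel do

   ! The values are still there after the parallel region.
   do i = 1, n
      if (a(i) /= real(i + 1) .or. b(i) /= real(i)) error stop 'unexpected value'
   end do
   print *, 'a(1), a(n) =', a(1), a(n)
end program shared_arrays_sketch
```

Putting `a` and `b` back into the private clause would make the serial check unreliable, since the writes would then go to per-thread copies that are discarded at the end of the region.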

### Re: How to parallelize this fortran code by openmp

Posted: Tue Jan 22, 2008 2:58 am
This is very helpful. Thanks.

### Re: How to parallelize this fortran code by openmp

Posted: Thu Jan 24, 2008 4:00 am
I was told that any scalar in a parallelized loop should be in the private declaration. This has been puzzling me since I ran some experiments on Windows.

I tested the following code on a Windows system. The results are:

```
4 8 12
2 10 6
```

This is what I expected. I found that whether or not c is in the private declaration, I get the same results. Also, earlier in this thread ejd kindly reminded me that the way icnt and jcnt are handled in the following code may make multiple threads compete to write to the same variables. However, when I try to use a reduction clause, I cannot reproduce the results above.

My questions are:

1) Is it necessary to privatize scalars in a parallelized loop? Why?
2) Do the issues above depend on the operating system? I guess that on Linux I need to follow ejd's suggestion.
3) icnt and jcnt serve as counters in the code. It seems that the REDUCTION clause is not applicable to them (at least on Windows). Is there another clause that can handle this situation?

Regards,
Shibo

```fortran
      program test
      use omp_lib

      real a(100), b(100)

      n=6
      icnt=0
      jcnt=0

!.....OpenMP : Start parallel loop section
!$OMP PARALLEL DO DEFAULT(SHARED), PRIVATE(i,c),SCHEDULE(runtime)
      do i=1,n
        c=i*2
        if(mod(i,2)==0)then
          icnt=icnt+1
          a(icnt)=c
        endif

        if(mod(i,2)==1)then
          jcnt=jcnt+1
          b(jcnt)=c
        endif

      enddo

      print*,a(1:icnt)
      print*,b(1:jcnt)

      end program test
```
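One possible way to avoid the shared counters in a loop like this is sketched below. It assumes that the slot each value goes into can be computed from `i` alone, which holds for this even/odd example but not necessarily for the real code. The counts come from a `reduction(+:...)` clause, and the array positions come from `i` rather than from the running counter, so no two threads write the same element. The program name is made up for illustration.

```fortran
program counters_sketch
   implicit none
   integer, parameter :: n = 6
   integer :: i, icnt, jcnt
   real :: a(n), b(n), c

   icnt = 0
   jcnt = 0

   ! Each thread accumulates its own icnt/jcnt; OpenMP adds them up at
   ! the end of the loop. Inside the loop the counters are per-thread
   ! partial sums, so they must not be used as array indices.
   !$omp parallel do default(shared) private(i, c) reduction(+:icnt, jcnt)
   do i = 1, n
      c = i * 2
      if (mod(i, 2) == 0) then
         icnt = icnt + 1
         a(i / 2) = c          ! position derived from i, not from icnt
      else
         jcnt = jcnt + 1
         b((i + 1) / 2) = c    ! odd i = 1, 3, 5 map to slots 1, 2, 3
      end if
   end do
   !$omp end parallel do

   print *, a(1:icnt)
   print *, b(1:jcnt)
end program counters_sketch
```

With this layout `a` holds 4, 8, 12 and `b` holds 2, 6, 10 in index order, whatever the number of threads or the schedule. Using the counter itself as an index, as in the code above, ties the result to the order in which threads happen to run.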