How to parallelize this fortran code by openmp

Dear all,

I have a Fortran code that calculates particle motion. It contains many big loops. Someone suggested that I use OpenMP to speed it up. The following is the pseudocode.

do i=1,ni
   a(i)=c1
   c2=0
   ks=lfirst(i)
   ke=last(i)
   do k=ks,ke
      c2=c2+f(k)
      c3=f1+f2
      ................
      c4=f+f3
   enddo
   f(i)=c2+c3
enddo

It can be seen that the iterations of loop i are independent, but those of loop k are dependent. Most of the calculation happens in loop k, which is short (up to 8 iterations). Loop i is always big, up to 200,000 iterations. I only want to use OpenMP for loop i. Your hints would be greatly appreciated. Cheers, Gonski
gonski

Re: How to parallelize this fortran code by openmp

I will start with a simple question about "f". Is it a scalar or an array? You use it in two lines:

c2 = c2 + f(k)
c4 = f + f3
and I am not sure if it is a mistake or if c4 is an array.
ejd

Posts: 1025
Joined: Wed Jan 16, 2008 7:21 am

Re: How to parallelize this fortran code by openmp

!$omp parallel do
do i = 1,ni
   ...
enddo

Is that all you want?
lfm

Posts: 135
Joined: Sun Oct 21, 2007 4:58 pm
Location: OpenMP ARB

Re: How to parallelize this fortran code by openmp

I think he wants more than that. He wants to deal with all the dependencies. For example, variables ke and ks need to be private so he is going to have to add a private clause.

!$omp parallel do private(ke,ks)
do i = 1, ni
   ...
enddo

Then there are the others that need to be handled - c2, c3, and c4. That is why I started with a simple question about "f", so I could understand more about "f" and "c4".
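As a minimal sketch of where this is heading, here is one way the private clause might be extended to the inner-loop scalars. It assumes that "f" is an array and it leaves out c3 and c4, since how they should be treated depends on the answer to the question above. The array name fnew, the array size, and the way lfirst and last are filled are hypothetical and only there to make the example self-contained.

```fortran
program private_sketch
  implicit none
  integer, parameter :: ni = 1000      ! hypothetical size, much smaller than 200,000
  integer :: i, k, ks, ke
  integer :: lfirst(ni), last(ni)
  real :: f(ni), fnew(ni), c2

  ! Hypothetical setup: each i sums a window of at most 8 entries of f.
  do i = 1, ni
    f(i) = 1.0
    lfirst(i) = i
    last(i) = min(i + 7, ni)
  end do

  ! The loop index i is private automatically; ks, ke, k and c2 are
  ! listed explicitly so that every thread works on its own copies.
  !$omp parallel do default(shared) private(ks, ke, k, c2)
  do i = 1, ni
    c2 = 0.0
    ks = lfirst(i)
    ke = last(i)
    do k = ks, ke
      c2 = c2 + f(k)
    end do
    ! Assumption: results go to a separate array, so no thread reads
    ! an f(k) that another thread is updating at the same time.
    fnew(i) = c2
  end do
  !$omp end parallel do

  print *, fnew(1), fnew(ni)
end program private_sketch
```

Writing into fnew instead of f is a choice made for the sketch: the original pseudocode stores into f(i) while other iterations may read f(k), and whether that is safe depends on how lfirst and last are defined.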
ejd

Re: How to parallelize this fortran code by openmp

Thank you very much.
Over the past two days I read some documents on OpenMP. Based on what I learned, I have restructured my code in the style of the following example. All the calculations in a big loop are now put into a subroutine like "pr" below. I tested it on Linux and it works. Interestingly, the following code gives the same results in sequential and parallel runs. Do you think it is OK or not? Another thing is that my code contains situations like the variable k below. I want k to accumulate values across different threads whenever a preset condition is met. Therefore k is treated as DEFAULT(SHARED) here. Is this correct?
Cheers,
Gonski

      Program test

      common/nn/k
      open(1,file='test.txt')
      n=10
      k=1

c.....OpenMP : Start parallel loop section
c$OMP PARALLEL DO DEFAULT(SHARED), PRIVATE(i,
c$OMP* ),
c$OMP* SCHEDULE(RUNTIME)

      do i=1,n
         call pr(i)
      enddo

      close(1)

      end

      subroutine pr(i)
      common/nn/k
      write(1,*)''
      write(1,*)'1,i=,k=',i,k
      if(mod(k,2)==0)k=k+1
      write(1,*)'2,i=,k=',i,k

      return
      end
gonski

Re: How to parallelize this fortran code by openmp

The code is not doing anything interesting. You will note that k is never being incremented. If it were, you would have a race condition where multiple threads could write to k at the same time. You could do something like:
      Program test
      open(1,file='test.txt')
      n=10
      k=1

!.....OpenMP : Start parallel loop section
!$OMP PARALLEL DO DEFAULT(SHARED), PRIVATE(i),
!$OMP* SCHEDULE(runtime) reduction(+:k)
      do i=1,n
         call pr(i, k)
      enddo

      close(1)

      end

      subroutine pr(i, k)
      write(1,*)''
      write(1,*)'1,i=,k=',i,k
      if(mod(i,2)==0)k=k+1
      write(1,*)'2,i=,k=',i,k

      return
      end
ejd

Re: How to parallelize this fortran code by openmp

Great. It works. Thank you very much, ejd!
Now I have the following situation. Could you tell me which method is correct?

Method 1

      Program test
      real a(100), b(100)
      n=100
!.....OpenMP : Start parallel loop section
!$OMP PARALLEL DO DEFAULT(SHARED), PRIVATE(i,a,b),
!$OMP* SCHEDULE(runtime)
      do i=1,n
         a(i)=i+1
         b(i)=i
      enddo

      end

-----------------------------------------------------------------------------------
Method 2

      Program test
      real a(100), b(100)
      common/eva/a,b
      n=100
!.....OpenMP : Start parallel loop section
!$OMP PARALLEL DO DEFAULT(SHARED), PRIVATE(i),
!$OMP* SCHEDULE(runtime)
      do i=1,n
         call evaluate(i)
      enddo

      end

      subroutine evaluate(i)
      real a(100), b(100)
      common/eva/a,b
      a(i)=i+1
      b(i)=i
      return
      end
gonski

Re: How to parallelize this fortran code by openmp

Since these are only partial programs, it is hard to say what is "correct". What I can say is that they do very different things.

In Method 1, you have privatized the arrays "a" and "b". That means that each thread gets its own copy and that only the portion of the array that thread is working on will be changed. The private copies will not persist past the end of the parallel region.

In Method 2, the arrays "a" and "b" are shared, so there is only one copy of them shared by all of the threads and the values will persist after the parallel region.

These two Methods would do the same thing if you didn't put "a" and "b" in the private clause in Method 1. Hopefully that answers your question.
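To make the difference concrete, here is a small sketch (not taken from either Method, and with a hypothetical program name) that puts only "a" in the private clause and leaves "b" shared, then prints both after the loop. The initial value of -1.0 is an arbitrary marker chosen for the example.

```fortran
program private_vs_shared
  implicit none
  integer, parameter :: n = 100
  integer :: i
  real :: a(n), b(n)

  ! Marker values so it is easy to see which array was really updated.
  a = -1.0
  b = -1.0

  ! a is private: each thread fills its own temporary copy, which is
  ! discarded at the end of the parallel region.
  ! b is shared: every thread writes into the one array declared above.
  !$omp parallel do default(shared) private(a)
  do i = 1, n
    a(i) = i + 1
    b(i) = i
  end do
  !$omp end parallel do

  print *, 'a(1), a(n) =', a(1), a(n)
  print *, 'b(1), b(n) =', b(1), b(n)
end program private_vs_shared
```

When this is compiled with OpenMP enabled on a compiler that follows OpenMP 3.0 or later, I would expect "a" to still hold the marker value after the loop while "b" holds the new values. Older versions of the specification left the value of the original "a" after the region undefined, so it should not be relied on there. Compiled without OpenMP, the directive is just a comment and both arrays are updated.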
ejd

Re: How to parallelize this fortran code by openmp

This is very helpful. Thanks.
gonski

Re: How to parallelize this fortran code by openmp

I was told that any scalar in a parallelized loop should be listed in the private clause. This is really puzzling me after some experiments I did on Windows.

I tested the following code on a Windows system. The results are
4 8 12
2 10 6

This is what I expected. I found that I get the same results whether or not c is in the private clause. Also, earlier in this thread ejd kindly reminded me that the way icnt and jcnt are handled in the following code may make multiple threads compete to write the same variables. However, when I try to use a reduction clause, I cannot get the results above.

Now my questions are:

1) Is it necessary to privatize scalars in a parallelized loop? Why?
2) Do the issues above depend on the operating system? I guess that on Linux I need to consider ejd's suggestion.
3) icnt and jcnt serve as counters in the code. It seems the REDUCTION clause is not applicable to them (at least under Windows). Is there any other clause available to handle this situation?

Regards,
Shibo

      program test
      use omp_lib

      real a(100), b(100)

      n=6
      icnt=0
      jcnt=0

!.....OpenMP : Start parallel loop section
!$OMP PARALLEL DO DEFAULT(SHARED), PRIVATE(i,c),SCHEDULE(runtime)
      do i=1,n
         c=i*2
         if(mod(i,2)==0)then
            icnt=icnt+1
            a(icnt)=c
         endif

         if(mod(i,2)==1)then
            jcnt=jcnt+1
            b(jcnt)=c
         endif

      enddo

      print*,a(1:icnt)
      print*,b(1:jcnt)

      end program test
gonski
