openMP in nested loops

Use this forum to discuss the book: Using OpenMP - Portable Shared Memory Parallel Programming, by Barbara Chapman, Gabriele Jost and Ruud van der Pas Read the for book info and to download the examples. Post your feedback about the book and examples to this forum
Forum rules
The OpenMP Forums are now closed to new posts. Please visit Stack Overflow if you are in need of help:

openMP in nested loops

Post by acastanedam »

Dear all,

I am trying to optimize a fortran program and to this end I'm trying to use openMP directives of the intel fortran compiler.
I'm a newbie in openMP, so I've taken the most basic approach:

Code: Select all

program first_task
   implicit none
   integer :: ip,i,j
   integer, parameter :: npd=40,nv=3,nc=4,ngkmax=64
   complex(8),parameter :: one=(1.,0.),zero=(0.,0.) 
   complex(8),allocatable :: mat(:,:),sums(:),vaux(:),wfpw(:)
   complex(8) :: aux,wfvr,aux2


   call omp_set_nested(.true.)

   forall(i=1:ngkmax) wfpw(i)=cmplx(i)

!$OMP PARALLEL PRIVATE(aux,ip,i,j,wfvr) REDUCTION(+:aux2)
   do ip=1,npd
       call matrix(aux,ngkmax,mat)
       do i=1,nv
           do j=1,nc
               call zgemv('n',ngkmax,ngkmax,one,mat,ngkmax,wfpw,1,zero,vaux,1)
           end do
       end do
   end do

   print*, sum(sums)
end program first_task
where (of course, I don't use just diagonal matrices):

Code: Select all

subroutine matrix(fac,ndim,mat)
  implicit none
  integer,intent(in) ::ndim
  complex(8),intent(in) :: fac
  complex(8),intent(out) :: mat(:,:)
  integer :: i

  forall(i=1:ndim) mat(i,i)=fac
end subroutine matrix
However at the end, after compiling and run, I obtain very different results from those given the code without OMP directives; moreover they change from run to run.
Using intel analysis tools, I could see the problem has to do with the fact some variables (mat and vaux, I think) are read and written by several threads, but honestly I do not how to fixed it.

Thanks in advance for any help,

Arcesio Castañeda Medina
Max-Planck Institute for Microstructure Physics
Weinberg 2
06120 Halle

Posts: 808
Joined: Thu Jan 08, 2009 10:12 am
Location: EPCC, University of Edinburgh

Re: openMP in nested loops

Post by MarkB »

I think you just need to add mat and vaux to the private clause on the parallel region since they are assigned before they are used in each iteration: but note that you need a compiler which supports OpenMP 3.0 or later to allow privatisation of allocatable arrays. (Also aux2 can be private rather than reduction unless you need the value after the end of the parallel loop).