Calling a Subroutine -> Program does not work


Postby FDSUser » Thu Nov 20, 2008 6:19 am

All,

The attached code runs if I compile it without OpenMP. When it is compiled with OpenMP, an error occurs when a subroutine is called. If anybody knows why, that would be great. I use the Intel Compiler 11 on Ubuntu 8.04.
Thanks in advance
Chris

Here is the code for the two routines:
Code:
IF (MIXTURE_FRACTION) THEN
!$OMP PARALLEL DO COLLAPSE(3) PRIVATE(K,J,I,ITMP,Z_VECTOR,CP_SUM,CP_MF,N)
   DO K=1,KBAR
      DO J=1,JBAR
         DO I=1,IBAR
            !IF (SOLID(CELL_INDEX(I,J,K))) CYCLE
            OpenMP_DIVG_005: IF (SOLID(CELL_INDEX(I,J,K))) THEN
               CALL DO_NOTHING('DIVG_PART1_MIXTURE_FRACTION_RTRM')
            ELSE !OpenMP
               ITMP = 0.1_EB*TMP(I,J,K)
               Z_VECTOR = YYP(I,J,K,I_Z_MIN:I_Z_MAX)
               ! GET_CP is called here; with OpenMP the program stops inside it, without OpenMP it runs
               CALL GET_CP(Z_VECTOR,Y_SUM(I,J,K),CP_MF,ITMP)
               IF (N_SPECIES > (I_Z_MAX-I_Z_MIN+1)) THEN
                  CP_SUM = 0._EB
                  DO N=1,N_SPECIES
                     IF (SPECIES(N)%MODE/=MIXTURE_FRACTION_SPECIES) &
                     CP_SUM = CP_SUM + YYP(I,J,K,N)*SPECIES(N)%CP(ITMP)
                  END DO
                  CP_MF = CP_SUM + (1._EB-Y_SUM(I,J,K))*CP_MF
               ENDIF
               RTRM(I,J,K) = R_PBAR(K,PRESSURE_ZONE(I,J,K))*RSUM(I,J,K)/CP_MF
               DP(I,J,K) = RTRM(I,J,K)*DP(I,J,K)
            ENDIF OpenMP_DIVG_005
         ENDDO
      ENDDO
   ENDDO
!$OMP END PARALLEL DO


This is the code of the called subroutine GET_CP. The line where the program stops when using OpenMP is marked; without OpenMP there are no problems:

Code:
SUBROUTINE GET_CP(Z_IN,YY_SUM,CP_MF,ITMP)

INTEGER, INTENT(IN) :: ITMP
REAL(EB), INTENT(IN) :: Z_IN(1:I_Z_MAX - I_Z_MIN + 1),YY_SUM
REAL(EB) ::Z(1:I_Z_MAX - I_Z_MIN + 1),CP_MF,OMYYSUM

IF (YY_SUM >=1._EB) THEN
   CP_MF = SPECIES(0)%CP(ITMP)
   RETURN
ELSE
   OMYYSUM = 1._EB-MAX(0._EB,YY_SUM)
   Z = MAX(0._EB,MIN(1._EB,Z_IN))/OMYYSUM  !----> THIS is the line where the code stops
ENDIF

CP_MF = (Z2CP_C(ITMP) + DOT_PRODUCT(Z2CP(ITMP,:),Z))/(Y_MF_SUM_C + DOT_PRODUCT(Y_MF_SUM_Z,Z))

END SUBROUTINE GET_CP
FDSUser
 
Posts: 17
Joined: Sat Nov 15, 2008 8:54 pm

Re: Calling a Subroutine -> Program does not work

Postby ejd » Thu Nov 20, 2008 6:46 am

You don't say how it stops. You might want to print out the variable values just before the line and see what they are. For example, I see that the variable OMYYSUM is calculated just before being used on the line where the program stops. I have no idea from the code you have shown whether the value of YY_SUM could be 1.0, in which case OMYYSUM would be 0.0 and the program would stop because you are trying to divide by zero. Another possibility is that Z_IN could be zero. Another thing to try would be the parallel do without the collapse clause, to see whether the program then works correctly.
ejd
 
Posts: 1025
Joined: Wed Jan 16, 2008 7:21 am

Re: Calling a Subroutine -> Program does not work

Postby FDSUser » Thu Nov 20, 2008 8:56 am

Thanks for your reply.

I tested the code without collapse(3) and got the same problem. The error I get is:

Code:
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC        Routine            Line        Source             
.                  4001C410  Unknown               Unknown  Unknown
libiomp5.so        400BBE12  Unknown               Unknown  Unknown
libpthread.so.0    400D64FB  Unknown               Unknown  Unknown
libc.so.6          401C3E5E  Unknown               Unknown  Unknown


The code stops with the above message when the line
Code:
Z = MAX(0._EB,MIN(1._EB,Z_IN))/OMYYSUM

is reached, as described in the first post. I checked the values of the variables: Z_IN has a size of 2, its values are 0.00000, and OMYYSUM has the value 1, so there is no division by 0.
Could this be a compiler problem? I compile with

ifort -c -O0 -vec_report0 -openmp -FR -auto -WB -traceback -g -fpe0 -fltconsistency

I will also ask on the Intel compiler forum; maybe they know this problem. As written in the first post, if I remove the OpenMP directives, no problem with the code occurs.

Re: Calling a Subroutine -> Program does not work

Postby ejd » Thu Nov 20, 2008 9:12 am

It is always possible that there is a compiler problem. However, it is more likely that it is a user error. What happens when you use one thread? How big is your stack? Try changing your stacksize to "unlimited".

Re: Calling a Subroutine -> Program does not work

Postby FDSUser » Thu Nov 20, 2008 9:34 am

I have also tried single, master and critical directives in the do-loop (with and without collapse), and the same error occurs. My stack size is set to unlimited (2048 MB), but this is not the problem: with a very small test case the program uses "only" 100 MB.

Re: Calling a Subroutine -> Program does not work

Postby ejd » Thu Nov 20, 2008 10:32 am

A segmentation fault is quite often caused by an array going out of bounds. I don't see how the variable z_vector is declared in the main routine. Make sure that the size of z_vector agrees with the size you have for z_in in the subroutine. Other than that, I have tried modifying your partial program so that it runs, compiled it with the Sun Studio compiler, and I don't see a problem. Have you tried the development version of the gfortran compiler that supports OpenMP? It might give you a better idea of whether or not it is a compiler problem.
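
One way to check the size directly is to look at the private copy inside the parallel region. This is only a minimal sketch, assuming a compiler that follows the OpenMP 3.0 rules for allocatable arrays in a private clause (as I read them, the private copy starts out allocated with the same bounds if the original is allocated on entry). The module name work_arrays and the loop bounds are made up for illustration and are not taken from your code:

```fortran
MODULE work_arrays
   IMPLICIT NONE
   ! Hypothetical stand-in for the module that declares the work array
   REAL, ALLOCATABLE :: Z_VECTOR(:)
END MODULE work_arrays

PROGRAM check_private_size
   USE work_arrays
   IMPLICIT NONE
   INTEGER :: I

   ALLOCATE(Z_VECTOR(2))

   ! Report the allocation status and size of each thread's private copy
   !$OMP PARALLEL DO PRIVATE(I,Z_VECTOR)
   DO I = 1, 4
      IF (.NOT. ALLOCATED(Z_VECTOR)) THEN
         PRINT *, 'iteration', I, ': private Z_VECTOR is not allocated'
      ELSE
         PRINT *, 'iteration', I, ': private Z_VECTOR has size', SIZE(Z_VECTOR)
      ENDIF
   ENDDO
   !$OMP END PARALLEL DO
END PROGRAM check_private_size
```

If this reports an unallocated array or an unexpected size with one compiler but not with another, that would point at how the private allocatable is handled rather than at GET_CP itself.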

Re: Calling a Subroutine -> Program does not work

Postby FDSUser » Fri Nov 21, 2008 5:56 am

It seems to be a bug in the Intel compiler. As suggested, I compiled with the Sun Studio Express compiler, and the program runs without any problem. Thanks for your help!

Re: Calling a Subroutine -> Program does not work

Postby FDSUser » Sun Nov 23, 2008 4:42 am

All, the problem is identified:

The variable Z_VECTOR is allocated in another module. It is not allocated in the module where the do-loops are located. As a simplified example:
Code:
MODULE module_1
.... some code
ALLOCATE(Z_VECTOR(1:I_Z_MAX-I_Z_MIN+1))
.... some code
END MODULE module_1

Then Z_VECTOR is used in another module:
Code:
MODULE module_2
USE module_1
!$OMP PARALLEL DO PRIVATE(.....,Z_VECTOR,....)
DO
   ... some code
   Z_VECTOR = YYP(I,J,K,...)
   ... some code
ENDDO
!$OMP END PARALLEL DO
END MODULE module_2

YYP is shared, so accessing it is not a problem. If Z_VECTOR is allocated in module_2, then the code works with the Intel compiler.
Now the final question: is this problem described in the OpenMP 3.0 specification? I found only example A.30.2.f, but that is different, because in my case Z_VECTOR is allocated only once. It would be great if anybody could clarify this for me, because then I would have to make more changes to the code.
Best regards
Chris

Re: Calling a Subroutine -> Program does not work

Postby ejd » Sun Nov 23, 2008 12:53 pm

What matters is where z_vector is declared, not where it is allocated. For example, the following should work (using OpenMP V3.0):
Code:
MODULE module_1
  integer, parameter :: I_Z_MIN = 2, I_Z_MAX = 5
  integer, allocatable :: z_vector(:)
  contains
    subroutine alloc()
        ALLOCATE(Z_VECTOR(1:I_Z_MAX-I_Z_MIN+1))
    end subroutine alloc
END MODULE module_1

MODULE module_2
  contains
    subroutine sub()
      USE module_1

      !$OMP PARALLEL DO PRIVATE(Z_VECTOR) num_threads(2)
      DO I = 1, 2
        print *, "loc(z_vector):", loc(z_vector)
      ENDDO
      !$OMP END PARALLEL DO
    end subroutine sub
END MODULE module_2

program test
  use module_1
  use module_2

  call alloc()
  call sub()
end program

If you declare z_vector in module_1, then it is global, and module_2 sees that global when it uses module_1. However, the private clause should give each thread a private copy of the storage.

So where is the declaration of z_vector in your code?

Re: Calling a Subroutine -> Program does not work

Postby FDSUser » Mon Nov 24, 2008 7:43 am

Sorry for my incomplete post.
My code has the following structure; each MODULE is in a separate file:
File 1:
Code:
MODULE module_1
REAL, ALLOCATABLE, DIMENSION(:) :: Z_VECTOR
END MODULE module_1

File 2:
Code:
MODULE module_2
USE module_1

CONTAINS
SUBROUTINE ALLOC_Z_VECTOR
! Calculate I_Z_MAX and I_Z_MIN
ALLOCATE(Z_VECTOR(1:I_Z_MAX-I_Z_MIN+1))
END SUBROUTINE ALLOC_Z_VECTOR
END MODULE module_2

File 3, where Z_VECTOR is used in the calculation:
Code:
MODULE module_3
USE module_1

CONTAINS
SUBROUTINE CALCULATE_WITH_Z_VECTOR
!$OMP PARALLEL DO PRIVATE(.....,Z_VECTOR,....)
DO
   ... some code
   Z_VECTOR = YYP(I,J,K,...)
   ... some code
ENDDO
!$OMP END PARALLEL DO
END SUBROUTINE CALCULATE_WITH_Z_VECTOR
END MODULE module_3

The program itself has a structure like:
Code:
PROGRAM TEST
USE module_1
USE module_2
USE module_3

CALL ALLOC_Z_VECTOR
DO until time reached
   CALL CALCULATE_WITH_Z_VECTOR
ENDDO
END PROGRAM TEST

So this is the complete structure of the code. On the Intel Fortran compiler forum I got an answer like this:
>>The Z_VECTOR variable is correctly allocated in another module, -> quoted from my post
Then Z_VECTOR cannot be private (unless it is also automatic) -> the answer

The complete post can be seen at http://software.intel.com/en-us/forums/ ... pic/62028/ if this helps. If I create Z_VECTOR_NEW in the subroutine CALCULATE_WITH_Z_VECTOR in module_3 and use Z_VECTOR_NEW, then it works, but not with the current configuration, where Z_VECTOR is declared and allocated in other modules.
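
For anyone following along, here is a minimal sketch of that workaround. It assumes the work array can be a local automatic array sized from the input; the names YYP_ROWS and ROW_SUM and the reduced dimensions are made up for illustration and are not taken from the real code:

```fortran
MODULE module_3
   IMPLICIT NONE
CONTAINS
   SUBROUTINE CALCULATE_WITH_Z_VECTOR(YYP_ROWS, ROW_SUM)
      REAL, INTENT(IN)  :: YYP_ROWS(:,:)   ! hypothetical stand-in for YYP(I,J,K,:)
      REAL, INTENT(OUT) :: ROW_SUM(:)      ! hypothetical per-cell result
      ! Local automatic work array instead of the module-level allocatable
      REAL :: Z_VECTOR_NEW(SIZE(YYP_ROWS,2))
      INTEGER :: I

      ! Each thread works on its own private copy of the local array
      !$OMP PARALLEL DO PRIVATE(I,Z_VECTOR_NEW)
      DO I = 1, SIZE(YYP_ROWS,1)
         Z_VECTOR_NEW = YYP_ROWS(I,:)
         ROW_SUM(I) = SUM(Z_VECTOR_NEW)
      ENDDO
      !$OMP END PARALLEL DO
   END SUBROUTINE CALCULATE_WITH_Z_VECTOR
END MODULE module_3

PROGRAM TEST
   USE module_3
   IMPLICIT NONE
   REAL :: YYP_ROWS(4,2), ROW_SUM(4)

   YYP_ROWS = 0.5
   CALL CALCULATE_WITH_Z_VECTOR(YYP_ROWS, ROW_SUM)
   PRINT *, ROW_SUM
END PROGRAM TEST
```

The sum is only a placeholder for the real calculation; the point is that the private work array is declared in the same subroutine as the parallel loop.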
