Noob question about scope of "!$omp..." activity?

General OpenMP discussion

Noob question about scope of "!$omp..." activity?

Postby jb_astro » Sat Jun 23, 2018 7:23 pm

Hi

I'm trying to help my Astrophysics prof improve the performance of a FORTRAN program he runs. We were looking at the possibility that some parallelization might help. The program consists of a main and several modules that get built together with a makefile. To identify which parts of the program were taking the longest, I added a large number of lines of code with counter variables (to verify the number of loop iterations), but I was mostly interested in a fair number of "call cpu_time()" lines I used to pinpoint where most of the run time was happening, and I output all the elapsed times using write() statements. Using this, I narrowed it down to one particular module (subroutine), and within that, there were about 5 "culprit" code lines that took 95% of the total run time.

So here's the actual question. When I tried parallelizing just one single line (inside this one subroutine), I literally added:

!$omp parallel
!$omp workshare

immediately before this one line, and:

!$omp end workshare
!$omp end parallel

right after the line. The code ran, BUT every single output that had been calculated using the "call cpu_time()" function calls was now reporting a huge elapsed time. The CPU is a Xeon E5-1620, which is quad core with hyperthreading, so effectively 8 threads. When I compare the new elapsed times, they come out to just about 8x the actual elapsed time. So obviously it is accumulating all the time spent in each of the cores and adding it all together. But many of these "call cpu_time()" calls are in the main, which I didn't parallelize at all, and to which I did not even add the line

use omp_lib

and NONE of the "call cpu_time()" calls were inside the "!$omp parallel ... !$omp end parallel" section in the one particular subroutine, so I'm confused. My understanding was that surrounding code with "!$omp parallel ... !$omp end parallel" was what defined the starting and stopping of the threads used in parallel. How can it affect ALL of my calls to cpu_time() in the main and in different subs that don't even have parallelization invoked? Does compiling (with the ifort compiler, btw) with the -qopenmp switch make ALL the code run in parallel? Or is this just something peculiar to the cpu_time() function itself (I have read several people recommending other ways to time code than this function)? (I'll need to run this with and without OMP a lot to verify whether it does in fact speed up, so I don't think I can use omp_get_wtime.)
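For context, here is a self-contained sketch of the structure I described; the array names and the assignment itself are made-up stand-ins, not the real code (and note that without -qopenmp the !$omp lines are just comments):

```fortran
! Sketch of wrapping one array-assignment line in PARALLEL/WORKSHARE.
! Arrays and the expression are illustrative stand-ins.
program workshare_demo
  implicit none
  integer, parameter :: n = 1000000
  real :: a(n), b(n), c(n)

  call random_number(b)
  call random_number(c)

  !$omp parallel
  !$omp workshare
  a = b * c + sin(b)      ! the one "culprit" line, wrapped as described
  !$omp end workshare
  !$omp end parallel

  print *, 'a(1) =', a(1)
end program workshare_demo
```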

Thanks

-John
jb_astro
 
Posts: 2
Joined: Mon Jun 18, 2018 6:59 pm

Re: Noob question about scope of "!$omp..." activity?

Postby MarkB » Sun Jun 24, 2018 10:37 pm

I think what is going on here is that the OpenMP runtime is creating its threads at program start-up and keeping them running (and consuming CPU cycles) in between parallel regions. This is done to minimise the overhead of starting each parallel region. So throughout the whole program run, including all the sequential bits, all 8 threads are consuming CPU cycles, and that's what cpu_time() is reporting. You should switch to a wall-clock timer - system_clock() is the Fortran standard one if you don't want to use omp_get_wtime(), but make sure you use INTEGER*8 arguments to avoid wraparound problems. You might find this discussion useful: https://stackoverflow.com/questions/687 ... stem-clock
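A minimal sketch of wall-clock timing with system_clock(), using INTEGER(8) (i.e. INTEGER*8) counts; the timed loop is just a placeholder workload:

```fortran
! Wall-clock timing with the standard system_clock() intrinsic.
! INTEGER(8) counts avoid the wraparound problems of default integers.
program wall_time_demo
  implicit none
  integer(8) :: count_start, count_end, count_rate
  real(8)    :: elapsed, s
  integer    :: i

  call system_clock(count_start, count_rate)

  s = 0.0d0                         ! placeholder work to time
  do i = 1, 10000000
     s = s + 1.0d0 / real(i, 8)
  end do

  call system_clock(count_end)
  elapsed = real(count_end - count_start, 8) / real(count_rate, 8)
  print *, 'sum =', s, ' elapsed (s) =', elapsed
end program wall_time_demo
```

Unlike cpu_time(), this reports real elapsed time regardless of how many threads are running, so the same timing code gives comparable numbers with and without -qopenmp.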

Parallelising your code a single line at a time may not work out very well. The overhead of an OpenMP parallel region is of the order of a few, to a few tens, of microseconds, so unless each instance of the line of code takes significantly longer than this you won't see much (if any) speedup.
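If the expensive line sits inside a loop, it usually pays to parallelise the whole loop once with PARALLEL DO rather than paying the region overhead per line; a sketch with illustrative names:

```fortran
! Sketch: parallelise the enclosing hot loop once with PARALLEL DO.
! The array, bounds, and loop body are illustrative, not from the
! original program.
program parallel_do_demo
  implicit none
  integer, parameter :: n = 1000000
  real(8) :: a(n)
  integer :: i

  !$omp parallel do
  do i = 1, n
     a(i) = sqrt(real(i, 8)) + 1.0d0 / real(i, 8)
  end do
  !$omp end parallel do

  print *, 'a(n) =', a(n)
end program parallel_do_demo
```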
MarkB
 
Posts: 768
Joined: Thu Jan 08, 2009 10:12 am
Location: EPCC, University of Edinburgh

Re: Noob question about scope of "!$omp..." activity?

Postby jb_astro » Mon Jun 25, 2018 5:36 pm

Thanks very much, Mark

I was getting concerned that this was the case, and your reply certainly supports the thought: it seems it may not be value-added to try to employ parallel code in this program (or at least in this first part that I was tackling). But it's good to know about the option to use system_clock() in lieu of cpu_time().

Thanks again! (for the reply, and the link)

-John
jb_astro
 
Posts: 2
Joined: Mon Jun 18, 2018 6:59 pm

