3.0 Task Implementation: Intel vs Sun

Post by branden901 » Sat Sep 13, 2008 4:34 pm

Hello, I was playing with the new tasking feature using both Intel's compiler and Sun's compiler.

I tried the following test case:

#include <stdio.h>
#include <omp.h>
int main(void)
{
    #pragma omp parallel
    #pragma omp single
    {
        int i = 1;
        int n = omp_get_thread_num();   /* thread that creates the tasks */
        while (i) {                     /* runs until interrupted */
            #pragma omp task firstprivate(i, n)
            printf("%d: %d %d %d\n", i, n, omp_get_thread_num(), omp_get_num_threads());
            i++;
        }
    }
    return 0;
}

I found the behavior is different with Intel's compiler and Sun's compiler.

With Intel's compiler (11.0 Beta 044), the task seems to be always executed by the thread that creates the task, and I got
100: 0 0 4
101: 0 0 4
102: 0 0 4
103: 0 0 4
104: 0 0 4
105: 0 0 4

With Sun Studio Express July 2008, the tasks are more evenly distributed among the threads, and I got
100: 0 1 4
101: 2 1 4
102: 3 1 4
103: 0 1 4
104: 2 1 4
105: 0 1 4

Both were run on the same single-core machine and the OS is Ubuntu 8.04.

Sun's output seems more natural and is what I would expect. I think Intel's result is still legal, right? Is Intel doing some kind of optimization? Does it always execute tasks immediately? Sorry I do not know how to verify this.

Re: 3.0 Task Implementation: Intel vs Sun

Post by ejd » Mon Sep 15, 2008 7:49 am

The OpenMP Version 3.0 spec was only approved this year (May 2008). There were fairly major changes going into it up to the last minute. Because of this, the compiler vendors didn't really start implementing the new spec until it was approved. What you are looking at are two of the current Betas available, and they may not be what the final products look like. Both companies are still working on their implementations, looking at comments from customers and doing further "tweaking".

I work for Sun on the compilers. We have been trying to put out snapshots of the code we are working on in these Sun Studio Express "releases" so that we can get user feedback. What you see in the Sun Studio Express (dated July 2008) is where we were in the development process of OpenMP Version 3.0 when this Express was put out. It is not complete (for example, we are not handling Fortran allocatables correctly yet) and still has bugs. The tasking support is there, but it wasn't yet optimized for performance and I believe it still has some default scoping problems.

In any case, while the output doesn't seem to be correct (the second and third columns seem to be reversed and don't match your comments), I can answer your question. In your example program, any thread can execute any task. That can mean that one thread executes all the tasks, or that multiple threads execute the tasks. There can be reasons to have the same thread execute the tasks, such as reusing variables already in the cache. There can also be reasons to have multiple threads do the work. Each case needs to be looked at to determine the best thing to do. I have not run your example, but I do know that a lot of effort is being expended on getting the best performance possible.
