Postby cndcnd » Sun Dec 07, 2008 8:00 am


I compiled a program with OpenMP (GCC for Windows) and ran it on a dual-core laptop. Surprisingly, the execution speed with two threads is actually slightly lower than with just one thread. Since the program works on large data arrays that cannot fit in the caches of the individual cores, I imagine that in this situation the bus speed becomes the bottleneck, but I cannot be sure. Is there a performance analysis tool that would tell me whether the processors sit idle while waiting for the bus?



Postby ejd » Tue Dec 09, 2008 10:48 am

What exactly are you measuring when you say that the two-thread run is slightly slower? I ask because quite a few people who have posted on this forum were not looking at elapsed (wall-clock) time when they made this same comment. I also have to ask whether you have set OMP_DYNAMIC to "false" and OMP_NUM_THREADS to 2. If you have, then you are ahead of a lot of people.
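To make those two checks concrete, here is a minimal sketch in C. It is only an illustration under assumptions: the helper names wall_seconds and timed_sum are made up for this example, and the loop is a plain array sum rather than your actual computation. It disables dynamic thread adjustment, requests a fixed thread count, and times the loop with omp_get_wtime, which reports elapsed time. CPU time, as reported by clock() on many systems, adds up the time of all threads and so can go up with two threads even when the run finishes sooner.

```c
#include <assert.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Elapsed seconds. Uses omp_get_wtime() when built with -fopenmp;
   otherwise falls back to clock(), which is acceptable only because
   that build runs on a single thread. */
double wall_seconds(void)
{
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Hypothetical helper: sums n doubles with a fixed number of threads
   and stores the elapsed time of the loop in *elapsed. */
double timed_sum(const double *a, long n, int threads, double *elapsed)
{
    double sum = 0.0;
    double t0;
    long i;

    assert(threads >= 1);
#ifdef _OPENMP
    omp_set_dynamic(0);           /* same effect as OMP_DYNAMIC=false */
    omp_set_num_threads(threads); /* same effect as OMP_NUM_THREADS=threads */
#else
    (void)threads;                /* serial build: thread count is ignored */
#endif

    t0 = wall_seconds();
#pragma omp parallel for reduction(+:sum)
    for (i = 0; i < n; i++)
        sum += a[i];
    *elapsed = wall_seconds() - t0;

    return sum;
}
```

Calling timed_sum on the same large array with threads set to 1 and then to 2, and comparing the two elapsed values, gives the kind of measurement I am asking about. With GCC the file needs to be compiled with -fopenmp for the pragma to take effect.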

Depending on the size of the data and how you are accessing it, it could indeed be a data transfer problem. A good performance tool should give you an idea of whether or not that is the case. Unfortunately, I am not familiar with the tools available for Windows and gcc. Have you looked at the various vendors' products? I am sure that you could find something to buy that would do the job. I am betting, though, since you are using gcc, that you would prefer something free. You might be able to use one of the tools for a trial period and see whether it provides the information you need. I also believe that there may be some tools available for non-commercial use that might fit your needs. Sorry I can't be more helpful than that. Maybe someone else reading this forum can help.
