Unstable running time for a simple matrix multiplication pro

General OpenMP discussion

Unstable running time for a simple matrix multiplication pro

Postby timtimac » Mon Dec 08, 2008 9:46 pm

I notice there was a similar topic before, but nobody follow up and the reason and solution don't seem clear. I tried another example and have a similar problem: elapsed time not stable. I wrote a very straightforward matrix multiplication program (source code attached after the message), and compile it using GCC 4.2.4 (on Ubuntu) with the -fopenmp flag, when running it on a 2-way Intel Xeon (totally 8 cores) machine, its running time varies between 2 seconds and 15 seconds. I mean it is either around 2 seconds or around 15 seconds (not anything in between). And the serial code runs around 15 seconds.

I used 8 threads and static schedule for "parallel for loop". Does anybody have a idea why the running time is not stable? Would it have something to do with cache? Please help. Thanks.

================= begin of matrix-omp.c ======================
#include<stdio.h>
#include<stdlib.h>
#include<sys/time.h>

#define SIZE 1000

int main(void) {
int i,j,k;
float **A, **B, **C;
struct timeval start, finish;

/* init matrix A and B */
A = (float**) malloc(sizeof(float *)*SIZE);
B = (float**) malloc(sizeof(float *)*SIZE);
C = (float**) malloc(sizeof(float *)*SIZE);
for (i=0;i<SIZE;i++) {
A[i] = (float*) malloc(sizeof(float)*SIZE);
B[i] = (float*) malloc(sizeof(float)*SIZE);
C[i] = (float*) malloc(sizeof(float)*SIZE);
}
for (i=0; i<SIZE; i++)
for (j=0; j<SIZE; j++) {
A[i][j] = 1.0;
B[i][j] = 2.0;
C[i][j] = 0;
}
gettimeofday(&start, 0);
#pragma omp parallel for num_threads(8) private(j,k) schedule(static)
for (i=0; i<SIZE; i++)
for (j=0; j<SIZE; j++)
for (k=0; k<SIZE; k++)
C[i][j] += A[i][k] * B[k][j];
gettimeofday(&finish, 0);
printf("OpenMP : Elapsed time: %10.2f seconds\n",
(finish.tv_sec + finish.tv_usec*1e-6)-(start.tv_sec + start.tv_usec*1e-6));
free(A);
free(B);
free(C);
return EXIT_SUCCESS;
}
======================end==============================

Cheers,
Tim
timtimac
 

Re: Unstable running time for a simple matrix multiplication pro

Postby timtimac » Mon Dec 08, 2008 9:48 pm

The title was meant to be "Unstable running time for a simple matrix multiplication program" not "pro". It trims maybe when it is too long.
timtimac
 

Re: Unstable running time for a simple matrix multiplication pro

Postby timtimac » Wed Dec 10, 2008 5:37 pm

no body has the same problem before? Please help.
timtimac
 

Re: Unstable running time for a simple matrix multiplication pro

Postby Cuchulainn » Mon Dec 15, 2008 3:48 am

running time is not stable
what is this precisely?

There are some comments on this code:

for (i=0; i<SIZE; i++)
for (j=0; j<SIZE; j++)
for (k=0; k<SIZE; k++)
C[i][j] += A[i][k] * B[k][j];

1. It uses a combination of row major and column major addressing (i believe C is RM) which might be the cause of caching problems.
2. Instead of 2d arrays, consider them to 1d arrays (using offsets etc.) This should improve the performance although you need to test it.
3. Small remarks: ++i is bit faster? and why not use omp timing routines?
(4. standard loop optimisatioon techniques might be useful, see Wiki)
Cuchulainn
 


Return to Using OpenMP

Who is online

Users browsing this forum: Google [Bot] and 1 guest

cron