## Multiple nested loops and 2D arrays in OpenMP


### Multiple nested loops and 2D arrays in OpenMP

I am trying to optimize the following structure (see below), which is an algorithm for sharpening images. However, I have not managed to improve it very much, and I was wondering whether the access to the 2D arrays could be done more efficiently in OpenMP.
`nrows` and `ncolumns` are big numbers; however, `rowMin`, `rowMax` and `columnMin`, `columnMax` are generally -1 and 1.

```cpp
#pragma omp for schedule(dynamic,1)
for (int i = 0; i < nrows; i++)
{
  int rowMin = -min(depth, i);
  int rowMax = min(depth, (hight-i-1));

  for (int j = 0; j < ncolumns; j++)
  {
    int columnMin = -min(depth, j);
    int columnMax = min(depth, (width-j-1));
    int count = 0;

    for (int irow = rowMin; irow <= rowMax; irow++)
    {
      for (int jcol = columnMin; jcol <= columnMax; jcol++)
      {
        count++;
        for (int k = 0; k < color; k++)
        {
          result[i][j * color + k] += array[i+irow][(j+jcol) * color + k];
        }
      }
    }
    for (int k = 0; k < color; k++)
    {
      result[i][j * color + k] -= array[i][j * color + k];
      result[i][j * color + k] /= (count-1);
    }
  }
}
```

I would appreciate any ideas to get better performance here. Is there a better pattern to access `result` and `array`?

Chyntia
luiceur

Posts: 3
Joined: Thu Nov 19, 2015 8:58 am

### Re: Multiple nested loops and 2D arrays in OpenMP

luiceur wrote: I have not managed to improve it very much

The basic structure (i.e. parallel loop over the first array dimension) is correct.
Is it the parallel speedup that is poor, or the sequential performance (or both)?
For large arrays, memory bandwidth is likely to be the limiting factor.
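
A rough back-of-envelope estimate suggests why bandwidth is a plausible limit. It assumes `depth` = 1, single-precision (4-byte) elements, and that each element of `array` and `result` moves between memory and cache only once; none of these are stated in the thread, so treat the numbers as illustrative.

$$
\text{flops per output value} \approx \underbrace{9}_{\text{adds}} + \underbrace{1}_{\text{subtract}} + \underbrace{1}_{\text{divide}} = 11
$$

$$
\text{bytes per output value} \approx \underbrace{4}_{\text{read array}} + \underbrace{4}_{\text{read result}} + \underbrace{4}_{\text{write result}} = 12
$$

$$
\text{arithmetic intensity} \approx \frac{11}{12} \approx 0.9\ \text{flop/byte}
$$

Under these assumptions the loop performs less than one floating-point operation per byte of memory traffic, so adding cores would tend to saturate the memory system before the arithmetic units.
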
MarkB

Posts: 801
Joined: Thu Jan 08, 2009 10:12 am
Location: EPCC, University of Edinburgh

### Re: Multiple nested loops and 2D arrays in OpenMP

The parallel efficiency is about 71% on 24 cores.

I guess what I would like to know is whether there is a way of optimizing the cache use when accessing the 2D arrays, or whether this is all I can do using OpenMP.
luiceur

Posts: 3
Joined: Thu Nov 19, 2015 8:58 am

### Re: Multiple nested loops and 2D arrays in OpenMP

It might be possible to improve the performance by blocking the outer loops. Also, the innermost loops may not vectorise very well, as they are mostly of length 3 (I guess `color` = 3?). But these are sequential optimisations, not directly related to OpenMP.
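
One possible restructuring that addresses the short inner loop is to interchange the loops so that the innermost loop runs over a whole row of `ncolumns * color` contiguous values instead of over `color`. The following is only a sketch of that idea, not the poster's code: it assumes flat row-major storage in `std::vector<float>`, assumes `hight` and `width` in the post equal `nrows` and `ncolumns`, overwrites `result` rather than accumulating into it, and uses the hypothetical name `neighbour_mean_rowwise`.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Sketch only: storage layout and function name are assumptions (see text).
// For every pixel and channel, computes the mean of the neighbours inside the
// (2*depth+1)^2 window clipped to the image, excluding the pixel itself.
void neighbour_mean_rowwise(const std::vector<float>& array,
                            std::vector<float>& result,
                            int nrows, int ncolumns, int color, int depth)
{
    const std::size_t stride = static_cast<std::size_t>(ncolumns) * color;

    #pragma omp parallel for schedule(static)
    for (int i = 0; i < nrows; i++) {
        const int rowMin = -std::min(depth, i);
        const int rowMax = std::min(depth, nrows - i - 1);
        float* out = result.data() + i * stride;
        std::fill(out, out + stride, 0.0f);

        for (int irow = rowMin; irow <= rowMax; irow++) {
            const float* in = array.data() + (i + irow) * stride;
            for (int jcol = -depth; jcol <= depth; jcol++) {
                // Columns j whose neighbour j + jcol lies inside the image.
                const int jBegin = std::max(0, -jcol);
                const int jEnd   = std::min(ncolumns, ncolumns - jcol);
                const int shift  = jcol * color;
                // One long unit-stride loop over all channels of all valid columns.
                #pragma omp simd
                for (int x = jBegin * color; x < jEnd * color; x++)
                    out[x] += in[x + shift];
            }
        }

        // Remove the centre value and divide by the number of neighbours,
        // which matches (count - 1) in the original loop nest.
        const float* centre = array.data() + i * stride;
        const int rowCount = rowMax - rowMin + 1;
        for (int j = 0; j < ncolumns; j++) {
            const int colCount = std::min(depth, j)
                               + std::min(depth, ncolumns - j - 1) + 1;
            const float norm = static_cast<float>(rowCount * colCount - 1);
            for (int k = 0; k < color; k++) {
                const int x = j * color + k;
                out[x] = (out[x] - centre[x]) / norm;
            }
        }
    }
}
```

Whether this is actually faster depends on the compiler and the machine, so it would need to be measured against the original loop order. If the kernel is bandwidth bound, better vectorisation alone may change little.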

How much memory do the arrays typically occupy? And are the 24 cores split across two sockets? If so, then the placement of data in memory could be an issue. But since the problem is likely to be memory (or L3 cache) bandwidth limited, 71% efficiency is maybe all you can hope for.
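
If the 24 cores do span two sockets, one thing worth checking is where the pages of `array` and `result` end up. On systems with a first-touch policy (the usual default on Linux), a page is placed on the NUMA node of the thread that first writes to it, so arrays initialised serially can end up entirely on one socket. A minimal sketch of parallel initialisation, assuming flat row-major storage; `allocate_first_touch` and `rowLength` are hypothetical names, not from the thread:

```cpp
#include <cstddef>
#include <memory>

// Sketch only: allocate without writing to the memory, then let each thread
// initialise the rows it will later process, so that under a first-touch
// policy those pages are placed close to that thread.
std::unique_ptr<float[]> allocate_first_touch(int nrows, int rowLength)
{
    const std::size_t n = static_cast<std::size_t>(nrows) * rowLength;

    // new float[n] (without "()") leaves the elements uninitialised, so the
    // allocation itself typically does not touch the pages. A std::vector
    // would zero every element on the calling thread instead.
    std::unique_ptr<float[]> data(new float[n]);

    // Same static row distribution that the compute loop would have to use.
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < nrows; i++)
        for (int x = 0; x < rowLength; x++)
            data[static_cast<std::size_t>(i) * rowLength + x] = 0.0f;

    return data;
}
```

For this to help, the compute loop would need the same `schedule(static)` distribution of rows (the original `schedule(dynamic,1)` hands each row to whichever thread is free), and the threads should be pinned, for example by setting `OMP_PROC_BIND=true`.
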
MarkB

Posts: 801
Joined: Thu Jan 08, 2009 10:12 am
Location: EPCC, University of Edinburgh