
I want to parallelize the following code with OpenMP:

  for(i=0; i<n; i++)
    {
        int sum=0;
        for(j=0; j<m; j++)
        {
            sum += A[i][j]*x[j];
        }
        y[i]=sum;
    }

Would it work if I just add #pragma omp parallel for at the top? Or are there other (better) ways?

  • Yes, #pragma omp parallel for on the outer loop is OK. – Commented Nov 19, 2013 at 6:10

1 Answer


A #pragma omp parallel for on your outer loop is fine; you should get the intended result with the code below:

#pragma omp parallel for private(i, j) shared(y, n, m, A, x)
for(i=0; i<n; i++)
{
    int sum=0;   /* declared inside the loop body, so it is already private to each thread */
    for(j=0; j<m; j++)
    {
        sum += A[i][j]*x[j];
    }
    y[i]=sum;
}

Note that to see any noticeable speed-up, n will have to be fairly large. For small n, performance can actually degrade because of thread-management overhead and a phenomenon called false sharing (adjacent y[i] elements written by different threads can land on the same cache line).

As a final note, if your ultimate intention is to sum up the elements of y, you can make use of the OpenMP reduction clause.


1 Comment

I doubt his final intention is to sum up y, since this is obviously a dense matrix-vector multiplication code. MVM is known to be memory- rather than compute-bound: any combination of n and m that makes A[][] larger than the last-level cache will result in little to no parallel speed-up.
