
I want to parallelize the following code with OpenMP:

  for(i=0; i<n; i++)
    {
        int sum=0;
        for(j=0; j<m; j++)
        {
            sum += A[i][j]*x[j];
        }
        y[i]=sum;
    }

Would it work if I just add #pragma omp parallel for at the top? Or are there other (better) ways?

  • Yes, #pragma omp parallel for on the outer loop is OK. – Commented Nov 19, 2013 at 6:10

1 Answer


A #pragma omp parallel for on your outer loop is fine; you should get the intended result with the code below:

#pragma omp parallel for private(i, j) shared(y, n, m, A, x)
for(i=0; i<n; i++)
{
    int sum=0;   /* declared inside the loop body, so it is already private to each thread */
    for(j=0; j<m; j++)
    {
        sum += A[i][j]*x[j];
    }
    y[i]=sum;
}

Note that to see any noticeable speed-up, n will have to be fairly large. For small n, performance can actually degrade because of thread-management overhead and a phenomenon called false sharing (adjacent y[i] elements written by different threads can land on the same cache line).

As a final note, if your ultimate intention is to sum up the elements of y, you can make use of the OpenMP reduction clause.


1 Comment

I doubt his final intention is to sum up y, since this is obviously a dense matrix-vector multiplication code. MVM is known to be memory- rather than compute-bound: any combination of n and m that makes A[][] larger than the last-level cache will result in little to no parallel speed-up.
