I'm having a problem parallelizing my program with OpenMP. The first for loop takes about 10 milliseconds, but the second takes about 45 seconds. I'm not sure whether I'm doing something wrong in the second loop that wastes time.

float A[M][M];
float B[M][M];
float C[M][M];

main(int argc, char** argv) {
float temp;
float real;
float error = 0;
int i,j,k;
double time_start;
double time_end;
double time_mid;
int n  = 12;

omp_set_num_threads(n);
time_start = omp_get_wtime();


#pragma omp parallel default(shared) private(i,j,k,temp,real) reduction(+:error)
#pragma omp for
for (i=0; i<M; i++) {
        for (j=0; j<M; j++) {
                A[i][j] = ((i+1)*(j+1))/(float)M;
                B[i][j] = (j+1)/(float)(i+1);
        }
}

time_mid = omp_get_wtime();
#pragma omp for
for (i=0; i<M; i++) {
        for (j=0; j<M; j++) {
                temp = 0;
                for (k=0; k<M; k++) {
                        temp += A[i][k]*B[k][j];
                }
                C[i][j] = temp;
                real =(float) (i+1)*(j+1);
                error = error + (float) fabs(temp-real)/real;
        }
}


time_end = omp_get_wtime();
error = (100/(float)(M*M))*error;

printf("Percent error for C[][] is: %f\n", error);
printf("Time is: %f\n%f\n%f\n%f\n", time_end-time_start, time_start, time_mid, time_end);

return 0;
}
  • Your parallel directive covers only the first loop. You're missing curly braces to create a block for the directive, so the second loop runs sequentially. Commented Oct 16, 2018 at 5:54
  • So simple. Thanks. After that it dropped from 45 s to 5 s. Commented Oct 16, 2018 at 6:06

1 Answer

From the OpenMP 4.5 specification (page 35, Section 2.1, Directive Format, C/C++):
https://www.openmp.org/wp-content/uploads/openmp-4.5.pdf

An OpenMP executable directive applies to at most one succeeding statement, which must be a structured block.

In C++, a block (compound statement, see stmt.block) is a sequence of statements enclosed in braces.

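Therefore #pragma omp parallel default(shared) private(i,j,k,temp,real) reduction(+:error) applies only to the statement that immediately follows it: the first #pragma omp for together with its loop.

The second loop is not inside a '#pragma omp parallel' region, so its #pragma omp for is executed by a single thread.

Put an opening brace on the line after #pragma omp parallel and a closing brace after the second loop, so that the parallel region encloses both loops.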
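
As a minimal sketch of that fix (not the asker's exact program), the code below wraps both loops in one braced parallel region. The value of M and the function name matmul_percent_error are assumptions made for illustration, since the question does not show how M is defined. The timing calls are left out: inside a parallel region every thread would execute them, so they would need extra care. Compile with your compiler's OpenMP flag (for example -fopenmp with GCC); without it the pragmas are ignored and the code runs sequentially.

```c
#include <math.h>

#define M 256  /* assumed size; the question does not show its value */

static float A[M][M], B[M][M], C[M][M];

/* Fills A and B, computes C = A*B, and returns the percent error of C
 * against the exact product (i+1)*(j+1).
 * The function name is hypothetical, chosen for this sketch. */
float matmul_percent_error(void)
{
    float error = 0.0f;

    /* The braces make ONE parallel region that spans both loops. */
    #pragma omp parallel default(shared) reduction(+:error)
    {
        #pragma omp for
        for (int i = 0; i < M; i++) {
            for (int j = 0; j < M; j++) {
                A[i][j] = ((i + 1) * (j + 1)) / (float)M;
                B[i][j] = (j + 1) / (float)(i + 1);
            }
        }
        /* Implicit barrier here: A and B are complete before use. */

        #pragma omp for
        for (int i = 0; i < M; i++) {
            for (int j = 0; j < M; j++) {
                float temp = 0.0f;  /* declared in the loop, so private */
                for (int k = 0; k < M; k++) {
                    temp += A[i][k] * B[k][j];
                }
                C[i][j] = temp;
                float real = (float)(i + 1) * (j + 1);
                error += fabsf(temp - real) / real;
            }
        }
    }  /* per-thread copies of error are summed here */

    return (100.0f / ((float)M * M)) * error;
}
```

Declaring the loop counters and temporaries inside the loops makes them private automatically, so the private(i,j,k,temp,real) clause from the question is no longer needed in this variant.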
