I'm having a problem parallelizing my program with OpenMP. The first for loop takes about 10 milliseconds, but the second takes about 45 seconds. I'm not sure whether I'm doing something wrong in the second loop that wastes time.
float A[M][M];
float B[M][M];
float C[M][M];

main(int argc, char** argv) {
    float temp;
    float real;
    float error = 0;
    int i,j,k;
    double time_start;
    double time_end;
    double time_mid;
    int n = 12;

    omp_set_num_threads(n);
    time_start = omp_get_wtime();

    #pragma omp parallel default(shared) private(i,j,k,temp,real) reduction(+:error)
    #pragma omp for
    for (i=0; i<M; i++) {
        for (j=0; j<M; j++) {
            A[i][j] = ((i+1)*(j+1))/(float)M;
            B[i][j] = (j+1)/(float)(i+1);
        }
    }
    time_mid = omp_get_wtime();

    #pragma omp for
    for (i=0; i<M; i++) {
        for (j=0; j<M; j++) {
            temp = 0;
            for (k=0; k<M; k++) {
                temp += A[i][k]*B[k][j];
            }
            C[i][j] = temp;
            real =(float) (i+1)*(j+1);
            error = error + (float) fabs(temp-real)/real;
        }
    }
    time_end = omp_get_wtime();

    error = (100/(float)(M*M))*error;
    printf("Percent error for C[][] is: %f\n", error);
    printf("Time is: %f\n%f\n%f\n%f\n", time_end-time_start, time_start, time_mid, time_end);
    return 0;
}
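Your parallel directive covers only the first loop. Without curly braces, a parallel region applies only to the single statement that follows it, which here is the first omp for loop. You're missing the braces that would create a block containing both loops. The second omp for therefore sits outside any parallel region and is executed by a single thread, so loop number 2 is sequential.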
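As a minimal sketch of that fix, the version below wraps both loops in one braced parallel region. The value of M, the function name matmul_percent_error, and the loop-local declarations are assumptions made for illustration; the question does not state M, and the timing calls are left out to keep the example short. The `_OPENMP` guard only lets the file build when OpenMP is not enabled.

```c
#include <math.h>
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

#define M 200  /* assumed size; the question does not say what M is */

static float A[M][M];
static float B[M][M];
static float C[M][M];

/* Hypothetical helper: fills A and B, computes C = A*B, and returns the
 * mean relative error of C in percent, as in the question's main(). */
float matmul_percent_error(void)
{
    float error = 0.0f;

    /* The braces create ONE parallel region that contains both loops. */
    #pragma omp parallel default(shared) reduction(+:error)
    {
        #pragma omp for
        for (int i = 0; i < M; i++) {
            for (int j = 0; j < M; j++) {
                A[i][j] = ((i + 1) * (j + 1)) / (float)M;
                B[i][j] = (j + 1) / (float)(i + 1);
            }
        }
        /* The implicit barrier at the end of "omp for" guarantees that
         * A and B are fully initialised before the second loop starts. */

        #pragma omp for
        for (int i = 0; i < M; i++) {
            for (int j = 0; j < M; j++) {
                /* Declared inside the loop, so private to each thread. */
                float temp = 0.0f;
                for (int k = 0; k < M; k++) {
                    temp += A[i][k] * B[k][j];
                }
                C[i][j] = temp;
                float real = (float)(i + 1) * (j + 1);
                error += fabsf(temp - real) / real;
            }
        }
    }

    return (100.0f / (float)(M * M)) * error;
}
```

Calling matmul_percent_error() from main and printing the result reproduces the question's output line. With GCC or Clang, OpenMP is enabled with -fopenmp; without that flag the pragmas are ignored and the same code runs on one thread. If you still want a timestamp between the two loops, note that any statement placed inside the region is executed by every thread unless it is wrapped in something like omp single.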