I have a relatively simple loop where I'm calculating the net acceleration of a system of particles using a brute-force method.
I have a working OpenMP loop which loops over each particles and compares it to every other particles for an n^2 complexity here:
!$omp parallel do private(i) shared(bodyArray, n) default(none)
do i = 1, n
!acc is real(real64), dimension(3)
bodyArray(i)%acc = bodyArray(i)%calcNetAcc(i, bodyArray)
end do
which works just fine.
What I'm trying to do now is to reduce my calculation time by only computing the force on each body once using the fact that the force from F(a->b) = -F(b->a), reducing the number of interactions to calculate by half (n^2 / 2). Which I do in this loop:
call clearAcceleration(bodyArray) !zero out acceleration
!$omp parallel do private(i, j) shared(bodyArray, n) default(none)
do i = 1, n
do j = i, n
if ( i /= j .and. j > i) then
bodyArray(i)%acc = bodyArray(i)%acc + bodyArray(i)%accTo(bodyArray(j))
bodyArray(j)%acc = bodyArray(j)%acc - bodyArray(i)%acc
end if
end do
end do
But I'm having a lot of difficulty with this parallelizing this loop, I keep getting junk results. I think it has to do with this line:
bodyArray(j)%acc = bodyArray(j)%acc - bodyArray(i)%acc
and that the forces are not being added up properly with all the different 'j' writing to it. I've tried using the atomic statement, but that's not allowed on array variables. So then I tried critical, but that increases the time it takes by about 20, and still doesn't give correct results. I also tried adding an ordered statement, but then I just get NaN for all my results. Is there an easy fix to get this loop working with OpenMP?
Working code, it has a slight speed improvement but not the ~2x I was looking for.
!$omp parallel do private(i, j) shared(bodyArray, forces, n) default(none) schedule(guided)
do i = 1, n
do j = 1, i-1
forces(j, i)%vec = bodyArray(i)%accTo(bodyArray(j))
forces(i, j)%vec = -forces(j, i)%vec
end do
end do
!$omp parallel do private(i, j) shared(bodyArray, n, forces) schedule(static)
do i = 1, n
do j = 1, n
bodyArray(i)%acc = bodyArray(i)%acc + forces(j, i)%vec
end do
end do