I was trying to parallelize a code but it only deteriorated the performance. I wrote a Fortran code which runs several Monte Carlo integrations and then finds their mean.
implicit none
integer, parameter :: n=100
integer, parameter :: m=1000000
real, parameter :: pi=3.141592654
real MC,integ,x,y
integer tid,OMP_GET_THREAD_NUM,i,j,init,inside
read*,init
call OMP_SET_NUM_THREADS(init)
call random_seed()
!$OMP PARALLEL DO PRIVATE(J,X,Y,INSIDE,MC)
!$OMP& REDUCTION(+:INTEG)
do i=1,n
inside=0
do j=1,m
call random_number(x)
call random_number(y)
x=x*pi
y=y*2.0
if(y.le.x*sin(x))then
inside=inside+1
endif
enddo
MC=inside*2*pi/m
integ=integ+MC/n
enddo
!$OMP END PARALLEL DO
print*, integ
end
As I increase the number of threads, run-time increases drastically. I have looked for solutions for such problems and in most cases shared memory elements happen to be the problem but I cannot see how it is affecting my case.
I am running it on a 16 core processor using Intel Fortran compiler.
EDIT: The program after adding implicit none, declaring all variables and adding the private clause
inside? Where does it come from? Please make sure you useIMPLICIT NONE.insideis the number of random points which lie under the curvex*sin(x)where x goes from 0 to pi. The total area in whichmrandom points are generated is 2*pi so the area under the curve isinside*(2*pi)/mwhich is stored inMC