
I'm somewhat new to MPI in Fortran. I have an MPI code where each processor runs Ns simulations, so at the end I should have (mysize x Ns x 2) results: for each simulation on each proc I create a 2-D array called PHI and a second array PHI^2 obtained by squaring each element of PHI.

Then, after all Ns simulations, each proc computes a PHI_AVG_RANK array, which is simply SUM(PHI)/Ns, and similarly a PHI^2_AVG_RANK array from the PHI^2 arrays.

I want to send all of the resulting PHI_AVG_RANK matrices coming from the processors (a total of mysize matrices) to a mother processor through a reduction sum, and likewise for PHI^2_AVG_RANK, so that I can average again over mysize. The reason is that I want to compute the RMS matrix over all (mysize x Ns) realizations, that is, sqrt(SUM(PHI^2_AVG_RANK)/mysize - (SUM(PHI_AVG_RANK)/mysize)^2), and then save it to a txt file.
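Concretely, with s1 and s2 denoting the elementwise sums of the per-rank averages over all mysize ranks (placeholder names, not my actual code), the final step on the mother processor would be:

    ! Placeholder sketch, assuming 100x100 default-real matrices:
    ! s1 = elementwise sum over all ranks of PHI_AVG_RANK
    ! s2 = elementwise sum over all ranks of PHI^2_AVG_RANK
    real, dimension(100,100) :: s1, s2, rms

    rms = sqrt(s2/real(mysize) - (s1/real(mysize))**2)   ! elementwise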

To do so, which datatype should be used: Contiguous, Vector, or Subarray? And is MPI_REDUCE the right call to make here?

This is my plan so far (a piece of code that runs after all the Ns simulations; I end up with a 100x100 matrix called phi_moyen1_2 on each processor and want to sum them all into a new 100x100 matrix called mean_2_025, then save it):

    call MPI_BARRIER(MPI_COMM_WORLD,ierr)
    call MPI_TYPE_CONTIGUOUS(100,MPI_REAL,row,ierr)
    call MPI_TYPE_CONTIGUOUS(100,row,matrix,ierr)
    if (myrank==0) then
        call MPI_REDUCE(phi_moyen1_2,mean_2_025,1,matrix,MPI_SUM,0,MPI_COMM_WORLD,ierr)
        open(unit=1234, file='../results/PHI1/TESTE/teste.txt')
            do i=0,Nx-1
                write(ligne, *) mean_2_025(i,:)
                write(1234,'(a)') trim(ligne)
            end do
        close(unit=1234)
    endif

EDIT: After implementing the suggestion by @David Henty, we don't need the CONTIGUOUS datatypes. We can actually do it directly, without any intermediate datatypes or COMMIT calls, since the elements of a Fortran array are already laid out contiguously in memory. Then I did the following:

    if (myrank==0) then
        call MPI_REDUCE(phi_moyen1_2,mean_2_025,100*100,MPI_DOUBLE,MPI_SUM,0,MPI_COMM_WORLD,ierr)
        mean_2_025=mean_2_025/(mysize)
        write(*,*) mean_2_025(1,1)

But the program does not end (as if it were stuck in an infinite loop), and it does not print anything to the output (it was supposed to print nprocs values for the first entry of the matrix mean_2_025 because of the write above). I've added some cpu_time calls at the end of the program and they show nprocs CPU times, so every processor does get to the end?

EDIT, SOLVED: As @Vladimir F pointed out, the collective call MPI_REDUCE must be made by all processors (even though a root processor is named inside the call). Thus it cannot be inside an if clause on the rank: the other processors never reached the REDUCE, which is what caused the hang.
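For future readers, a minimal sketch of the corrected structure (declarations and the simulation loop omitted; MPI_DOUBLE_PRECISION, the standard Fortran name, is assumed here for a double precision array):

    ! Every rank makes the collective call; only root uses the result.
    call MPI_REDUCE(phi_moyen1_2, mean_2_025, 100*100, MPI_DOUBLE_PRECISION, &
                    MPI_SUM, 0, MPI_COMM_WORLD, ierr)
    if (myrank == 0) then
        mean_2_025 = mean_2_025/mysize   ! average over the ranks
        write(*,*) mean_2_025(1,1)
    end if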

Thanks to everyone.

2 Comments

  • A lot needs to be clarified. Please show your code, it is much better than describing things in words and trying to interpret what you mean. See How to Ask. Commented Jul 21, 2017 at 10:31
  • @VladimirF the code is very long but I've updated the question with the relevant part, that is, the reduce part. Commented Jul 21, 2017 at 10:57

1 Answer


All you need to do is specify the type as MPI_REAL and the count as 100*100. For a reduce, the reduction is done separately for each element of the array, so this will do exactly what you want: for all 100*100 values of i and j, mean_2_025(i,j) on rank 0 will be the sum across all ranks of phi_moyen1_2(i,j).

    call MPI_REDUCE(phi_moyen1_2,mean_2_025,100*100,MPI_REAL,MPI_SUM,0,MPI_COMM_WORLD,ierr)

To get the average, just divide by the communicator size (mysize). On a technical note, you don't need the barrier as MPI does all the synchronisation you require inside the collective.

Using datatypes is overcomplicating things here. You would need to commit them first but, more importantly, the reduction operation won't know what to do with data of type "matrix" unless you define and register your own reduction operation.
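For illustration only, here is roughly what the derived-type route would entail (matrix_sum is a hypothetical user-written subroutine with the standard MPI user-function interface, adding invec to inoutvec element by element; this is exactly the extra machinery the plain MPI_REAL reduce avoids):

    ! Sketch only -- not recommended here. The derived type must be
    ! committed, and MPI_SUM does not apply to it, so a user-defined
    ! operation has to be registered as well.
    call MPI_TYPE_CONTIGUOUS(100*100, MPI_REAL, matrix, ierr)
    call MPI_TYPE_COMMIT(matrix, ierr)
    call MPI_OP_CREATE(matrix_sum, .true., sum_op, ierr)
    call MPI_REDUCE(phi_moyen1_2, mean_2_025, 1, matrix, sum_op, &
                    0, MPI_COMM_WORLD, ierr)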


4 Comments

Thanks, I did not know that. I've implemented it; the code was supposed to run in seconds but now it never finishes, as if it were inside an infinite loop. I've done the following: if (myrank==0) then call MPI_REDUCE(phi_moyen1_2,mean_2_025,100*100,MPI_DOUBLE,MPI_SUM,0,MPI_COMM_WORLD,ierr) mean_2_025=mean_2_025/(mysize) write(*,*) mean_2_025(1,1). But it does not print anything into the output. I've put some write calls before and after the REDUCE call and they show that each processor goes through the call but apparently does nothing.
Your use of REDUCE is wrong: all ranks in the communicator must call it. Read up on collective operations. There can't be any if (myrank==...) around it.
@VladimirF thanks for the patience. Sorry for the ignorance. Now everything worked out fine. Thanks David as well.
Well spotted @VladimirF - I read the code too quickly and thought that the "if" was only around the IO statements.
