The most straightforward way I can think of to do this, although perhaps not the most efficient (especially if your matrix is huge), is to convert your matrix to a one-dimensional array, and then have corresponding arrays for the partition group indices X and Y. You can then group by the partition group indices and finally restructure the matrix back into its original form.
For example, if your matrix is
>>> import numpy as np
>>> import pandas as pd
>>> M1 = np.arange(25).reshape((5,5))
>>> M1
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])
and your partitions are
>>> def f(x):
...     return np.array([1,1,1,2,2])[x]
>>> def g(y):
...     return np.array([3,4,4,4,5])[y]
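Because f and g are written as lookups into a NumPy array, they accept a whole array of indices as well as a single index, which is what the later steps rely on. A quick check, using the same definitions (the variable names rows, row_groups, col_groups and single are only for illustration):

```python
import numpy as np

def f(x):
    # row partition: rows 0-2 belong to group 1, rows 3-4 to group 2
    return np.array([1, 1, 1, 2, 2])[x]

def g(y):
    # column partition: column 0 -> group 3, columns 1-3 -> group 4, column 4 -> group 5
    return np.array([3, 4, 4, 4, 5])[y]

rows = np.arange(5)
row_groups = f(rows)   # one group label per row index
col_groups = g(rows)   # one group label per column index
single = f(3)          # a scalar index works too
```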
From that point, there are several ways to implement the reshaping and subsequent grouping. You can do it with Pandas, for instance, by constructing a DataFrame and using its stack() method to "stack" all the rows on top of each other in a single column, indexed by their original row and column indices.
>>> st = pd.DataFrame(M1).stack().to_frame('M1')
>>> st
     M1
0 0   0
  1   1
  2   2
  3   3
  4   4
1 0   5
...
4 3  23
  4  24
(I have truncated the output for readability, and I trust that you can evaluate the rest of these examples yourself if you want to see their output.) You can then add columns representing the partition group indices:
>>> st['X'] = f(st.index.get_level_values(0))
>>> st['Y'] = g(st.index.get_level_values(1))
Then you can group by those indices and apply your aggregation function of choice.
>>> stp = st.groupby(['X', 'Y']).agg(p)
You will have to define p (or find an existing definition) such that it takes a one-dimensional array of values (Pandas passes each group to it as a Series) and returns a single number. If you want to use something like sum(), you can just use st.groupby(...).sum() because Pandas has built-in support for that and a few other standard functions, but agg is general and works for any reduction function p you can provide.
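As a concrete illustration of the grouping step, here is a minimal sketch that assumes p is the range (maximum minus minimum) of each block. The name value_range is made up for this example, and the group labels are stored in plain arrays instead of going through f and g; any function that reduces a group to a single number could be substituted.

```python
import numpy as np
import pandas as pd

M1 = np.arange(25).reshape((5, 5))
row_labels = np.array([1, 1, 1, 2, 2])   # same labels that f returns
col_labels = np.array([3, 4, 4, 4, 5])   # same labels that g returns

# One row per matrix element, indexed by (row, column)
st = pd.DataFrame(M1).stack().to_frame('M1')
st['X'] = row_labels[st.index.get_level_values(0).to_numpy()]
st['Y'] = col_labels[st.index.get_level_values(1).to_numpy()]

def value_range(values):
    # hypothetical p: receives the values of one (X, Y) group as a Series
    return values.max() - values.min()

# Reduce each (X, Y) group to a single number, then reshape to 2D
stp = st.groupby(['X', 'Y'])['M1'].agg(value_range)
result = stp.unstack().to_numpy()
```

Selecting the 'M1' column before calling agg keeps the reduction from also being applied to the X and Y label columns.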
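Finally, the unstack() method will convert the DataFrame back into the properly 2D "matrix form", and then if you want you can use the to_numpy() method to turn it back into a pure NumPy array. (Older answers use as_matrix() here, but that method was deprecated and has been removed from current versions of Pandas.) With sum as the reduction p, the result is:
>>> M3 = stp.unstack().to_numpy()
>>> M3
array([[ 15,  63,  27],
       [ 35, 117,  43]])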
If you don't want to bring in Pandas, there are other libraries that do the same thing. You might look at numpy-groupies, for example. However, I haven't found any library that does true two-dimensional grouping, which you might need if you are working with very large matrices, large enough that having an additional 2 or 3 copies of them would exhaust the available memory.
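If memory is the main concern, one option is to skip the flattening entirely and loop over the blocks directly in NumPy. The following is only a sketch: grouped_reduce is a hypothetical helper, not a function from any library. It assumes the number of groups is small compared to the matrix size, and unlike the Pandas version it passes p a two-dimensional block instead of a one-dimensional array.

```python
import numpy as np

def grouped_reduce(M, row_labels, col_labels, p):
    """Apply p to every block of M defined by the row and column labels."""
    row_labels = np.asarray(row_labels)
    col_labels = np.asarray(col_labels)
    row_keys = np.unique(row_labels)   # sorted distinct row groups
    col_keys = np.unique(col_labels)   # sorted distinct column groups
    out = np.empty((row_keys.size, col_keys.size), dtype=float)
    for i, r in enumerate(row_keys):
        rows = np.flatnonzero(row_labels == r)
        for j, c in enumerate(col_keys):
            cols = np.flatnonzero(col_labels == c)
            # np.ix_ selects the rows-by-cols block; only this block is copied
            out[i, j] = p(M[np.ix_(rows, cols)])
    return out

M1 = np.arange(25).reshape((5, 5))
M3 = grouped_reduce(M1, [1, 1, 1, 2, 2], [3, 4, 4, 4, 5], np.sum)
```

The Python-level loop runs once per block, so this trades some speed for holding only one block copy in memory at a time.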
Do f and g only work with scalar inputs? Ideally, to use numpy you want to write these in a way that works with an array (could be 1d) of values, returning an array of matching size. Otherwise you are stuck with iterating, in one way or other, over elements of M1. What do you hope to gain by skipping M2?