
Given a source array

import numpy as np
src = np.random.rand(320, 240)

and an index array

idx = np.indices(src.shape).reshape(2, -1)
np.random.shuffle(idx.T)

we can map the linear index i in src to the 2-dimensional index idx[:,i] in a destination array dst via

dst = np.empty_like(src)
dst[tuple(idx)] = src.ravel()

This is discussed in "Python: Mapping between two arrays with an index array".

However, if this mapping is not 1-to-1, i.e., multiple entries in src map to the same entry in dst, according to the docs it is unspecified which of the source entries will be written to dst:

For advanced assignments, there is in general no guarantee for the iteration order. This means that if an element is set more than once, it is not possible to predict the final result.
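
To make the ambiguity concrete, here is a minimal 1-dimensional sketch. The small arrays and the `collisions` mask are made up for illustration and are not part of the setup above; `np.bincount` is used only to flag destinations that are targeted more than once.

```python
import numpy as np

# Illustrative data: entries 1 and 2 of src both target dst[2].
src = np.array([10., 20., 30., 40.])
idx = np.array([0, 2, 2, 1])

dst = np.full(4, np.nan)
dst[idx] = src

# dst[2] now holds either 20 or 30; the docs give no guarantee which one.
# dst[3] was never targeted, so it keeps its NaN fill value.

# Flag destination cells that receive more than one source entry.
collisions = np.bincount(idx, minlength=dst.size) > 1
```

Only the cells flagged in `collisions` are affected; every other destination receives exactly one value or none.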

If we are additionally given a precedence array

p = np.random.rand(*src.shape)

how can we use p to disambiguate this situation, i.e., write the entry with highest precedence according to p?

1 Answer


Here is a method that uses a sparse matrix for sorting. It has a large overhead but scales better than argsort, presumably because it uses some radix-sort-like method (?). Duplicate indices that do not have the highest precedence are explicitly set to -1. We make the destination array one cell too big; the surplus cell serves as a trash can.

import numpy as np
from scipy import sparse

N = 2
idx = np.random.randint(0, N, (2, N, N))
prec = np.random.random((N, N))
src = np.arange(N*N).reshape(N, N)

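def f_sparse(idx, prec, src):
    # Flatten the 2-D destination indices into linear indices, one per source entry.
    idx = np.ravel_multi_index(idx, src.shape).ravel()
    # CSR matrix with one entry per row: row = source position, column = destination
    # index, value = precedence. Converting to CSC groups the entries by destination.
    sp = sparse.csr_matrix((prec.ravel(), idx, np.arange(idx.size+1)),
                           (idx.size, idx.size)).tocsc()
    # First position where the column pointer reaches the total entry count;
    # all columns from there on are empty.
    top = sp.indptr.argmax()
    # Maximum precedence per destination, repeated so it lines up with sp.data.
    mx = np.repeat(np.maximum.reduceat(sp.data, sp.indptr[:top]),
                   np.diff(sp.indptr[:top+1]))
    res = idx.copy()
    # Redirect every entry that is not the maximum of its group to index -1.
    res[sp.indices[sp.data != mx]] = -1

    # One surplus cell at the end (index -1) absorbs the discarded writes.
    dst = np.full((idx.size + 1,), np.nan)
    dst[res] = src.ravel()
    return dst[:-1].reshape(src.shape)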

print(idx)
print(prec)
print(src)
print(f_sparse(idx, prec, src))

Sample run:

[[[1 0]
  [1 0]]

 [[0 1]
  [0 0]]]
[[0.90995366 0.92095225]
 [0.60997092 0.84092015]]
[[0 1]
 [2 3]]
[[ 3.  1.]
 [ 0. nan]]
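
For comparison, the same grouping idea can be sketched with `np.lexsort` in place of the sparse matrix. This is a swapped-in alternative, not the method above, and the name `f_lexsort` is made up for illustration. It assumes the same argument layout as `f_sparse`; no claim is made about its relative speed.

```python
import numpy as np

def f_lexsort(idx, prec, src):
    # Flatten the 2-D destination indices into linear indices, one per source entry.
    flat = np.ravel_multi_index(idx, src.shape).ravel()
    # Sort by destination index, then by precedence within each destination.
    # np.lexsort treats the LAST key as the primary sort key.
    order = np.lexsort((prec.ravel(), flat))
    flat_sorted = flat[order]
    # The last element of each run of equal destinations has the highest precedence.
    is_last = np.ones(flat.size, dtype=bool)
    is_last[:-1] = flat_sorted[1:] != flat_sorted[:-1]
    winners = order[is_last]
    # Each destination now appears at most once, so the assignment is unambiguous.
    dst = np.full(src.size, np.nan)
    dst[flat[winners]] = src.ravel()[winners]
    return dst.reshape(src.shape)

# Same inputs as in the sample run above.
idx = np.array([[[1, 0], [1, 0]],
                [[0, 1], [0, 0]]])
prec = np.array([[0.90995366, 0.92095225],
                 [0.60997092, 0.84092015]])
src = np.arange(4).reshape(2, 2)
out = f_lexsort(idx, prec, src)
```

Because only one winner per destination is kept before assigning, this variant does not need the surplus trash cell.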

2 Comments

Thanks! I haven't fully understood the code yet, but if it uses sorting, why doesn't it suffer from the same (wrong) left-to-right evaluation assumption as unutbu's answer?
@ASML It sorts by index, not by precedence; the purpose of the sorting is to place the clashing indices right next to each other, so they can easily be dealt with as a group (using ufunc.reduceat). As I wrote in the answer, duplicate indices that are not of maximal precedence are explicitly overwritten with -1, so upon assignment their corresponding values go to the last cell of the destination array. We then simply overallocate one cell and discard it at the end.
