(Python) Mapping between two arrays with a precedence array

Question

Given a source array

src = np.random.rand(320,240)

and an index array

idx = np.indices(src.shape).reshape(2, -1)
np.random.shuffle(idx.T)

we can map the linear index i in src to the 2-dimensional index idx[:,i] in a destination array dst via

dst = np.empty_like(src)
dst[tuple(idx)] = src.ravel()

This is discussed in "Python: Mapping between two arrays with an index array".

However, if this mapping is not 1-to-1, i.e., multiple entries in src map to the same entry in dst, according to the docs it is unspecified which of the source entries will be written to dst:

For advanced assignments, there is in general no guarantee for the iteration order. This means that if an element is set more than once, it is not possible to predict the final result.

If we are additionally given a precedence array

p = np.random.rand(*src.shape)

how can we use p to disambiguate this situation, i.e., write the entry with the highest precedence according to p?
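
To make the ambiguity concrete, here is a minimal sketch (not part of the original post) that builds a mapping which is not 1-to-1 and lists the contested destination cells. The seed, the small shape, and the names flat, contested and unreached are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, illustration only
shape = (4, 3)
n = shape[0] * shape[1]

# Random destination coordinates, so collisions are likely (unlike a shuffle).
idx = np.stack([rng.integers(0, m, size=n) for m in shape])

# Linear destination index for every source entry.
flat = np.ravel_multi_index(tuple(idx), shape)

# Destination cells hit more than once are the ones where plain
# advanced assignment gives an unpredictable result.
targets, counts = np.unique(flat, return_counts=True)
contested = targets[counts > 1]

# Destination cells that no source entry maps to.
unreached = np.setdiff1d(np.arange(n), targets)

print("contested cells:", contested)
print("unreached cells:", unreached)
```

Only the contested cells need the precedence array; for every other cell the plain assignment is already well defined.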

Paul Panzer · Accepted Answer · 2018-05-07 06:03:46Z

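Here is a method that uses a sparse matrix for the sorting step. It has a large fixed overhead but seems to scale better than argsort, presumably because the CSR-to-CSC conversion uses something like a radix sort (I have not verified this). Among the entries that share a destination index, those that do not have the maximal precedence get their index explicitly set to -1. The destination array is allocated one cell too large, and that surplus last cell serves as a trash can for the losing entries. Destination cells that no source entry maps to stay NaN.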

import numpy as np
from scipy import sparse

N = 2
idx = np.random.randint(0, N, (2, N, N))
prec = np.random.random((N, N))
src = np.arange(N*N).reshape(N, N)

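def f_sparse(idx, prec, src):
    # Flatten the 2-D destination coordinates to linear indices.
    idx = np.ravel_multi_index(idx, src.shape).ravel()
    # Row i holds a single entry: prec[i] in column idx[i]. Converting to
    # CSC groups the entries by column, i.e. by destination index.
    sp = sparse.csr_matrix((prec.ravel(), idx, np.arange(idx.size+1)),
                           (idx.size, idx.size)).tocsc()
    # One past the last non-empty column.
    top = sp.indptr.argmax()
    # Maximum precedence per column, repeated for every entry of that column.
    mx = np.repeat(np.maximum.reduceat(sp.data, sp.indptr[:top]),
                   np.diff(sp.indptr[:top+1]))
    # Redirect every entry that is not the maximum of its column to -1.
    res = idx.copy()
    res[sp.indices[sp.data != mx]] = -1

    # The extra last cell (index -1) collects the losing entries.
    dst = np.full((idx.size + 1,), np.nan)
    dst[res] = src.ravel()
    return dst[:-1].reshape(src.shape)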

print(idx)
print(prec)
print(src)
print(f_sparse(idx, prec, src))

Sample run:

[[[1 0]
  [1 0]]

 [[0 1]
  [0 0]]]
[[0.90995366 0.92095225]
 [0.60997092 0.84092015]]
[[0 1]
 [2 3]]
[[ 3.  1.]
 [ 0. nan]]
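
As a cross-check of the sample run, the same inputs can be pushed through a plain Python loop. This is only a sketch of a slow reference implementation; the name f_loop and the tie-breaking rule (the first entry encountered wins on equal precedence) are assumptions, not part of the answer.

```python
import numpy as np

def f_loop(idx, prec, src):
    """Hypothetical reference: explicit loop, highest precedence wins."""
    dst = np.full(src.shape, np.nan)
    best = np.full(src.shape, -np.inf)   # best precedence seen per destination cell
    for s in np.ndindex(*src.shape):
        d = tuple(idx[(slice(None),) + s])   # destination coordinates of source cell s
        if prec[s] > best[d]:                # strict '>': first entry wins ties
            best[d] = prec[s]
            dst[d] = src[s]
    return dst

# Inputs copied from the sample run.
idx = np.array([[[1, 0], [1, 0]],
                [[0, 1], [0, 0]]])
prec = np.array([[0.90995366, 0.92095225],
                 [0.60997092, 0.84092015]])
src = np.arange(4).reshape(2, 2)

out = f_loop(idx, prec, src)
print(out)
```

Source cells (0, 0) and (1, 0) both map to destination (1, 0); the first has the higher precedence, so its value 0 is kept, and destination (1, 1) is never written and stays NaN.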

2 Comments

ASML Over a year ago

Thanks! I haven't fully understood the code yet, but if it uses sorting, why doesn't it suffer from the same (wrong) left-to-right evaluation assumption as unutbu's answer?

Paul Panzer Over a year ago

@ASML It sorts by index, not by precedence. The purpose of the sorting is to put the clashing indices right next to each other, so that they can easily be dealt with as a group (using ufunc.reduceat). As I wrote in the answer, duplicate indices that are not of maximal precedence are explicitly overwritten with -1, so upon assignment their corresponding values go to the last cell of the destination array. We simply overallocate by one cell and discard it at the end.
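
The grouping idea from this comment can be sketched on a tiny 1-D example. Note that this sketch swaps in a stable argsort for the sparse-matrix conversion used in the answer, purely to keep the steps visible; the toy data and all variable names are assumptions.

```python
import numpy as np

# Toy data: five source entries, their flat destination indices and precedences.
flat_idx = np.array([2, 0, 2, 1, 0])
prec = np.array([0.3, 0.9, 0.7, 0.5, 0.1])
src = np.array([10., 11., 12., 13., 14.])

# Sort by destination index (not by precedence) so clashing entries are adjacent.
order = np.argsort(flat_idx, kind="stable")
sorted_idx = flat_idx[order]     # [0 0 1 2 2]
sorted_prec = prec[order]        # [0.9 0.1 0.5 0.3 0.7]

# Start position of every run of equal destination indices.
starts = np.flatnonzero(np.r_[True, sorted_idx[1:] != sorted_idx[:-1]])

# Maximum precedence per run, repeated for every member of that run.
group_max = np.maximum.reduceat(sorted_prec, starts)
counts = np.diff(np.r_[starts, sorted_idx.size])
mx = np.repeat(group_max, counts)

# Entries that are not the maximum of their run are redirected to -1.
res = flat_idx.copy()
res[order[sorted_prec != mx]] = -1

# Three real cells plus one trash cell at the end, which is discarded.
dst = np.full(3 + 1, np.nan)
dst[res] = src
print(res)        # [-1  0  2  1 -1]
print(dst[:-1])   # [11. 13. 12.]
```

Because every destination index that survives in res is unique, the final assignment no longer depends on the iteration order; only the trash cell is written more than once.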
