1

Given a 5D array, the objective is to calculate the difference between the extracted two arrays. For simplicity, say to measure the difference at the second position which can be denoted as bt and lf. The value from these two array can be extracted as follows:

arr[ep, bt, mt, bd, :] - arr[ep, lf, mt, bd, :]

Note that in the above, the index for the first (ep), third (mt) and fourth (bd) axes are the same for both of the arrays, with only the position index of second axis differing (bt and lf).

Based on this requirement, the following code was proposed, and pack under the function nested_for_loop:

import numpy as np
from joblib import Parallel, delayed
np.random.seed(0)

ub_lb_pair = np.tril_indices (5, -1)

arr = np.random.randn(3, 5, 4, 3, 2)
my_shape = arr.shape

def nested_for_loop():
    store_cval = np.full([my_shape[0], 10, my_shape[2], my_shape[3], my_shape[4]],
                         np.nan)  # preallocate
    for ep in range(0, my_shape[0]):
        for mt in range(0, my_shape[2]):
            for bd in range(0, my_shape[3]):
                for idx,(bt, lf) in enumerate(zip(ub_lb_pair[0], ub_lb_pair[1])):
                    store_cval[ep, idx, mt, bd, :] = arr[ep, bt, mt, bd, :] - \
                                                     arr[ep, lf, mt, bd, :]
    return store_cval


store_cval = nested_for_loop()

However, I would like to make the code much more compact and efficient if possible.

One approach I can think of is take advantage of the joblib parallel module, which can be achieve as below as shown under the function multi_prop.

def multi_prop(my_arr, ep):
    temp_ = np.full([10, my_shape[2], my_shape[3], my_shape[4]],
                    np.nan)
    for mt in range(0, my_shape[2]):
        for bd in range(0, my_shape[3]):
            for idx, (bt, lf) in enumerate(zip(ub_lb_pair[0], ub_lb_pair[1])):
                temp_[idx, mt, bd, :] = my_arr[ep, bt, mt, bd, :] - my_arr[ep, lf, mt, bd, :]
                x = 1
    return  temp_

dist_abs = Parallel(n_jobs=-1)(delayed(multi_prop)(arr, ep) for ep in range(0, my_shape[0]))

dist_abs = np.array(dist_abs)
bb = np.array_equal(store_cval, dist_abs)

But, I wonder whether the is a more numpythonic way to achieve the same objective.

2
  • Why a slice for just the last dimension? arr[:,bt,:,:,:] and arr[:,lf,:,:,:] or even arr[:,bt,...] and arr[:,lf,...]? Commented Feb 3, 2021 at 17:35
  • Say the last last dimension is of shape 2 and, and each of having the value [3,4] and [1,1] . So the diffrence between them are [2,3]. Does this make sense to you @hpaulj? Commented Feb 3, 2021 at 17:47

1 Answer 1

2

You don't really need any loops at all. Imagine this pair of fancy indices:

bt, lf = np.tril_indices (5, -1)

You are looking for

store_cval = arr[:, bt] - arr[:, lf]

Keep in mind that store_cval[ep, idx, mt, bd, :] = arr[ep, bt, mt, bd, :] - arr[ep, lf, mt, bd, :] is an implicit loop over the last index. They're all loops, and you don't need any of them over the hood.

A more general solution:

def diffs(arr, axis):
    a, b = np.tril_indices(arr.shape[axis], -1)
    ind1 = [slice(None) for _ in range(arr.ndim)]
    ind2 = ind1.copy()
    ind1[axis] = a
    ind2[axis] = b
    return arr[tuple(ind1)] - arr[tuple(ind2)]
Sign up to request clarification or add additional context in comments.

2 Comments

Hi @Mad, thanks for the detail explaination. For some reason, I am unable to reproduce the above result. For example, what is the value of axis. Also, my compiler provide the warning End of statement expected on the line ind1 = [slice(None) for _ in range arr.ndim]. And finally, do you mean tril to be np.tril_indices ?
@balandongiv. Sorry about the typos. Fixed now. In your case, you want to use combinations along the second axis, so axis=1

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.