3

I have a 3D array, 'b', shown below, which I want to treat as an array of 2-D arrays. I want to remove the duplicate 2-D arrays and keep only the unique ones.

>>> a = [[[1, 2], [1, 2]], [[1, 2], [4, 5]], [[1, 2], [1, 2]]]
>>> b = numpy.array(a)
>>> b
array([[[1, 2],
        [1, 2]],

       [[1, 2],
        [4, 5]],

       [[1, 2],
        [1, 2]]])

In the example above, I want to return the following, because there is one duplicate that I want to remove.

unique = array([[[1, 2],
                 [1, 2]],

                [[1, 2],
                 [4, 5]]])

How should I do this with the numpy package? Thanks.

1
  • Curious if any of the posted solutions work for you? Commented Dec 16, 2016 at 11:26

4 Answers

0

See the previous answer, "Remove duplicate rows of a numpy array": convert to an array of tuples and then apply np.unique().
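
For reference, here is a minimal sketch of a more direct route. It goes beyond the linked answer and assumes NumPy 1.13 or newer, where np.unique accepts an axis argument. The name unique_slices is illustrative. Note that the result comes back in sorted order, not in order of first appearance.

```python
import numpy as np

b = np.array([[[1, 2], [1, 2]], [[1, 2], [4, 5]], [[1, 2], [1, 2]]])

# axis=0 treats each 2-D slice b[i] as one element to deduplicate.
unique_slices = np.unique(b, axis=0)
print(unique_slices.shape)  # (2, 2, 2)
```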


0

Converting to tuples and back again is probably going to be quite expensive. Instead, you can use a generalized view:

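import numpy as np

def unique_by_first(a):
    # Flatten everything after the first axis so each slice becomes one row.
    tmp = a.reshape(a.shape[0], -1)
    # View each row as one opaque (void) element so np.unique compares whole rows.
    b = np.ascontiguousarray(tmp).view(np.dtype((np.void, tmp.dtype.itemsize * tmp.shape[1])))
    _, idx = np.unique(b, return_index=True)
    return a[idx].reshape(-1, *a.shape[1:])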

Usage:

print(unique_by_first(b))
[[[1 2]
  [1 2]]

 [[1 2]
  [4 5]]]

Effectively, a generalization of previous answers.
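
As a hedged variation on the function above, the sketch below keeps the slices in order of first appearance by sorting the indices that np.unique returns. The name unique_slices_keep_order is hypothetical, and the sketch assumes the input is a NumPy array with a plain numeric dtype.

```python
import numpy as np

def unique_slices_keep_order(arr):
    # One contiguous row per slice along the first axis.
    flat = np.ascontiguousarray(arr.reshape(arr.shape[0], -1))
    # Reinterpret each row's bytes as a single void element.
    row_dtype = np.dtype((np.void, flat.dtype.itemsize * flat.shape[1]))
    as_void = flat.view(row_dtype).ravel()
    # return_index gives the index of the first occurrence of each unique row.
    _, first_idx = np.unique(as_void, return_index=True)
    # Sorting those indices restores the original ordering of the slices.
    return arr[np.sort(first_idx)]
```

Calling it on the array from the question should return the same two slices as above. The difference only shows when the first occurrences are not already in sorted order.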


0

You can map each 2D slice (the last two axes) to a single scalar by treating its flattened values as indices into a multi-dimensional grid. Identical slices map to the same scalar and different slices map to different scalars, so applying np.unique to those scalars keeps exactly one instance of each slice. This requires the values to be non-negative integers.

Thus, an implementation would be -

idx = np.ravel_multi_index(a.reshape(a.shape[0],-1).T,a.max(0).ravel()+1)
out = a[np.sort(np.unique(idx, return_index=1)[1])]

Sample run -

In [43]: a
Out[43]: 
array([[[8, 1],
        [2, 8]],

       [[3, 8],
        [3, 4]],

       [[2, 4],
        [1, 0]],

       [[3, 0],
        [4, 8]],

       [[2, 4],
        [1, 0]],

       [[8, 1],
        [2, 8]]])

In [44]: idx = np.ravel_multi_index(a.reshape(a.shape[0],-1).T,a.max(0).ravel()+1)

In [45]: a[np.sort(np.unique(idx, return_index=1)[1])]
Out[45]: 
array([[[8, 1],
        [2, 8]],

       [[3, 8],
        [3, 4]],

       [[2, 4],
        [1, 0]],

       [[3, 0],
        [4, 8]]])

If you don't need the original order of the slices to be preserved, skip the np.sort() in the last step.
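
To make the grid-index mapping concrete, here is a small sketch applied to the array from the question. The intermediate names (flat, dims, keys) are illustrative only. It assumes non-negative integer values and that the product of dims fits in the platform integer type.

```python
import numpy as np

b = np.array([[[1, 2], [1, 2]], [[1, 2], [4, 5]], [[1, 2], [1, 2]]])

flat = b.reshape(b.shape[0], -1)   # one row of 4 values per 2-D slice
dims = b.max(0).ravel() + 1        # grid size per position: [2, 3, 5, 6]

# Each row becomes one linear index into a grid of shape `dims`.
keys = np.ravel_multi_index(tuple(flat.T), dims)
print(keys)                        # [158 179 158]

# Keep the first occurrence of each key, in order of first appearance.
_, first_idx = np.unique(keys, return_index=True)
deduped = b[np.sort(first_idx)]
```

The first and third slices share the key 158, so only one of them survives.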


0

Reshape, find the unique rows, then reshape again.

Find the unique tuples by converting to a set.

import numpy as np
a = [[[1, 2], [1, 2]], [[1, 2], [4, 5]], [[1, 2], [1, 2]]]
b = np.array(a)

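# Flatten each 2-D slice into a hashable tuple.
new_array = [tuple(row) for row in b.reshape(b.shape[0], -1)]
# A set keeps one copy of each tuple (order is not guaranteed).
uniques = list(set(new_array))

output = np.array(uniques).reshape(-1, *b.shape[1:])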
output

Out[131]: 
array([[[1, 2],
        [1, 2]],

       [[1, 2],
        [4, 5]]])
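
Since a set does not guarantee any ordering, here is a hedged sketch of an order-preserving variant. It is not part of the answer above; it relies on dict keys keeping insertion order, which holds in Python 3.7 and later. The name ordered_uniques is illustrative.

```python
import numpy as np

b = np.array([[[1, 2], [1, 2]], [[1, 2], [4, 5]], [[1, 2], [1, 2]]])

# Flatten each 2-D slice into a tuple of plain Python ints.
rows = [tuple(row) for row in b.reshape(b.shape[0], -1).tolist()]

# dict.fromkeys drops repeats while keeping first-appearance order.
ordered_uniques = list(dict.fromkeys(rows))

output = np.array(ordered_uniques).reshape(-1, *b.shape[1:])
```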

2 Comments

This solution doesn't work:

ValueError                                Traceback (most recent call last)
<ipython-input-191-754759117946> in <module>()
      6 uniques = np.unique(new_array)
      7
----> 8 output = uniques.reshape(len(uniques), 2, 2)
      9 output

ValueError: cannot reshape array of size 4 into shape (4,2,2)

Good point. Fixed now to use a set to find uniques. Thanks.
