3

I have a list of arrays (permutations) that I need to clean up. This is how my current list looks like:

>>>permutations
[array([1, 2, 6, 7]),
 array([1, 2, 6, 7]),
 array([1, 2, 6, 7]),
 array([1, 2, 3, 8]),
 array([1, 2, 3, 8]),
 array([1, 2, 3, 8]),
 array([2, 3, 4, 9]),
 array([2, 3, 4, 9]),
 array([2, 3, 4, 9]),
 array([ 3,  4,  5, 10]),
 array([ 3,  4,  5, 10]),
 array([ 3,  4,  5, 10]),
 array([ 4,  5,  6, 11]),
 array([ 4,  5,  6, 11]),
 array([ 4,  5,  6, 11]),
 array([ 1,  5,  6, 12]),
 array([ 1,  5,  6, 12]),
 array([ 1,  5,  6, 12])]

What I want:

>>>neat_perm
[(array([1, 2, 6, 7]),3), 
(array([1, 2, 3, 8]),3),
(array([2, 3, 4, 9]),3)
(array([3, 4, 5, 10]), 3),
(array([4, 5, 6, 11]), 3),
(array([1, 5, 6, 12]), 3)]

What I want to do, is to create a list of tuples, where the first element of the tuple is the array, and the second element of the tuple is the number of times it was repeated in permutations.

The straightforward, brute force method is to do the O(n^2) np.array_equal over the array to ensure there are no repeats. The problem is the algorithmic complexity. list(set(permutations)), permutations.count() does not work as np arrays are not hashable.

I would appreciate any advice you have for me, to get this more efficient, in terms of lines of code required, or time/memory complexity!

1 Answer 1

6

One solution is to use np.unique() with return_counts = True, and zip the resulting unique arrays and their counts:

from numpy import array
import numpy as np

permutations = [array([1, 2, 6, 7]),
         array([1, 2, 6, 7]),
         array([1, 2, 6, 7]),
         array([1, 2, 3, 8]),
         array([1, 2, 3, 8]),
         array([1, 2, 3, 8]),
         array([2, 3, 4, 9]),
         array([2, 3, 4, 9]),
         array([2, 3, 4, 9]),
         array([ 3,  4,  5, 10]),
         array([ 3,  4,  5, 10]),
         array([ 3,  4,  5, 10]),
         array([ 4,  5,  6, 11]),
         array([ 4,  5,  6, 11]),
         array([ 4,  5,  6, 11]),
         array([ 1,  5,  6, 12]),
         array([ 1,  5,  6, 12]),
         array([ 1,  5,  6, 12])]


>>> list(zip(*np.unique(permutations, return_counts = True, axis = 0)))

[(array([1, 2, 3, 8]), 3),
 (array([1, 2, 6, 7]), 3),
 (array([ 1,  5,  6, 12]), 3),
 (array([2, 3, 4, 9]), 3),
 (array([ 3,  4,  5, 10]), 3),
 (array([ 4,  5,  6, 11]), 3)]
Sign up to request clarification or add additional context in comments.

2 Comments

Thank you for your answer @sacuL! what is the "*" behind np.unique do?
np.unique with return_counts = True returns a tuple of the arrays and their counts, so the * operator unpacks it into two separate arrays, which can then be zipped together

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.