This should be pretty easy, but it takes an enormous amount of time for me: I have a NumPy array containing only two values (for example 0 and 255), and I want to invert it so that the values swap (0 becomes 255 and vice versa). The arrays have about 2000³ entries, so this is serious work! I first tried the numpy.invert method, which did not do exactly what I expected. So I tried to do it myself by temporarily "storing" the values and then overwriting them:

for i in range(len(array)):
    array[i][array[i] == 255] = 1    # park the 255s on a temporary value
    array[i][array[i] == 0] = 255
    array[i][array[i] == 1] = 0

This behaves as expected but takes a long time (I guess because of the for loop?). Would it be faster as a multithreaded calculation, where every thread "inverts" a smaller sub-array? Or is there a more convenient way of doing this?

2
  • 3
    Side note: I advise against using array as the name of your array: NumPy users expect array to mean numpy.array. Furthermore, when you paste code into a shell after from numpy import * (or from pylab import *), your variable shadows NumPy's array. Commented Apr 9, 2013 at 12:34
  • 1
    I used array just to clarify the input type here ;) Commented Apr 9, 2013 at 12:48

4 Answers

8

In addition to @JanneKarila's and @EOL's excellent suggestions, it's worthwhile to show a more efficient approach to using a mask to do the swap.

Using a boolean mask is more generally useful if you have a more complex comparison than simply swapping two values, but your example uses it in a sub-optimal way.

Currently, you're making multiple temporary copies of the boolean "mask" array (e.g. array[i] == blah) in your example above and performing multiple assignments. You can avoid this by making the boolean "mask" array just once and then inverting it.

If you have enough RAM for a temporary copy (of bool dtype), try something like this:

mask = (data == 255)
data[mask] = 0
data[~mask] = 255

Alternately (and equivalently) you could use numpy.where:

data = numpy.where(data == 255, 0, 255)
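One caveat worth illustrating: numpy.where builds a new array, and with plain Python integers as the two choices its dtype is picked by NumPy and need not match the dtype of data. The sketch below assumes a small uint8 array (the dtype is an assumption; the question does not state it) and casts the result back explicitly:

```python
import numpy as np

# Assumed example data; the question does not state the real dtype.
data = np.array([[0, 255, 255], [255, 0, 0]], dtype=np.uint8)

# np.where allocates a new array; with plain Python ints as the two
# choices, the result dtype need not match data.dtype.
swapped = np.where(data == 255, 0, 255)

# Cast back explicitly if the original (compact) dtype should be kept.
swapped_u8 = swapped.astype(data.dtype)
```

For arrays as large as those in the question, the extra temporary array may matter, which is one reason to prefer an in-place operation.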

If you were using a loop to avoid making a full temporary copy, and need to conserve RAM, adjust your loop to be something more like this:

for i in range(len(array)):
    mask = (array[i] == 255)
    array[i][mask] = 0
    array[i][~mask] = 255

All that having been said, either subtraction or XOR is the way to go in this case, especially if you perform the operation in-place!

1 Comment

Both of these avoid making the third copy that OP had with the ==1 mask.
4

To swap 0 and 255, you can use XOR if the data type is one of the integer types.

array ^= 255
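A minimal sketch of the XOR swap, assuming a small uint8 array as stand-in data. The dtype guard and the float counter-example are illustrative additions, not part of the original answer:

```python
import numpy as np

arr = np.array([[0, 255], [255, 0]], dtype=np.uint8)

# Guard: bitwise XOR is only defined for integer (and boolean) dtypes.
if np.issubdtype(arr.dtype, np.integer):
    arr ^= 255  # in place: 0 -> 255, 255 -> 0

# On a float array the same operation raises TypeError.
float_arr = np.array([0.0, 255.0])
try:
    float_arr ^ 255
    xor_failed = False
except TypeError:
    xor_failed = True
```

Because the update happens in place, no second array of the same size is allocated.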

2 Comments

This only works if the array contains integers, which is not obvious from the question. No downvote, though. :)
Actually, the simplest ideas are the ones that work best! Thanks a lot!
4

You can simply do:

arr_inverted = 255-arr

This converts all the elements one by one (255 gives 0 and 0 gives 255). More generally, if you only have two values a and b, the "inversion" is simply done with (a+b)-arr. This also works if the two values are not integers (like floats or complex numbers).

As Jaime pointed out, if memory is a concern, np.subtract(255, arr, out=arr) swaps the values of arr in-place.
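To make the difference concrete, here is a small sketch (assuming uint8 data) that contrasts the allocating form with the in-place form:

```python
import numpy as np

arr = np.array([0, 255, 255, 0], dtype=np.uint8)

# 255 - arr allocates a new array and leaves arr untouched.
copy = 255 - arr

# With out=arr the result is written back into arr's own buffer,
# and the ufunc returns that same array object.
result = np.subtract(255, arr, out=arr)
```

After the second call, result and arr are the same object, so no additional array of the full size is created.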

If you more generally have integers in your array, Janne Karila's XOR in-place solution has the advantage of being more concise than the difference in-place solution suggested above. It can be generalized as arr ^= (a^b), for swapping two integers a and b.
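Both generalizations can be sketched as follows. The value pairs (1.5, 4.0) and (3, 10) are arbitrary examples chosen for illustration:

```python
import numpy as np

# Two arbitrary float values: use (a + b) - arr.
a, b = 1.5, 4.0
floats = np.array([1.5, 4.0, 4.0])
floats_swapped = (a + b) - floats

# Two arbitrary integer values: XOR with (p ^ q), in place.
p, q = 3, 10
ints = np.array([3, 10, 3], dtype=np.int64)
ints ^= (p ^ q)
```

With small integer dtypes the sum a + b may not fit into the dtype, so the XOR form is probably the safer choice for integer data.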

The execution times are similar between both methods (with a 200×200×200 array of uint8 integers, through IPython):

>>> arr = np.random.choice((0, 255), (200, 200, 200)).astype('uint8')
>>> %timeit np.bitwise_xor(255, arr, out=arr)
100 loops, best of 3: 7.65 ms per loop
>>> %timeit np.subtract(255, arr, out=arr)
100 loops, best of 3: 7.69 ms per loop

If your array is of type uint8, arr_inverted = ~arr takes the same time for swapping 0 and 255 (the ~ operator inverts all the bits) but is less general, so it's not worth it (tested with a 200×200×200 array).

1 Comment

If memory is a concern, np.subtract(255, arr, out=arr) will do your method in-place.
1

"I first tried the numpy.invert method, which is not exactly what I expected."

numpy.invert is exactly what you need. Can you describe what happened? Did you use an unsigned byte for storage, or rather a signed datatype or a wider integer?

Unsigned byte + numpy.invert should do exactly what you want.

[You should also see faster performance in numpy with unsigned bytes rather than longer or signed datatypes]
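The dtype dependence can be checked directly. The sketch below compares numpy.invert on uint8 and int64 arrays. That the asker's array was not uint8 is only a guess, but it would explain the unexpected result:

```python
import numpy as np

as_uint8 = np.array([0, 255], dtype=np.uint8)
as_int64 = np.array([0, 255], dtype=np.int64)

# uint8: flipping all 8 bits maps 0 -> 255 and 255 -> 0.
inverted_u8 = np.invert(as_uint8)

# int64: flipping all 64 bits gives the two's complement values
# -1 and -256 instead of a 0/255 swap.
inverted_i64 = np.invert(as_int64)
```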

3 Comments

That's only true for uint8, not for unsigned integers in general. The OP's array probably just isn't uint8.
Hello Joe. Thanks for your comment, but I actually didn't say 'unsigned integers in general', I said unsigned byte, -> i.e. uint8. If he is only storing 0 and 255, then the OP probably is using uint8. Packing them into 1s/0s makes even more sense though.
@that_guy Welcome to Stack Overflow! You're right, but @Joe just wanted to clarify, since it likely wouldn't have been obvious to the OP (or myself, e.g.) what exactly you meant. Try to be as explicit as possible in your answers, e.g. by saying "Invert should work if you're using an unsigned byte (uint8), so perhaps you're using a signed datatype or integer." BTW, for Joe to be notified, you must precede his name with @, as in @Joe.
