
I'm trying to efficiently map an N × 1 numpy array of ints to an N × 3 numpy array of floats using a ufunc.

What I have so far:

import numpy

map = {1: (0, 0, 0), 2: (0.5, 0.5, 0.5), 3: (1, 1, 1)}
ufunc = numpy.frompyfunc(lambda x: numpy.array(map[x], numpy.float32), 1, 1)

input = numpy.array([1, 2, 3], numpy.int32)

ufunc(input) gives a 3 * 3 array with dtype object. I'd like this array but with dtype float32.

    map and input are Python builtin functions. It is best not to assign new values to these names, since it makes it hard to access the Python builtins. Commented Aug 31, 2012 at 1:16
  • The documentation of frompyfunc says that "The returned ufunc always returns PyObject arrays". Whatever the evil reason for this is, there is a fairly easy workaround: submit an output matrix of appropriate entry type as out argument. Commented Mar 14, 2016 at 16:47

4 Answers


You could use np.hstack:

import numpy as np
mapping = {1: (0, 0, 0), 2: (0.5, 0.5, 0.5), 3: (1, 1, 1)}
ufunc = np.frompyfunc(lambda x: np.array(mapping[x], np.float32), 1, 1)

data = np.array([1, 2, 3], np.int32)
result = np.hstack(ufunc(data))
print(result)
# [ 0.   0.   0.   0.5  0.5  0.5  1.   1.   1. ]
print(result.dtype)
# float32
print(result.shape)
# (9,)
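The hstack result is flat with shape (9,); if the N × 3 shape from the question is needed, a reshape restores it. A minimal sketch (note that frompyfunc itself takes no dtype argument — the float32 dtype comes from the arrays the lambda returns):

```python
import numpy as np

mapping = {1: (0, 0, 0), 2: (0.5, 0.5, 0.5), 3: (1, 1, 1)}
# frompyfunc returns an object array whose elements are float32 arrays
ufunc = np.frompyfunc(lambda x: np.array(mapping[x], np.float32), 1, 1)

data = np.array([1, 2, 3], np.int32)
# hstack concatenates the per-element float32 arrays; reshape back to N x 3
result = np.hstack(ufunc(data)).reshape(-1, 3)
```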



If your mapping is a numpy array, you can just use fancy indexing for this:

>>> valmap = numpy.array([(0, 0, 0), (0.5, 0.5, 0.5), (1, 1, 1)])
>>> input = numpy.array([1, 2, 3], numpy.int32)
>>> valmap[input-1]
array([[ 0. ,  0. ,  0. ],
       [ 0.5,  0.5,  0.5],
       [ 1. ,  1. ,  1. ]])
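The array above defaults to float64; since the question asks for float32, the lookup table can be built with an explicit dtype so the fancy-indexed result inherits it — a small variant of the snippet above:

```python
import numpy as np

# declaring the table as float32 makes the fancy-indexed result float32 too
valmap = np.array([(0, 0, 0), (0.5, 0.5, 0.5), (1, 1, 1)], dtype=np.float32)
inp = np.array([1, 2, 3], np.int32)
result = valmap[inp - 1]  # keys are 1-based, so shift to 0-based row indices
```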



You can use ndarray fancy indexing to get the same result; it should be faster than frompyfunc:

map_array = np.array([[0, 0, 0], [0, 0, 0], [0.5, 0.5, 0.5], [1, 1, 1]], dtype=np.float32)  # row 0 is padding so keys 1..3 index directly
index = np.array([1,2,3,1])
map_array[index]

Or you can just use a list comprehension:

map = {1: (0, 0, 0), 2: (0.5, 0.5, 0.5), 3: (1, 1, 1)}
np.array([map[i] for i in [1, 2, 3, 1]], dtype=np.float32)
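If the mapping starts out as a dict, the lookup table can also be built from it once, rather than padded by hand — a sketch assuming small, non-negative integer keys:

```python
import numpy as np

mapping = {1: (0, 0, 0), 2: (0.5, 0.5, 0.5), 3: (1, 1, 1)}

# allocate one row per possible key; unused rows (here row 0) stay zero
table = np.zeros((max(mapping) + 1, 3), dtype=np.float32)
for key, value in mapping.items():
    table[key] = value

index = np.array([1, 2, 3, 1])
result = table[index]  # one float32 row per index, no intermediate lists
```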

1 Comment

The input list is very large so I'm trying to avoid creating intermediate lists or arrays.

Unless I misread the docs, the output of np.frompyfunc on a scalar is indeed an object: when using an ndarray as input, you'll get an ndarray with dtype=object.

A workaround is to use the np.vectorize function:

F = np.vectorize(lambda x: mapper.get(x), 'fff')

Here, we force the dtype of F's output to be 3 floats (hence the 'fff').

>>> mapper = {1: (0, 0, 0), 2: (0.5, 0.5, 0.5), 3: (1, 1, 1)}
>>> inp = [1, 2, 3]
>>> F(inp)
(array([ 0. ,  0.5,  1. ], dtype=float32), array([ 0.,  0.5,  1.], dtype=float32), array([ 0. ,  0.5,  1. ], dtype=float32))

OK, not quite what we want: it's a tuple of three float arrays (as we gave 'fff'), the first array being equivalent to [mapper[i][0] for i in inp]. So, with a bit of manipulation:

>>> np.array(F(inp)).T
array([[ 0. ,  0. ,  0. ],
       [ 0.5,  0.5,  0.5],
       [ 1. ,  1. ,  1. ]], dtype=float32)
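On NumPy 1.12 and later, np.vectorize also accepts a signature argument, which avoids the tuple-of-arrays output and the transpose entirely — a sketch under that version assumption:

```python
import numpy as np

mapper = {1: (0, 0, 0), 2: (0.5, 0.5, 0.5), 3: (1, 1, 1)}

# signature='()->(n)' declares a gufunc mapping each scalar to a length-n vector
F = np.vectorize(lambda x: np.asarray(mapper[x], np.float32), signature='()->(n)')
result = F([1, 2, 3])  # one row per input element
```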

