efficient numpy array creation

Question

Given x, I want to produce x, log(x) as a numpy array whereby x has shape s, the result has shape (*s, 2). What's the neatest way to do this? x may just be a float, in which case I want a result with shape (2,).

An ugly way to do this is:

import numpy as np

x = np.asarray(x)
result = np.empty((*x.shape, 2))
result[..., 0] = x
result[..., 1] = np.log(x)

unutbu · Accepted Answer · 2016-03-29 14:54:26Z

5

It's important to separate aesthetics from performance. Sometimes ugly code is fast. In fact, that's the case here. Although creating an empty array and then assigning values to slices may not look beautiful, it is fast.

import numpy as np
import timeit 
import itertools as IT
import pandas as pd

def using_empty(x):
    x = np.asarray(x)
    result = np.empty(x.shape + (2,))
    result[..., 0] = x
    result[..., 1] = np.log(x)
    return result

def using_concat(x):
    x = np.asarray(x)
    return np.concatenate([x, np.log(x)], axis=-1).reshape(x.shape+(2,), order='F')

def using_stack(x):
    x = np.asarray(x)
    return np.stack([x, np.log(x)], axis=x.ndim)

def using_ufunc(x):
    return np.array([x, np.log(x)])
using_ufunc = np.vectorize(using_ufunc, otypes=[np.ndarray])

tests = [np.arange(600),
         np.arange(600).reshape(20,30),
         np.arange(960).reshape(8,15,8)]

# check that all implementations return the same result
for x in tests:
    assert np.allclose(using_empty(x), using_concat(x))
    assert np.allclose(using_empty(x), using_stack(x))


timing = []
funcs = ['using_empty', 'using_concat', 'using_stack', 'using_ufunc']
for test, func in IT.product(tests, funcs):
    timing.append(timeit.timeit(
        '{}(test)'.format(func), 
        setup='from __main__ import test, {}'.format(func), number=1000))

timing = pd.DataFrame(np.array(timing).reshape(-1, len(funcs)), columns=funcs)
print(timing)

yields, the following timeit results on my machine:

   using_empty  using_concat  using_stack  using_ufunc
0     0.024754      0.025182     0.030244     2.414580
1     0.025766      0.027692     0.031970     2.408344
2     0.037502      0.039644     0.044032     3.907487

So using_empty is the fastest (of the options tested applied to tests).

Note that np.stack does exactly what you want, so

np.stack([x, np.log(x)], axis=x.ndim)

looks reasonably pretty, but it is also the slowest of the three options tested.

Note that along with being much slower, using_ufunc returns an array of object dtype:

In [236]: x = np.arange(6)

In [237]: using_ufunc(x)
Out[237]: 
array([array([  0., -inf]), array([ 1.,  0.]),
       array([ 2.        ,  0.69314718]),
       array([ 3.        ,  1.09861229]),
       array([ 4.        ,  1.38629436]), array([ 5.        ,  1.60943791])], dtype=object)

which is not the same as the desired result:

In [240]: using_empty(x)
Out[240]: 
array([[ 0.        ,        -inf],
       [ 1.        ,  0.        ],
       [ 2.        ,  0.69314718],
       [ 3.        ,  1.09861229],
       [ 4.        ,  1.38629436],
       [ 5.        ,  1.60943791]])

In [238]: using_ufunc(x).shape
Out[238]: (6,)

In [239]: using_empty(x).shape
Out[239]: (6, 2)

edited Mar 29, 2016 at 14:54

answered Mar 28, 2016 at 2:34

unutbu

886k197 gold badges1.9k silver badges1.7k bronze badges

Sign up to request clarification or add additional context in comments.

9 Comments

Neil G Over a year ago

I like stack best actually. I don't usually optimize my Python code for speed :)

Neil G Over a year ago

Would you mind adding the @ufunc decorator as another way to do this?

unutbu Over a year ago

I'm not familiar with the @ufunc decorator. Are you referring to this?

Neil G Over a year ago

Sorry, I meant this: docs.scipy.org/doc/numpy-1.10.1/reference/generated/…

unutbu Over a year ago

Hm, I'm not sure how that would work. My naive attempt, def using_ufunc(x): return x, np.log(x); using_ufunc = np.vectorize(using_ufunc); using_ufunc(np.arange(6)) returns a tuple of arrays, not the desired result. Do you see how it can be done with np.vectorize?

|

Collectives™ on Stack Overflow

efficient numpy array creation

1 Answer 1

9 Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

9 Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related