Creating a numpy array of 3D coordinates from three 1D arrays

Question

Suppose I have three arbitrary 1D arrays, for example:

x_p = np.array((1.0, 2.0, 3.0, 4.0, 5.0))
y_p = np.array((2.0, 3.0, 4.0))
z_p = np.array((8.0, 9.0))

These three arrays represent sampling intervals in a 3D grid, and I want to construct a 1D array of three-dimensional vectors for all intersections, something like

points = np.array([[1.0, 2.0, 8.0],
                   [1.0, 2.0, 9.0],
                   [1.0, 3.0, 8.0],
                   ...
                   [5.0, 4.0, 9.0]])

Order doesn't actually matter for this. The obvious way to generate them:

npoints = len(x_p) * len(y_p) * len(z_p)
points = np.zeros((npoints, 3))
i = 0
for x in x_p:
    for y in y_p:
        for z in z_p:
            points[i, :] = (x, y, z)
            i += 1

So the question is... is there a faster way? I have looked but not found (possibly just failed to find the right Google keywords).

I am currently using this:

npoints = len(x_p) * len(y_p) * len(z_p)
points = np.zeros((npoints, 3))
i = 0
nz = len(z_p)
for x in x_p:
    for y in y_p:
        points[i:i+nz, 0] = x
        points[i:i+nz, 1] = y
        points[i:i+nz, 2] = z_p
        i += nz

but I feel like I am missing some clever fancy Numpy way?

This question has been marked as a duplicate; it is a similar question, but (obviously I am biased) I think my question is a simpler phrasing of a more general problem. I also think the answer on this question is better; use of meshgrid seems to be the simplest, fastest solution. — Andrew McLeod
– Andrew McLeod, Commented Aug 28, 2013 at 15:22
Also, the extension from 2D to 3D is non-obvious in my opinion. Seeing that the answers have similar structures implies that the straight-forward extensions are a good start, but, a priori, it wasn't clear that these would work. — tom10
– tom10, Commented May 11, 2015 at 18:16

Daniel · Accepted Answer · 2013-08-15 15:53:26Z

23

To use numpy mesh grid on the above example the following will work:

np.vstack(np.meshgrid(x_p,y_p,z_p)).reshape(3,-1).T

Numpy meshgrid for grids of more then two dimensions require numpy 1.7. To circumvent this and pulling the relevant data from the source code.

def ndmesh(*xi,**kwargs):
    if len(xi) < 2:
        msg = 'meshgrid() takes 2 or more arguments (%d given)' % int(len(xi) > 0)
        raise ValueError(msg)

    args = np.atleast_1d(*xi)
    ndim = len(args)
    copy_ = kwargs.get('copy', True)

    s0 = (1,) * ndim
    output = [x.reshape(s0[:i] + (-1,) + s0[i + 1::]) for i, x in enumerate(args)]

    shape = [x.size for x in output]

    # Return the full N-D matrix (not only the 1-D vector)
    if copy_:
        mult_fact = np.ones(shape, dtype=int)
        return [x * mult_fact for x in output]
    else:
        return np.broadcast_arrays(*output)

Checking the result:

print np.vstack((ndmesh(x_p,y_p,z_p))).reshape(3,-1).T

[[ 1.  2.  8.]
 [ 1.  2.  9.]
 [ 1.  3.  8.]
 ....
 [ 5.  3.  9.]
 [ 5.  4.  8.]
 [ 5.  4.  9.]]

For the above example:

%timeit sol2()
10000 loops, best of 3: 56.1 us per loop

%timeit np.vstack((ndmesh(x_p,y_p,z_p))).reshape(3,-1).T
10000 loops, best of 3: 55.1 us per loop

For when each dimension is 100:

%timeit sol2()
1 loops, best of 3: 655 ms per loop
In [10]:

%timeit points = np.vstack((ndmesh(x_p,y_p,z_p))).reshape(3,-1).T
10 loops, best of 3: 21.8 ms per loop

Depending on what you want to do with the data, you can return a view:

%timeit np.vstack((ndmesh(x_p,y_p,z_p,copy=False))).reshape(3,-1).T
100 loops, best of 3: 8.16 ms per loop

answered Aug 15, 2013 at 15:53

Daniel

19.6k7 gold badges64 silver badges74 bronze badges

Sign up to request clarification or add additional context in comments.

2 Comments

Andrew McLeod Over a year ago

Fortunately I have Numpy 1.7. For my particular case where the dimensions will probably be at least 128, and can be more (large data sets), the mgrid/meshgrid solution seems to work best and is slightly cleaner than my original two-for-loop solution.

Daniel Over a year ago

As a note, if you are not using sparse arrays ndmesh is equivalent to np.meshgrid so all the timings and keywords will be identical.

Joe Kington · Accepted Answer · 2013-08-15 13:24:58Z

7

For your specific example, mgrid is quite useful.:

In [1]: import numpy as np
In [2]: points = np.mgrid[1:6, 2:5, 8:10]
In [3]: points.reshape(3, -1).T
Out[3]:
array([[1, 2, 8],
       [1, 2, 9],
       [1, 3, 8],
       [1, 3, 9],
       [1, 4, 8],
       [1, 4, 9],
       [2, 2, 8],
       [2, 2, 9],
       [2, 3, 8],
       [2, 3, 9],
       [2, 4, 8],
       [2, 4, 9],
       [3, 2, 8],
       [3, 2, 9],
       [3, 3, 8],
       [3, 3, 9],
       [3, 4, 8],
       [3, 4, 9],
       [4, 2, 8],
       [4, 2, 9],
       [4, 3, 8],
       [4, 3, 9],
       [4, 4, 8],
       [4, 4, 9],
       [5, 2, 8],
       [5, 2, 9],
       [5, 3, 8],
       [5, 3, 9],
       [5, 4, 8],
       [5, 4, 9]])

More generally, if you have specific arrays that you want to do this with, you'd use meshgrid instead of mgrid. However, you'll need numpy 1.7 or later for it to work in more than two dimensions.

edited Aug 15, 2013 at 13:24

answered Aug 15, 2013 at 13:18

Joe Kington

287k73 gold badges621 silver badges474 bronze badges

2 Comments

Daniel Over a year ago

Specifically with meshgrid in numpy 1.7: np.vstack(np.meshgrid(x_p,y_p,z_p)).reshape(3,-1).T

Andrew McLeod Over a year ago

This works well for me, even with large grids. I had intended to write the question to not assume that the spacings were even; actually in my case they are but I think for the sake of generality the meshgrid solution is the best answer for this question.

Saullo G. P. Castro · Accepted Answer · 2013-08-15 14:46:19Z

4

You can use itertools.product:

def sol1():
    points = np.array( list(product(x_p, y_p, z_p)) )

Another solution using iterators and np.take showed to be about 3X faster for your case:

from itertools import izip, product

def sol2():
    points = np.empty((len(x_p)*len(y_p)*len(z_p),3))

    xi,yi,zi = izip(*product( xrange(len(x_p)),
                              xrange(len(y_p)),
                              xrange(len(z_p))  ))

    points[:,0] = np.take(x_p,xi)
    points[:,1] = np.take(y_p,yi)
    points[:,2] = np.take(z_p,zi)
    return points

Timing results:

In [3]: timeit sol1()
10000 loops, best of 3: 126 µs per loop

In [4]: timeit sol2()
10000 loops, best of 3: 42.9 µs per loop

In [6]: timeit ops()
10000 loops, best of 3: 59 µs per loop

In [11]: timeit joekingtons() # with your permission Joe...
10000 loops, best of 3: 56.2 µs per loop

So, only my second solution and Joe Kington's solution would give you some performance gain...

edited Aug 15, 2013 at 14:46

answered Aug 15, 2013 at 13:06

Saullo G. P. Castro

59.4k28 gold badges191 silver badges244 bronze badges

5 Comments

chthonicdaemon Over a year ago

According to my timings, your second solution runs in 66% of the time of the OP's solution with the preallocation, which runs almost as fast as the mgrid solution. Perhaps you should show timings? You can also shave of a little time by using np.empty instead of np.zeros

Daniel Over a year ago

This is fine for small systems, however it scales very poorly. For such systems its likely much better to take the (14 microsecond!) performance penalty for cleaner code. Nd meshgrid is consistently faster across all sizes and can be easily used by copy/pasting the numpy source code.

Saullo G. P. Castro Over a year ago

@Ophion The problem would be if in x_p, y_p or z_p you had some non-integer values irregularly spaced like np.array([0.1, 0.2, 0.6, 1., 2., 3.]) and so forth...

Daniel Over a year ago

@Saullo Castro meshgrid takes this into account.

Andrew McLeod Over a year ago

When I tried this (albeit not in a particularly good way, only doing a few repeats) with timeit with three arrays of 256 each, it took a bit longer than a minute; faster than the naive for loop method (over two minutes) but slower than the faster for loop method which only looped over x_p and y_p; that took about 20-30 seconds.

CT Zhu · Accepted Answer · 2013-08-15 17:01:12Z

1

For those who had to stay with numpy <1.7.x, here is a simple two-liner solution:

g_p=np.zeros((x_p.size, y_p.size, z_p.size))
array_you_want=array(zip(*[item.flatten() for item in \
                                 [g_p+x_p[...,np.newaxis,np.newaxis],\
                                  g_p+y_p[...,np.newaxis],\
                                  g_p+z_p]]))

Very easy to expand to even higher dimenision as well. Cheers!

answered Aug 15, 2013 at 17:01

CT Zhu

54.6k18 gold badges125 silver badges136 bronze badges

Collectives™ on Stack Overflow

Creating a numpy array of 3D coordinates from three 1D arrays

4 Answers 4

2 Comments

2 Comments

5 Comments

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

4 Answers 4

2 Comments

2 Comments

5 Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related