Creating a numpy array in a particular format

Question

This is trainData

[[-214. -153.  -58. ...,   36.  191.  -37.]
[-139.  -73.   -1. ...,   11.   76.  -14.]
[ -76.  -49. -307. ...,   41.  228.  -41.]
..., 
[ -32.  -49.   49. ...,  -26.  133.  -32.]
[-124.  -79.  -37. ...,   39.  298.   -3.]
[-135. -186.  -70. ...,  -12.  790.  -10.]]

This is target

[[0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [0], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1]]

I want to create a numpy array using trainData and target which looks like this

[
 [[-214. -153.  -58. ...,   36.  191.  -37.], [0]]
 [[-139.  -73.   -1. ...,   11.   76.  -14.], [0]]
 [[ -76.  -49. -307. ...,   41.  228.  -41.], [0]]
 ..., 
 [[ -32.  -49.   49. ...,  -26.  133.  -32.], [1]]
 [[-124.  -79.  -37. ...,   39.  298.   -3.], [1]]
 [[-135. -186.  -70. ...,  -12.  790.  -10.], [1]]
]

hpaulj · Accepted Answer · 2015-03-01 07:49:00Z

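Mixing arrays with different shapes requires some compromises. A regular numpy array has the same length along each dimension, so one row cannot hold a long feature vector and a 1-element target as two separate sub-arrays.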

Sample data:

In [343]: td = np.arange(20.).reshape(5,4)
In [344]: target=np.arange(5).reshape(5,1)*10

You could combine them into one 2d array, by concatenation, adding target as an extra column to td:

In [345]: np.hstack([td,target])
Out[345]: 
array([[  0.,   1.,   2.,   3.,   0.],
       [  4.,   5.,   6.,   7.,  10.],
       [  8.,   9.,  10.,  11.,  20.],
       [ 12.,  13.,  14.,  15.,  30.],
       [ 16.,  17.,  18.,  19.,  40.]])
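
As a small sketch of what this concatenation implies (same sample data as above; the names combined, features and labels are only illustrative), note that the integer target is upcast to float so the whole array shares one dtype, and that the two parts can be sliced back out by column:

```python
import numpy as np

td = np.arange(20.).reshape(5, 4)
target = np.arange(5).reshape(5, 1) * 10

# One 2d array; the integer target column is upcast to float64.
combined = np.hstack([td, target])
print(combined.shape, combined.dtype)

# Split it again: every column but the last, and the last column kept 2d.
features = combined[:, :-1]
labels = combined[:, -1:].astype(int)
```

Slicing with -1: rather than -1 keeps the label column two-dimensional, matching the shape of the original target.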

Something that will appear closer to your goal is a structured array. It is easiest to make an empty one with the right shape and dtype:

In [346]: combine=np.empty((5,),dtype=[('td','f',(4,)),('target','i',(1,))])

Then fill it field by field:

In [347]: combine['td']=td
In [348]: combine['target']=target

The result:

In [349]: combine
Out[349]: 
array([([0.0, 1.0, 2.0, 3.0], [0]), ([4.0, 5.0, 6.0, 7.0], [10]),
       ([8.0, 9.0, 10.0, 11.0], [20]), ([12.0, 13.0, 14.0, 15.0], [30]),
       ([16.0, 17.0, 18.0, 19.0], [40])], 
      dtype=[('td', '<f4', (4,)), ('target', '<i4', (1,))])

Note, though, that each 'row' is displayed as ([...], [...]), i.e. a tuple holding the two fields, not a list of two lists.

The original data can be 'recovered' with combine['td'] and combine['target'], and a single element (one record) of the array with combine[0].
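
A self-contained version of the steps above, as a sketch (the explicit 'f4' and 'i4' codes are my spelling of the 'f' and 'i' used in the session; first is an illustrative name):

```python
import numpy as np

td = np.arange(20.).reshape(5, 4)
target = np.arange(5).reshape(5, 1) * 10

# One record = a 4-element float field plus a 1-element int field.
dt = np.dtype([('td', 'f4', (4,)), ('target', 'i4', (1,))])
combine = np.empty(5, dtype=dt)
combine['td'] = td
combine['target'] = target

# Field access gives back regular 2d arrays.
print(combine['td'].shape, combine['target'].shape)

# Integer indexing gives one record, whose fields are indexed by name.
first = combine[0]
print(first['td'], first['target'])
```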

But combine doesn't do a whole lot for you. You can do math with the fields, like combine['td']*combine['target'], but you could do that with td*target directly. You can't do combine[:2] *= 2, i.e. act on both fields at once.
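
If you do need to apply the same operation to every field, one possible workaround (a sketch, not something the structured array offers directly) is to loop over the field names, since each field access returns a view onto the same memory:

```python
import numpy as np

td = np.arange(20.).reshape(5, 4)
target = np.arange(5).reshape(5, 1) * 10

combine = np.empty(5, dtype=[('td', 'f4', (4,)), ('target', 'i4', (1,))])
combine['td'] = td
combine['target'] = target

# Double the first two records, one field at a time.
for name in combine.dtype.names:
    field = combine[name]   # a view, so the change lands in combine
    field[:2] *= 2

print(combine[:3])
```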

lemonhead · Answer · 2015-03-01 07:13 (edited by Community 2017-05-23 12:28:32Z)

I believe regular numpy arrays enforce a uniform shape along every dimension, meaning you can't have some entries longer than others. See this related answer on the matter. If you just want to join your two arrays, though, you can do so like this:

import numpy
trainingData = numpy.array([[-214., -153, -58., 5],[-139, -73, -1, 2]])
target = numpy.array([[0],[1]])
combined = numpy.concatenate((trainingData,target), axis=1)

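If you really want to keep the target and training data in separate subarrays of different lengths and have to use numpy, you could use a numpy object array. On Python 3, zip returns an iterator, so wrap it in list, and pass dtype=object so that numpy accepts the unequal lengths, e.g.

combined = numpy.array(list(zip(trainingData, target)), dtype=object)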

But really, at that point you'd probably be better off using a Python list or some custom container object.
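
A cautious way to build such an object array, sketched here under the assumption that you want one (features, label) pair per row, is to preallocate it and fill it element by element, which avoids relying on how numpy.array interprets nested sequences. The plain-list alternative is shown last; pairs is an illustrative name.

```python
import numpy as np

trainingData = np.array([[-214., -153., -58., 5.], [-139., -73., -1., 2.]])
target = np.array([[0], [1]])

# Object array: each cell holds a reference to an independent ndarray.
combined = np.empty((len(trainingData), 2), dtype=object)
for i, (row, label) in enumerate(zip(trainingData, target)):
    combined[i, 0] = row
    combined[i, 1] = label

print(combined[0, 0], combined[0, 1])

# Plain Python alternative: a list of (features, label) tuples.
pairs = list(zip(trainingData, target))
```

Either way, vectorized math across the whole container is no longer available; you work on the individual arrays stored inside it.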

answered Mar 1, 2015 at 7:13

lemonhead
