numpy structured array no shape information?

Question

Why is the shape of a single-row numpy structured array empty ('()'), and what's the common "workaround"?

import io
import numpy as np

fileWrapper = io.StringIO("-0.09469 0.032987 0.061009 0.0588")

dt = np.dtype([('min', (float, 2)), ('max', (float, 2))])
a = np.loadtxt(fileWrapper, dtype=dt, delimiter=" ", comments="#")
print(np.shape(a), a)

Output: () ([-0.09469, 0.032987], [0.061009, 0.0588])

This somewhat inconsistent behaviour complicates the calling code, i.e. it has to distinguish between single-row arrays and bigger ones.
– Gabriel, Commented Apr 14, 2015 at 13:58

Warren Weckesser · Accepted Answer · 2015-04-14 14:32:45Z

Short answer: Add the argument ndmin=1 to the loadtxt call.

Long answer:

The shape is () for the same reason that reading a single floating point value with loadtxt returns an array with shape ():

In [43]: a = np.loadtxt(['1.0'])

In [44]: a.shape
Out[44]: ()

In [45]: a
Out[45]: array(1.0)

By default, loadtxt uses the squeeze function to eliminate trivial (i.e. length 1) dimensions in the array that it returns. In my example above, it means the result is a "scalar array"--an array with shape ().
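The effect of squeezing can be reproduced without loadtxt. The snippet below is only a sketch of the idea: it calls np.squeeze directly on a one-element array (the variable names are made up for illustration) and then uses np.atleast_1d to get the dimension back.

```python
import numpy as np

# A one-dimensional array holding a single value.
one_element = np.array([1.0])
print(one_element.shape)   # (1,)

# squeeze drops every axis of length 1, leaving a "scalar array".
squeezed = np.squeeze(one_element)
print(squeezed.shape)      # ()

# atleast_1d restores a one-dimensional view of the same value.
restored = np.atleast_1d(squeezed)
print(restored.shape)      # (1,)
```

np.atleast_1d is therefore an alternative fix-up when the array comes from code you cannot change.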

When you give loadtxt a structured dtype, the structure defines the fields of a single element of the array. It is common to think of these fields as "columns", but structured arrays will make more sense if you consistently think of them as what they are: arrays of structures with fields. If your data file had two lines, the array returned by loadtxt would be an array with shape (2,). That is, it is a one-dimensional array with length 2. Each element of the array is a structure whose fields are defined by the given dtype. When the input file has only a single line, the array would have shape (1,), but loadtxt squeezes that to be a scalar array with shape ().
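As a small illustration of the "array of structures" view, here is a sketch that reads two lines with a simple two-field dtype (the field names x and y and the input values are chosen just for this example):

```python
import numpy as np

# Each element of the array is one structure with fields 'x' and 'y'.
dt = np.dtype([('x', np.float64), ('y', np.float64)])

# Two input lines give a one-dimensional array with two elements.
rows = np.loadtxt(['1.0 2.0', '3.0 4.0'], dtype=dt)
print(rows.shape)      # (2,)

# Indexing by position selects one structure ...
first = rows[0]
print(first['y'])

# ... while indexing by field name selects that field from every element.
print(rows['x'])
```

Note that the shape counts structures (lines), not fields.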

To force loadtxt to always return a one-dimensional array, even when there is a single line of data, use the argument ndmin=1.

For example, here's a dtype for a structured array:

In [58]: dt = np.dtype([('x', np.float64), ('y', np.float64)])

Read one line using that dtype. The result has shape ():

In [59]: a = np.loadtxt(['1.0 2.0'], dtype=dt)

In [60]: a.shape
Out[60]: ()

Use ndmin=1 to ensure that even an input with a single line results in a one-dimensional array:

In [61]: a = np.loadtxt(['1.0 2.0'], dtype=dt, ndmin=1)

In [62]: a.shape
Out[62]: (1,)

In [63]: a
Out[63]: 
array([(1.0, 2.0)], 
      dtype=[('x', '<f8'), ('y', '<f8')])
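With ndmin=1 in place, code that consumes the result no longer needs a special case for single-line input. The following is a minimal sketch; load_points is a hypothetical helper name, not part of NumPy.

```python
import numpy as np

dt = np.dtype([('x', np.float64), ('y', np.float64)])

def load_points(lines):
    """Hypothetical helper: always returns a 1-D structured array."""
    return np.loadtxt(lines, dtype=dt, ndmin=1)

# The same loop handles one line and several lines.
for lines in (['1.0 2.0'], ['1.0 2.0', '3.0 4.0']):
    points = load_points(lines)
    print(points.shape)
    for p in points:
        print(p['x'], p['y'])
```

Without ndmin=1, iterating over the single-line result would fail, because a shape-() array cannot be iterated.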
