Is it possible to have a 3-D record array in numpy? (Maybe it is not possible, or there is simply an easier way to do this. I am open to other options.)

Assume I want an array that holds data for 3 variables (say temp, precip, humidity), where each variable's data is a 2-D array of 2 years (rows) by 6 months (columns). I could create that like this:

>>> import numpy as np

>>> d = np.array(np.arange(3*2*6).reshape(3,2,6))
>>> d

#
# comments added for explanation...
#        jan   feb   mar   apr   may   Jun    

array([[[ 0,    1,    2,    3,    4,    5],   # yr1  temp
        [ 6,    7,    8,    9,   10,   11]],  # yr2  temp

       [[12,   13,   14,   15,   16,   17],   # yr1  precip
        [18,   19,   20,   21,   22,   23]],  # yr2  precip

       [[24,   25,   26,   27,   28,   29],   # yr1  humidity
        [30,   31,   32,   33,   34,   35]]]) # yr2  humidity

I'd like to be able to type:

>>> d['temp']

and get this (the first "page" of the data):

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11]])

or:

>>> d['Jan']   # assume months are Jan-June

and get this:

array([[ 0,  6],
       [12, 18],
       [24, 30]])

I have been through http://www.scipy.org/RecordArrays a number of times, but I don't see how to set up what I am after.

1 Answer

Actually, you can do something similar to this with structured arrays, but it's generally more trouble than it's worth.

What you want is basically labeled axes.

Pandas (which is built on top of numpy) provides what you want, and is a better choice if you want this type of indexing. There's also Larry (for labeled array), but it's largely been superseded by Pandas.

Also, you should be looking at the numpy documentation for structured arrays for info on this, rather than an FAQ. The numpy documentation has considerably more information. http://docs.scipy.org/doc/numpy/user/basics.rec.html

If you do want to take a pure-numpy route, note that structured arrays can contain multidimensional arrays. (Note the shape argument when specifying a dtype.) This will rapidly get more complex than it's worth, though.
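As a rough illustration of the pure-numpy route, here is a minimal sketch that gives each field a (2, 6) sub-array shape. The names `raw`, `names`, `rec`, and `jan` are made up for this example. Only the variable axis gets labels this way; months and years still have to be selected by position.

```python
import numpy as np

raw = np.arange(3 * 2 * 6).reshape(3, 2, 6)
names = ['temp', 'precip', 'humidity']  # illustrative field names

# Each field is itself a (2, 6) sub-array: 2 years by 6 months.
dtype = np.dtype([(name, raw.dtype, (2, 6)) for name in names])

# A 0-d structured array: one record holding three 2-D fields.
rec = np.zeros((), dtype=dtype)
for name, page in zip(names, raw):
    rec[name][...] = page  # fill the field in place

print(rec['temp'])  # the (2, 6) "page" for temp

# There is no label for months, so "jan" is just column 0 of every field.
jan = np.array([rec[name][:, 0] for name in names])
print(jan)
```

Having to rebuild the month slice by hand is the kind of extra work that makes a labeled-axis library more attractive here.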

In pandas terminology, what you want is a Panel. You should probably get familiar with DataFrames first, though.

Here's how you'd do it with Pandas:

import numpy as np
import pandas

d = np.array(np.arange(3*2*6).reshape(3,2,6))

dat = pandas.Panel(d, items=['temp', 'precip', 'humidity'], 
                      major_axis=['yr1', 'yr2'], 
                      minor_axis=['jan', 'feb', 'mar', 'apr', 'may', 'jun'])

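print(dat['temp'])          # all years and months for temp
print(dat.major_xs('yr1'))  # all variables and months for yr1
print(dat.minor_xs('may'))  # all variables and years for may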
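
Note that pandas.Panel was deprecated in pandas 0.20 and removed in 0.25, so the snippet above only runs on older releases. On current pandas, one possible substitute (a sketch, not the only option) is a DataFrame with a MultiIndex on the rows. The names `variables`, `years`, `months`, and `df` below are chosen for illustration.

```python
import numpy as np
import pandas as pd

d = np.arange(3 * 2 * 6).reshape(3, 2, 6)

variables = ['temp', 'precip', 'humidity']
years = ['yr1', 'yr2']
months = ['jan', 'feb', 'mar', 'apr', 'may', 'jun']

# Rows are (variable, year) pairs; columns are months.
index = pd.MultiIndex.from_product([variables, years],
                                   names=['variable', 'year'])
df = pd.DataFrame(d.reshape(len(variables) * len(years), len(months)),
                  index=index, columns=months)

temp = df.loc['temp']                 # years x months for temp
yr1 = df.xs('yr1', level='year')      # variables x months for yr1
# Variables x years for jan; reindex restores the original variable order.
jan = df['jan'].unstack('year').reindex(variables)

print(temp)
print(yr1)
print(jan)
```

This keeps everything in a single 2-D table, at the cost of reshaping whenever a different pair of axes is wanted.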

1 Comment

hmm, ok, that partially validates my confusion. Pandas looks ideal, thanks!
