I'm looking for some sort of paradigm or implementation to efficiently handle many sets of coupled N-dim arrays (ndarrays). Specifically, I'm hoping for an implementation that allows me to slice an array of entire objects (e.g. someObjs = objects[100:200]), or individual attributes of those objects (e.g. somePars1 = objects.par1[100:200]) --- at the same time.

To expand on the above example, I could construct the following subsets in two ways:

def subset1(objects, beg, end):
    pars1 = [ obj.par1 for obj in objects[beg:end] ]
    pars2 = [ obj.par2 for obj in objects[beg:end] ]
    return pars1, pars2

def subset2(objects, beg, end):
    pars1 = objects.par1[beg:end]
    pars2 = objects.par2[beg:end]
    return pars1, pars2

And they would be identical.


Edit:

One approach would be to override the __getitem__ (etc) methods, something like,

class Objects(object):
    def __init__(self, p1, p2):
        self.par1 = p1
        self.par2 = p2
    ...
    def __getitem__(self, key):
        return Objects(self.par1[key], self.par2[key])

But this seems horribly inefficient, and it duplicates the subset. Perhaps there's some way to return a view of the subset instead?

  • I don't quite understand the question. Are you trying to find a language that allows you to place the index in either position? This is antithetical to the structure of most languages. If you have a list of objects, the subscript expression must be applied directly to the list, not to the element. Your code is correct either way, depending on how you design your objects. However, you cannot have this dual nature in a language that honours type characteristics. Commented Sep 28, 2015 at 21:24
  • @Prune, I don't think that is the case. See the example I added. Achieving this functionality is certainly possible --- but I can't think of any way of doing it effectively/efficiently. Commented Sep 28, 2015 at 21:33
  • I understand now; thanks. Do keep in mind that this is inherently inefficient: you're accepting the natural structure, but then overlaying an artificial structure on that. Every reference to the artificial structure -- the view that you want -- requires dismantling and rearranging elements of the "correct" organization. However, a view pattern would likely be the way to go for maintainability. I don't know whether this tells you anything new; I'm likely just reinforcing what you feared. Commented Sep 28, 2015 at 21:41
  • I think this can be a good approach, actually. One thing to keep in mind is that slicing of numpy arrays always returns a view (as long as you use a slice and not "fancy indexing" with a list/tuple). Therefore, your example actually doesn't duplicate that much memory at all, provided you enforce contiguous slices. However, the usual caveats with sharing views apply: If you modify one, you're modifying all and you need to be aware of what makes copies and what doesn't when using things. Commented Sep 28, 2015 at 21:56
  • The numpy documentation goes over it in quite a bit of detail: docs.scipy.org/doc/numpy/reference/… Not to plug one of my own answers too much, but you might find this useful: stackoverflow.com/questions/4370745/view-onto-a-numpy-array/… It specifically deals with how to avoid copies and ensure views. I'm too short on time at the moment for a full answer, so if someone wants to condense things and write one up, please feel free! Commented Sep 28, 2015 at 22:05

1 Answer

Object array and object with array approach

A sample object class

In [56]: class MyObj(object):
   ....:     def __init__(self, par1,par2):
   ....:         self.par1=par1
   ....:         self.par2=par2

An array of those objects - little more than a list with an array wrapper

In [57]: objects=np.array([MyObj(1,2),MyObj(3,4),MyObj(2,3),MyObj(10,11)])
In [58]: objects
Out[58]: 
array([<__main__.MyObj object at 0xb31b196c>,
       <__main__.MyObj object at 0xb31b116c>,
       <__main__.MyObj object at 0xb31b13cc>,
       <__main__.MyObj object at 0xb31b130c>], dtype=object)

subset1 type of selection:

In [59]: [obj.par1 for obj in objects[1:-1]]
Out[59]: [3, 2]

Another class that can contain such an array. This is simpler than defining an array subclass:

In [60]: class MyObjs(object):
   ....:     def __init__(self,anArray):
   ....:         self.data=anArray
   ....:     def par1(self):
   ....:         return [obj.par1 for obj in self.data]

In [61]: Obs = MyObjs(objects)
In [62]: Obs.par1()
Out[62]: [1, 3, 2, 10]

subset2 type of selection:

In [63]: Obs.par1()[1:-1]
Out[63]: [3, 2]

For now par1 is a method, but it could be made a property, permitting the Obs.par1[1:-1] syntax.

If par1 returned an array instead of a list, indexing would be more powerful.
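As a rough sketch of those two suggestions together (par1 as a property that returns an array), something like the following should work. The class names follow the session above; the variables middle and masked are only illustrative, and the property builds a fresh copy on every access, so this is about syntax rather than speed.

```python
import numpy as np

class MyObj(object):
    def __init__(self, par1, par2):
        self.par1 = par1
        self.par2 = par2

class MyObjs(object):
    def __init__(self, anArray):
        self.data = anArray

    @property
    def par1(self):
        # Gather the attribute from every object into a new numeric array.
        # This is a copy, rebuilt on each access.
        return np.array([obj.par1 for obj in self.data])

objects = np.array([MyObj(1, 2), MyObj(3, 4), MyObj(2, 3), MyObj(10, 11)])
Obs = MyObjs(objects)

middle = Obs.par1[1:-1]          # property access, then ordinary slicing
masked = Obs.par1[Obs.par1 > 2]  # boolean indexing, which a plain list lacks
```

Because the property returns an ndarray, boolean masks and index arrays work on the result, which is what "more powerful" indexing amounts to here.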

If MyObjs had a __getitem__ method, then it could be indexed with

Obs[1:-1]

That method could be defined in various ways, though the simplest is to apply the indexing 'slice' to the 'data':

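def __getitem__(self, key):
    # apply the index to the wrapped array and re-wrap the result
    # (assumes key is a slice or index array, so self.data[key] is still an array)
    return MyObjs(self.data[key])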

I'm focusing just on syntax, not on efficiency. In general, numpy arrays of general objects are not very fast or powerful. Such arrays are basically lists of pointers to the objects.
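Putting the pieces together, a minimal sketch of a container that supports both Obs[1:-1].par1 and Obs.par1[1:-1] might look like this. It assumes the key is a slice (an integer key would return a single MyObj rather than an array), and the names sub, same and is_view are just for illustration.

```python
import numpy as np

class MyObj(object):
    def __init__(self, par1, par2):
        self.par1 = par1
        self.par2 = par2

class MyObjs(object):
    def __init__(self, anArray):
        self.data = anArray

    def __len__(self):
        return len(self.data)

    def __getitem__(self, key):
        # Assumes key is a slice, so self.data[key] is itself an array.
        return MyObjs(self.data[key])

    @property
    def par1(self):
        return np.array([obj.par1 for obj in self.data])

objects = np.array([MyObj(1, 2), MyObj(3, 4), MyObj(2, 3), MyObj(10, 11)])
Obs = MyObjs(objects)

sub = Obs[1:-1]                                   # slice the container first
same = np.array_equal(sub.par1, Obs.par1[1:-1])   # both orders agree
is_view = sub.data.base is objects                # slice is a view of the pointer array
```

The slice of the object array is a view, so no MyObj instances are duplicated; only the small wrapper is new. The par1 property still copies the attribute values each time it is read.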

Structured array and recarray version

Another possibility is np.recarray. Another poster was just asking about their names. They are essentially structured arrays whose fields can also be accessed as attributes.

With a structured array definition:

In [64]: dt = np.dtype([('par1', int), ('par2', int)])
In [66]: Obj1 = np.array([(1,2),(3,4),(2,3),(10,11)], dtype=dt)
In [67]: Obj1
Out[67]: 
array([(1, 2), (3, 4), (2, 3), (10, 11)], 
      dtype=[('par1', '<i4'), ('par2', '<i4')])
In [68]: Obj1['par1'][1:-1]
Out[68]: array([3, 2])
In [69]: Obj1[1:-1]['par1']
Out[69]: array([3, 2])

or as recarray

In [79]: Objrec=np.rec.fromrecords(Obj1,dtype=dt)
In [80]: Objrec.par1
Out[80]: array([ 1,  3,  2, 10])
In [81]: Objrec.par1[1:-1]
Out[81]: array([3, 2])
In [82]: Objrec[1:-1].par1
Out[82]: array([3, 2])
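On the question of views versus copies: with a structured array or recarray, both field access and basic slicing should return views of the same buffer. The sketch below checks that with np.shares_memory. It builds the recarray with Obj1.view(np.recarray) instead of np.rec.fromrecords, which is my substitution so that the recarray shares memory with Obj1; the variable names sub, field_is_view and slice_is_view are illustrative.

```python
import numpy as np

dt = np.dtype([('par1', int), ('par2', int)])
Obj1 = np.array([(1, 2), (3, 4), (2, 3), (10, 11)], dtype=dt)

# Reinterpret the same buffer as a recarray (no data is copied).
Objrec = Obj1.view(np.recarray)

sub = Objrec[1:-1]                                  # slice of whole records
field_is_view = np.shares_memory(Objrec.par1, Obj1)
slice_is_view = np.shares_memory(sub, Obj1)

# Writing through the sliced view changes the original array.
sub.par1[0] = 99
```

After the assignment, Obj1['par1'] holds 1, 99, 2, 10, which shows that the subset is a view and not a duplicate. The usual caveat applies: indexing with a list or boolean mask makes a copy instead.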

1 Comment

The recarray solution looks fantastic! I'll need to play with this a little, but I think that'll be perfect. Thanks so much!
