
Is there a way to append a row to a NumPy rec.array()? For example,

import numpy as np

x1 = np.array([1, 2, 3, 4])
x2 = np.array(['a', 'dd', 'xyz', '12'])
x3 = np.array([1.1, 2, 3, 4])
# np.rec is the public alias of np.core.records
r = np.rec.fromarrays([x1, x2, x3], names='a,b,c')

append(r,(5,'cc',43.0),axis=0)

The easiest way would be to extract all the columns as ndarrays, append the new element to each column, and then rebuild the rec.array(). Unfortunately, this method is memory inefficient. Is there another way to do this without splitting and rebuilding the rec.array()?

Cheers,

Eli

3 Answers

7

You can resize numpy arrays in-place. This is faster than converting to lists and then back to numpy arrays, and it uses less memory too.

print(r.shape)
# (4,)
r.resize(5)
print(r.shape)
# (5,)
r[-1] = (5, 'cc', 43.0)
print(r)

# [(1, 'a', 1.1000000000000001) 
#  (2, 'dd', 2.0) 
#  (3, 'xyz', 3.0) 
#  (4, '12', 4.0)
#  (5, 'cc', 43.0)]

If there is not enough memory to expand an array in-place, the resizing (or appending) operation may force NumPy to allocate space for an entirely new array and copy the old data to the new location. That, naturally, is rather slow so you should try to avoid using resize or append if possible. Instead, pre-allocate arrays of sufficient size from the very beginning (even if somewhat larger than ultimately necessary).
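The pre-allocation advice can be sketched as follows. This is a minimal illustration rather than code from the answer: the field layout `row_dtype`, the upper bound `capacity`, and the counter `n_used` are assumptions chosen to mirror the question's columns.

```python
import numpy as np

# Assumed field layout mirroring the question's columns a, b, c.
row_dtype = np.dtype([('a', 'i8'), ('b', 'U3'), ('c', 'f8')])

capacity = 8  # assumed upper bound on the number of rows
buf = np.zeros(capacity, dtype=row_dtype).view(np.recarray)
n_used = 0    # number of rows filled so far

incoming = [(1, 'a', 1.1), (2, 'dd', 2.0), (3, 'xyz', 3.0),
            (4, '12', 4.0), (5, 'cc', 43.0)]
for row in incoming:
    buf[n_used] = row  # write in place; nothing is reallocated
    n_used += 1

rows = buf[:n_used]  # a view on the filled part, not a copy
print(rows.a)
```

The unused tail of `buf` stays zero-filled, so downstream code should work on `rows` (the filled slice) rather than on `buf` itself.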


0
# names=... keeps the fields a, b, c (the default names would be f0, f1, f2)
np.rec.fromrecords(r.tolist() + [(5, 'cc', 43.)], names=r.dtype.names)

This still splits the array, but by rows rather than by columns. Maybe that is better?
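A related option, not part of this answer, is to skip the round trip through Python lists and join two structured arrays with `np.concatenate`. It still allocates a new array, so it is no more memory efficient than resizing in place. The dtype and the variable names below are assumptions for illustration.

```python
import numpy as np

# Assumed field layout mirroring the question's columns a, b, c.
row_dtype = np.dtype([('a', 'i8'), ('b', 'U3'), ('c', 'f8')])

base = np.array([(1, 'a', 1.1), (2, 'dd', 2.0)], dtype=row_dtype)
new_row = np.array([(5, 'cc', 43.0)], dtype=row_dtype)

# Both inputs share one dtype, so the result keeps the field names.
# The data is copied once into a freshly allocated array.
combined = np.concatenate([base, new_row]).view(np.recarray)
print(combined.a)
```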

1 Comment

@Paul, the question is: "is there a more efficient way to do this"?
0

Extending @unutbu's answer, here is a more general function that appends any number of rows:

def append_rows(arrayIN, NewRows):
    """Append rows to numpy recarray.

    Arguments:
      arrayIN: a numpy recarray that should be expanded
      NewRows: list of tuples with the same fields as `arrayIN`

    Idea: Resize recarray in-place if possible.
    (only reasonable for small arrays)

    >>> arrayIN = np.array([(1, 'a', 1.1), (2, 'dd', 2.0), (3, 'x', 3.0)],
    ...                    dtype=[('a', '<i4'), ('b', '|S3'), ('c', '<f8')])
    >>> NewRows = [(4, '12', 4.0), (5, 'cc', 43.0)]
    >>> append_rows(arrayIN, NewRows)
    >>> print(arrayIN)
    [(1, 'a', 1.1) (2, 'dd', 2.0) (3, 'x', 3.0) (4, '12', 4.0) (5, 'cc', 43.0)]

    Source: http://stackoverflow.com/a/1731228/2062965
    """
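    # Calculate the number of old and new rows
    len_arrayIN = arrayIN.shape[0]
    len_NewRows = len(NewRows)
    # Nothing to append; this also avoids the arrayIN[-0:] slice below
    if len_NewRows == 0:
        return
    # Resize the old recarray in place; refcheck=False skips the reference
    # check, so no other array may share arrayIN's memory
    arrayIN.resize(len_arrayIN + len_NewRows, refcheck=False)
    # Write the new rows to the end of the recarray
    arrayIN[-len_NewRows:] = NewRows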
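One caveat with in-place resizing: `ndarray.resize` only works on arrays that own their data, so passing a slice or other view to a function like the one above fails even with `refcheck=False`. The sketch below shows the failure and a possible workaround (copying first); the dtype and variable names are assumptions for illustration.

```python
import numpy as np

# Assumed field layout mirroring the question's columns a, b, c.
row_dtype = np.dtype([('a', 'i8'), ('b', 'U3'), ('c', 'f8')])

base = np.zeros(4, dtype=row_dtype)
window = base[:2]  # a view: it does not own its data

try:
    window.resize(3, refcheck=False)
    failed = False
except ValueError:
    failed = True  # NumPy refuses to resize an array that is a view

# A copy owns its data and can be resized in place.
owned = window.copy()
owned.resize(3, refcheck=False)
owned[-1] = (5, 'cc', 43.0)
print(failed, owned.shape)
```

Since `refcheck=False` disables NumPy's check for other references, it is the caller's job to make sure nothing else still points at the old buffer.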

Comment

I want to stress that pre-allocating an array that is at least big enough is the most reasonable solution (if you have an idea of the final size of the array). Pre-allocation also saves a lot of time.
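When the final size is not known in advance, one common compromise is to over-allocate and double the capacity whenever the buffer fills up, so the data is copied only occasionally. The class below is a hypothetical helper written for illustration, not something from NumPy or from the answers; `GrowingRecords`, its methods, and the dtype are all assumed names.

```python
import numpy as np

# Assumed field layout mirroring the question's columns a, b, c.
row_dtype = np.dtype([('a', 'i8'), ('b', 'U3'), ('c', 'f8')])


class GrowingRecords:
    """Hypothetical helper: append rows into an over-allocated buffer."""

    def __init__(self, dtype, capacity=4):
        # capacity must be at least 1 for the doubling below to grow
        self._buf = np.zeros(capacity, dtype=dtype)
        self._n = 0  # number of rows in use

    def append(self, row):
        if self._n == len(self._buf):
            # Buffer is full: allocate twice the capacity, copy rows once.
            bigger = np.zeros(2 * len(self._buf), dtype=self._buf.dtype)
            bigger[:self._n] = self._buf
            self._buf = bigger
        self._buf[self._n] = row
        self._n += 1

    def rows(self):
        # View on the filled part of the buffer, exposed as a recarray.
        return self._buf[:self._n].view(np.recarray)


store = GrowingRecords(row_dtype, capacity=2)
for row in [(1, 'a', 1.1), (2, 'dd', 2.0), (3, 'xyz', 3.0),
            (4, '12', 4.0), (5, 'cc', 43.0)]:
    store.append(row)

result = store.rows()
print(result.a)
```

Doubling keeps the number of reallocations logarithmic in the number of appended rows, at the cost of holding up to twice the memory that is strictly needed.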
