I would appreciate any help please :)

I'm trying to create a record array from a 1D array of strings and a 2D array of numbers (so I can use np.savetxt to dump it into a file). Unfortunately, the docs for np.core.records.fromarrays aren't informative.

>>> import numpy as np
>>> x = ['a', 'b', 'c']
>>> y = np.arange(9).reshape((3,3))
>>> print x
['a', 'b', 'c']
>>> print y
[[0 1 2]
 [3 4 5]
 [6 7 8]]
>>> records = np.core.records.fromarrays([x,y])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/dist-packages/numpy/core/records.py", line 560, in fromarrays
    raise ValueError, "array-shape mismatch in array %d" % k
ValueError: array-shape mismatch in array 1

And the output I need is:

[['a', 0, 1, 2]
 ['b', 3, 4, 5]
 ['c', 6, 7, 8]]
  • x should be an array, right? Currently it is a list. Commented May 15, 2014 at 21:52
  • correct, for some reason I can't edit my post... Commented May 16, 2014 at 14:21
  • @unutbu's answer was very helpful! It made me look for an even more elegant way to separate a 2D array into its columns, and I found this: records = np.core.records.fromarrays([x]+[row for row in y.transpose()]) Commented May 16, 2014 at 14:31

1 Answer

If all you wish to do is dump x and y to a CSV file, then it is not necessary to use a recarray. If, however, you have some other reason for wanting a recarray, here is how you could create it:

import numpy as np
import numpy.lib.recfunctions as recfunctions

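# '|S1' is a 1-byte string field; on Python 3 use 'U1' instead,
# otherwise savetxt with fmt='%s' writes the values as b'a', b'b', b'c'
x = np.array(['a', 'b', 'c'], dtype=[('x', '|S1')])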
y = np.arange(9).reshape((3,3))
y = y.view([('', y.dtype)]*3)

z = recfunctions.merge_arrays([x, y], flatten=True)
# [('a', 0, 1, 2) ('b', 3, 4, 5) ('c', 6, 7, 8)]

np.savetxt('/tmp/out', z, fmt='%s')

writes

a 0 1 2
b 3 4 5
c 6 7 8

to /tmp/out.
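
For the first case mentioned above, where no recarray is needed at all, a minimal sketch using only the standard csv module might look like this. It writes to an in-memory buffer (buf and text are names chosen for illustration) rather than to /tmp/out; passing an open file object instead should work the same way.

```python
import csv
import io

import numpy as np

x = ['a', 'b', 'c']
y = np.arange(9).reshape((3, 3))

buf = io.StringIO()  # stand-in for an open file (illustrative name)
writer = csv.writer(buf, delimiter=' ', lineterminator='\n')
for label, row in zip(x, y):
    # row.tolist() converts the NumPy integers to plain Python ints
    writer.writerow([label] + row.tolist())

text = buf.getvalue()
print(text)
```

Pairing each label with its row via zip avoids any dtype juggling, at the cost of a Python-level loop over the rows.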


Alternatively, to use np.core.records.fromarrays you would need to list each column of y separately, so the input passed to fromarrays is, as the doc says, a "flat list of arrays".

x = ['a', 'b', 'c']
y = np.arange(9).reshape((3,3))
z = np.core.records.fromarrays([x] + [y[:,i] for i in range(y.shape[1])])

Each item in the list passed to fromarrays will become one column of the resultant recarray. You can see this by inspecting the source code:

_array = recarray(shape, descr)

# populate the record array (makes a copy)
for i in range(len(arrayList)):
    _array[_names[i]] = arrayList[i]

return _array
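
To check the one-item-per-column behaviour directly, here is a small sketch. It calls the function through np.rec, the public alias for the same module (recent NumPy releases, as far as I know, treat np.core as private), and it passes a names argument. The field names label, c0, c1 and c2 are made up for this illustration.

```python
import numpy as np

x = ['a', 'b', 'c']
y = np.arange(9).reshape((3, 3))

# Iterating over y.T yields the columns of y, so the list holds four 1-D arrays
z = np.rec.fromarrays([x] + list(y.T), names='label,c0,c1,c2')

print(z.dtype.names)  # one field per item in the list
print(z.label)        # the strings from x
print(z.c1)           # the second column of y
print(z[0])           # the first record
```

Each field can then be read back as an attribute (z.c1) or by key (z['c1']).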

By the way, you might want to use pandas here for the extra convenience (no mucking around with dtypes, flattening, or iterating over columns required):

import numpy as np
import pandas as pd

x = ['a', 'b', 'c']
y = np.arange(9).reshape((3,3))

df = pd.DataFrame(y)
df['x'] = x

print(df)
#    0  1  2  x
# 0  0  1  2  a
# 1  3  4  5  b
# 2  6  7  8  c

df.to_csv('/tmp/out')
# ,0,1,2,x
# 0,0,1,2,a
# 1,3,4,5,b
# 2,6,7,8,c
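
If the file should match the layout asked for in the question (label first, no header or index), a possible variation is sketched below. It assumes DataFrame.insert to put the x column in front and the sep, header and index arguments of to_csv; calling to_csv without a path returns the text as a string, which is used here instead of writing to /tmp/out.

```python
import numpy as np
import pandas as pd

x = ['a', 'b', 'c']
y = np.arange(9).reshape((3, 3))

df = pd.DataFrame(y)
df.insert(0, 'x', x)  # place the labels in the first column

# No path given, so to_csv returns the CSV text instead of writing a file
text = df.to_csv(sep=' ', header=False, index=False)
print(text)
```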