Read it into a structured array:
In [30]:
import numpy as np
a=[('1A34', 'RBP', 0.0, 1.0, 0.0, 0.0, 0.0, 0.0),
('1A9N', 'RBP', 0.0456267, 0.0539268, 0.331932, 0.0464031, 4.41336e-06, 0.522107),
('1AQ3', 'RBP', 0.0444479, 0.201112, 0.268581, 0.0049757, 1.28505e-12, 0.480883),
('1AQ4', 'RBP', 0.0177232, 0.363746, 0.308995, 0.00169861, 0.0, 0.307837)]
np.array(a, dtype='S10,S10,f4,f4,f4,f4,f4,f4')  # 'S10' is the current spelling of the deprecated 'a10' alias
Out[30]:
array([('1A34', 'RBP', 0.0, 1.0, 0.0, 0.0, 0.0, 0.0),
('1A9N', 'RBP', 0.045626699924468994, 0.053926799446344376, 0.331932008266449, 0.04640309885144234, 4.413359874888556e-06, 0.5221070051193237),
('1AQ3', 'RBP', 0.044447898864746094, 0.20111200213432312, 0.26858100295066833, 0.004975699819624424, 1.2850499744171406e-12, 0.48088300228118896),
('1AQ4', 'RBP', 0.01772320084273815, 0.3637459874153137, 0.30899500846862793, 0.0016986100235953927, 0.0, 0.30783700942993164)],
dtype=[('f0', 'S10'), ('f1', 'S10'), ('f2', '<f4'), ('f3', '<f4'), ('f4', '<f4'), ('f5', '<f4'), ('f6', '<f4'), ('f7', '<f4')])
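The fields of a structured array can also be given names, so each column is selected by name and keeps its own type. The sketch below is only illustrative: the field names `id`, `kind`, `x` and `y` are made up, it uses just the first four values of two rows from above, and it assumes `U10` (unicode strings) instead of `S10` so that the string fields come back as `str` on Python 3.

```python
import numpy as np

# First four values of two rows from the data above.
rows = [('1A34', 'RBP', 0.0, 1.0),
        ('1A9N', 'RBP', 0.0456267, 0.0539268)]

# Hypothetical field names; 'U10' = unicode string of up to 10 characters.
dt = np.dtype([('id', 'U10'), ('kind', 'U10'), ('x', 'f8'), ('y', 'f8')])
arr = np.array(rows, dtype=dt)

ids = arr['id']            # string column
x_mean = arr['x'].mean()   # float column, usable in normal numeric code
first = arr[0]             # one record, with fields like first['y']
```

Each field behaves like an ordinary one-dimensional array of its own dtype, so arithmetic on `arr['x']` never touches the string fields.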
You can have all of them in object dtype:
In [46]:
np.array(a, dtype=object)
Out[46]:
array([['1A34', 'RBP', 0.0, 1.0, 0.0, 0.0, 0.0, 0.0],
['1A9N', 'RBP', 0.0456267, 0.0539268, 0.331932, 0.0464031,
4.41336e-06, 0.522107],
['1AQ3', 'RBP', 0.0444479, 0.201112, 0.268581, 0.0049757,
1.28505e-12, 0.480883],
['1AQ4', 'RBP', 0.0177232, 0.363746, 0.308995, 0.00169861, 0.0,
0.307837]], dtype=object)
but this is not ideal for the float values, and it can lead to undesired behavior:
In [48]:
b=np.array(a, dtype=object)
b[0]+b[1] #addition for float values and concatenation for string values
Out[48]:
array(['1A341A9N', 'RBPRBP', 0.0456267, 1.0539268, 0.331932, 0.0464031,
4.41336e-06, 0.522107], dtype=object)
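One possible way around this, sketched here under the assumption that the first two columns are always the strings and the rest are always numbers, is to slice the numeric columns out of the object array and cast them to float before doing any arithmetic. The names `labels`, `values` and `total` are only for illustration.

```python
import numpy as np

a = [('1A34', 'RBP', 0.0, 1.0, 0.0, 0.0, 0.0, 0.0),
     ('1A9N', 'RBP', 0.0456267, 0.0539268, 0.331932, 0.0464031, 4.41336e-06, 0.522107)]

b = np.array(a, dtype=object)

labels = b[:, :2]                 # string columns, left as objects
values = b[:, 2:].astype(float)   # numeric columns as a real float64 array

total = values[0] + values[1]     # plain float addition, no string concatenation
```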
pandas is also an alternative:
In [43]:
import pandas as pd
print(pd.DataFrame(a))
0 1 2 3 4 5 6 7
0 1A34 RBP 0.000000 1.000000 0.000000 0.000000 0.000000e+00 0.000000
1 1A9N RBP 0.045627 0.053927 0.331932 0.046403 4.413360e-06 0.522107
2 1AQ3 RBP 0.044448 0.201112 0.268581 0.004976 1.285050e-12 0.480883
3 1AQ4 RBP 0.017723 0.363746 0.308995 0.001699 0.000000e+00 0.307837
In [44]:
pd.DataFrame(a).dtypes
Out[44]:
0 object
1 object
2 float64
3 float64
4 float64
5 float64
6 float64
7 float64
dtype: object
and it allows each column to have its own dtype.
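As a small sketch of why per-column dtypes help: with named columns, the numeric columns can be selected on their own and used in arithmetic without touching the string columns. The column names here are invented for the example, and only the first four values of two rows are used.

```python
import pandas as pd

rows = [('1A34', 'RBP', 0.0, 1.0),
        ('1A9N', 'RBP', 0.0456267, 0.0539268)]

# Hypothetical column names, chosen only for this example.
df = pd.DataFrame(rows, columns=['id', 'kind', 'x', 'y'])

numeric = df.select_dtypes(include='number')  # keeps only the float columns
row_sums = numeric.sum(axis=1)                # per-row sum over x and y
```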
str and float in one array? It can be done with a structured array, but it is not the ideal solution. A normal array only allows one type (dtype, as it is known). Have you considered using pandas?
If results is a generator, you will need to convert it to a list first. The reason is that numpy arrays need to know their size at creation time. If you know the number of elements in results, then you can do something like a = numpy.empty((n, 8), dtype='object'), followed by: for i, row in enumerate(results): a[i] = row
numpy.fromiter function.
fromiter reallocates the array for every new element unless count is specified. Edit: just looked at the source code and it seems to do a 50% growth at every new allocation, so it might not be as bad as I thought.
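The two generator approaches from the comments can be sketched as below. The generator `make_results` is a hypothetical stand-in for `results`, and `n` is assumed to be known in advance for the preallocation variant.

```python
import numpy as np

def make_results():
    # Hypothetical stand-in for the `results` generator.
    yield ('1A34', 'RBP', 0.0, 1.0, 0.0, 0.0, 0.0, 0.0)
    yield ('1A9N', 'RBP', 0.0456267, 0.0539268, 0.331932, 0.0464031, 4.41336e-06, 0.522107)

# Variant 1: materialize the generator, then build the array.
from_list = np.array(list(make_results()), dtype=object)

# Variant 2: preallocate when the row count n is known, then fill row by row.
n = 2
a = np.empty((n, 8), dtype=object)
for i, row in enumerate(make_results()):
    a[i] = row
```

Both variants end up with the same `(n, 8)` object array; the second avoids holding an intermediate list in memory.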