
I am trying to preallocate an empty 19x5 array and define its data type at the same time, using the following code:

import numpy as np
arr=np.empty((19,5),dtype=[('a','|S1'),('b', 'f4'),('c', 'i'),('d', 'f4'),('e', 'f4')])

The result is unexpected: I get what looks like a 19x5x5 array, because each of the 19x5 elements holds all 5 fields. However, trying:

arr=np.empty((19,1),dtype=[('a','|S1'),('b', 'f4'),('c', 'i'),('d', 'f4'),('e', 'f4')])

gives the proper length per row (5 fields), but the result looks like a 1D array.

When I try to write this array to a file, only this format is accepted:

np.savetxt(file, arr, delimiter=',', fmt='%s')

This tells me I am dealing with a single string. Is there no way to get a 19x5 shaped structured array that is not flattened?

The main problem arises when writing the array with savetxt. I want a csv file that contains all 5 column values. Because each row is handled as a single string, the output is wrong.

  • You can use a pandas DataFrame, which is generally better than numpy's structured array. If you are willing to explore that option, say so, and I will provide an example based on the question above. Commented Apr 8, 2016 at 19:34
  • Thank you @Hun. I looked into this before. Fortunately, I managed to complete the code using numpy's structured arrays. I'll let you know if I need help with pandas. Commented Apr 11, 2016 at 12:16

1 Answer

Typically the fields of a structured array replace the columns of a 2d array. Often people load a csv with genfromtxt and wonder why the result is 1d. As you found, you can make a 2d array with a compound dtype, but each element will then hold multiple values, one per field of the dtype.
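
To make the shape question concrete, here is a small sketch using the dtype from the question. It uses np.zeros instead of np.empty only so the contents are deterministic; the names arr2d and arr1d are mine, not from the question.

```python
import numpy as np

# dtype copied from the question: five named fields per element
dt = [('a', '|S1'), ('b', 'f4'), ('c', 'i'), ('d', 'f4'), ('e', 'f4')]

arr2d = np.zeros((19, 5), dtype=dt)  # 19x5 elements, each with 5 fields
arr1d = np.zeros(19, dtype=dt)       # 19 records, the fields act as columns

print(arr2d.shape, arr2d['b'].shape)  # the shape is unchanged by field access
print(arr1d.shape, arr1d['b'].shape)
print(arr1d.dtype.names)              # the field names play the role of column labels
```

The fields never show up in .shape, which is why the (19, 5) version ends up holding 19*5 records rather than 19 rows of 5 columns.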

Normally you'd initialize that array with a 1d shape, e.g. (19,).

Note that you have to fill values by field or with a list of tuples.
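
A minimal sketch of both ways of filling, assuming the dtype from the question. The values and the 3-row length are made up for illustration.

```python
import numpy as np

dt = [('a', '|S1'), ('b', 'f4'), ('c', 'i'), ('d', 'f4'), ('e', 'f4')]
arr = np.zeros(3, dtype=dt)  # 3 records instead of 19, for brevity

# Fill by field: each field behaves like a plain 1d column
arr['a'] = [b'x', b'y', b'z']
arr['c'] = [1, 2, 3]

# Fill one record: the value must be a tuple, not a list
arr[0] = (b'q', 1.5, 10, 2.5, 3.5)

# Or build the whole array from a list of tuples
rows = [(b'x', 0.5, 1, 1.0, 2.0), (b'y', 1.5, 2, 3.0, 4.0)]
arr2 = np.array(rows, dtype=dt)
print(arr2)
```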

I don't have experience using savetxt with a structured array, and can't run tests on this tablet. But there probably are SO questions that help.

savetxt iterates on an array, and writes fmt%tuple(row), where fmt is built from your input.

I'd suggest trying fmt='%s %s %s %s %s', with one % format for each field in the dtype. See the savetxt docs. I also don't know whether a (19,) array will behave better than a (19,1).
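
As a sketch of that suggestion applied to the csv goal in the question: a 1d structured array, one format per field, and delimiter=','. Two things here are my assumptions rather than part of the question: the first field is 'U1' instead of '|S1' (so that Python 3 writes x rather than b'x'), and the sample values and temporary file path are invented.

```python
import os
import tempfile
import numpy as np

# 'U1' instead of '|S1' is an assumption, to avoid b'...' in the output
dt = [('a', 'U1'), ('b', 'f4'), ('c', 'i4'), ('d', 'f4'), ('e', 'f4')]
arr = np.array([('x', 0.5, 1, 1.0, 2.0), ('y', 1.5, 2, 3.0, 4.0)], dtype=dt)

# One format per field; savetxt joins them with the delimiter
path = os.path.join(tempfile.mkdtemp(), 'out.csv')
np.savetxt(path, arr, delimiter=',', fmt=['%s', '%.2f', '%d', '%.2f', '%.2f'])

with open(path) as f:
    lines = f.read().splitlines()
print(lines)  # ['x,0.50,1,1.00,2.00', 'y,1.50,2,3.00,4.00']
```

Passing fmt as a list lets delimiter do the joining. A single multi-format string such as '%s,%.2f,%d,%.2f,%.2f' should also work, but then the delimiter argument is not used.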

Experiment with formatting elements of your array. They should look like tuples to the formatter. If not try tolist() or tuple(A[0]).
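
That experiment might look like the following sketch (sample data and the 'U1' field type are my assumptions). It also shows what goes wrong with an (n,1) array: a row of it is a 1-element array, so it unpacks to a single value.

```python
import numpy as np

dt = [('a', 'U1'), ('b', 'f4'), ('c', 'i4'), ('d', 'f4'), ('e', 'f4')]
arr = np.array([('x', 0.5, 1, 1.0, 2.0), ('y', 1.5, 2, 3.0, 4.0)], dtype=dt)
fmt = '%s,%.2f,%d,%.2f,%.2f'

# A record from a 1d array unpacks into one value per field
rec = arr[0]
line_a = fmt % tuple(rec)
line_b = fmt % rec.tolist()   # tolist() on a record returns a plain tuple
print(line_a, line_b)

# A row of an (n, 1) array is a 1-element array holding one whole record
row = arr.reshape(-1, 1)[0]
print(len(tuple(row)))        # only one item for the formatter
print('%s' % tuple(row))      # the whole record rendered as a single string

try:
    fmt % tuple(row)          # five formats, one value
    failed = False
except TypeError:
    failed = True
print(failed)
```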

Here's an answer that is almost good enough to be a duplicate:

https://stackoverflow.com/a/35209070/901925

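import numpy as np
names = np.array(['a', 'b', 'c'])  # hypothetical stand-in; names is defined elsewhere in the linked answer
ab = np.zeros(names.size, dtype=[('var1', 'S6'), ('var2', float)])  # 1d structured array, 2 fields
np.savetxt('test.txt', ab, fmt="%10s %10.3f")  # one % format per field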

===================

With a per-field fmt, savetxt can only handle a 1d structured array, because each row has to unpack into one value per field for fmt % tuple(row).


1 Comment

Very insightful answer. Thank you very much.
