
With the help of "Complex matlab-like data structure in python (numpy/scipy)" I came up with:

import numpy as np

s=(5,3)
a=np.zeros(s, dtype=[('Int1', int),
                     ('Int2', int),
                     ('Str1', '|S5')])

a[0,0]=(1,2,'abcde')
a[0,1]=((5,2,'fghij'),(7,9,'klmno'))

The problem is that in some cells of my array a, such as a[0,1], I want to store one or more additional records, as in my code example. I don't know in advance how many extra records will go into which cell of the matrix, but every record will be a tuple with the dtype=[(int, int, string)].

Of course, I get an error when I try to write into a[0,1] the way I do.

I would like to keep my matrix a 2-dimensional, but I would like to write several instances of my dtype=[int, int, str] into one field, similar to what I tried in field a[0,1].

I hope I have explained my problem clearly.

  • Do you need to have this extra information "nested" in the same row of the matrix? Couldn't you do something like a[0,0]=(1,2,'abcde'); a[0,1]=(5,2,'fghij'); a[0,2]=(7,9,'klmno') and keep track of which rows are linked (using a dict, or maybe a 4th element in your tuple as an identifier)? Alternatively, you can probably pad with 0/null values, like this: a[1]=[(5,2,'fghij'),(7,9,'klmno'), (0, 0, None)] Commented Jan 6, 2016 at 19:17
  • Hi, thank you for the question and suggestion! I could also use an identifier or another variable as a reference. If I use another cross-reference, I wouldn't need a 2-dimensional matrix anymore. I will have to write the fields of a[x,y] into an Excel sheet, which is why I would prefer a 2-dimensional solution. Commented Jan 6, 2016 at 19:43
  • "I will have to write the fields of a[x,y] into an Excel sheet, which is why I would prefer a 2-dimensional solution" - could you explain what you mean by this? I don't see how a 2D array would help in this case. Your data is still fundamentally "3D" in the sense that each individual "element" in the 2D array contains multiple values. Since a single cell can't contain two integers and a string, there's still no obvious way to represent a in a single 2D spreadsheet. Commented Jan 6, 2016 at 19:59
  • Hi ali_m, I agree with you, there is no obvious way to represent a in a single 2D spreadsheet, since it has some 3D characteristics. But I have to fit it into an Excel sheet in this non-intuitive way: all the entries of, for example, a[0,1] will go into one Excel cell, each on a new line inside that same cell. That's how they want it! Commented Jan 7, 2016 at 9:21
  • Have you looked at pandas.DataFrame? I think it is better suited to this sort of structure, especially if you want to convert it to an Excel spreadsheet. Commented Jan 7, 2016 at 12:20

2 Answers

1

A numpy array is probably the wrong data structure for this kind of flexibility. Once created, your array a takes up a fixed amount of memory. It has 15 (5*3) records, and each record contains 2 ints and one string of 5 characters. You can modify values, but you can't add new records, or change one record into a composite of two records.
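As a rough illustration of that fixed layout, the sketch below inspects the array from the question. The variable names are chosen only for this example, and the exact byte count per record depends on the platform's default integer size, so no specific number is claimed.

```python
import numpy as np

# Same structured dtype as in the question.
dt = [('Int1', int), ('Int2', int), ('Str1', '|S5')]
a = np.zeros((5, 3), dtype=dt)

n_records = a.size               # 5 * 3 = 15 records
bytes_per_record = a.itemsize    # two ints plus 5 bytes; platform dependent
total_bytes = a.nbytes           # fixed when the array is created

# Changing values in place is fine and does not change the footprint.
a[0, 0] = (1, 2, 'abcde')
print(n_records, bytes_per_record, total_bytes, a.nbytes)
```

The record count and total size stay the same no matter which values are written into the existing records.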

Lists give you the flexibility to add elements and to change their nature. A list contains pointers to objects located elsewhere in memory.

An array of dtype=object behaves much like a list. Its data buffer holds the same sort of pointers. a=np.zeros((3,5), dtype=object) is a 2d array, where each element can be a tuple, list, number, None, tuple of tuples, etc. But with that kind of array you lose a lot of the 2d numeric calculation abilities.
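One possible way to use such an object array for the question's data is to put a separate Python list in every cell and append records to it. This is only a sketch: the name grid and the choice of one list per cell are assumptions made for the example, not something the question prescribes.

```python
import numpy as np

# 2D object array; every cell gets its own empty list.
grid = np.empty((5, 3), dtype=object)
for idx in np.ndindex(grid.shape):
    grid[idx] = []

# Each cell can now hold any number of (int, int, str) records.
grid[0, 0].append((1, 2, 'abcde'))
grid[0, 1].append((5, 2, 'fghij'))
grid[0, 1].append((7, 9, 'klmno'))

# Number of records stored in each cell.
counts = [[len(cell) for cell in row] for row in grid]
print(counts[0])
```

The array keeps its 2D shape for indexing, but the records themselves are ordinary Python objects, so vectorised numeric operations on them are no longer available.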

With your structured array, the only way to increase its size or add fields is to make a new array and copy data over. There are functions that assist in adding fields, but they do, in one way or other, what I just described.
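A minimal sketch of the "make a new array and copy" step, here growing the array from 5 to 7 rows. The name bigger and the target shape are made up for the illustration.

```python
import numpy as np

dt = [('Int1', int), ('Int2', int), ('Str1', '|S5')]
a = np.zeros((5, 3), dtype=dt)
a[0, 0] = (1, 2, 'abcde')

# "Growing" the array: allocate a new, larger one and copy the old records in.
bigger = np.zeros((7, 3), dtype=a.dtype)
bigger[:a.shape[0], :a.shape[1]] = a

print(bigger.shape, bigger[0, 0])
```

The original array a is left unchanged; bigger is a separate block of memory with the old records copied into its first five rows.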


With your definition, there are 3 fields, ['Int1','Int2','Str1']

a=np.zeros(s, dtype=[('Int1', int),
                     ('Int2', int),
                     ('Str1', '|S5')])

Increasing the number of fields (by that concept of fields) would be something like

a1=np.zeros(s, dtype=[('Int1', int),
                     ('Int2', int),
                     ('Str1', '|S5'),
                     ('Str2', '|S5')])

That is adding a field named 'Str2'. You could fill it with

for name in a.dtype.fields: a1[name] = a[name]

Now all records in a1 have the same data as in a, but they also have a blank Str2 field. You could set that field for each element individually, or as a group with:

a1['Str2'] = ...

But your attempt to change a[0,1] into a tuple of tuples is quite different. It's like trying to replace an element of a regular numeric array with two numbers:

x = np.arange(10)
x[3] = [3,5]

That works for lists, x=list(range(10)), but not for arrays.
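The difference can be checked with a short experiment. This is a sketch only: the flag array_assignment_failed is introduced just for the example, and the except clause deliberately catches both ValueError and TypeError rather than asserting the exact exception type or message.

```python
import numpy as np

x = np.arange(10)
try:
    x[3] = [3, 5]            # one integer slot cannot hold two values
    array_assignment_failed = False
except (ValueError, TypeError) as err:
    array_assignment_failed = True
    print("array assignment failed:", err)

y = list(range(10))
y[3] = [3, 5]                # a list slot can hold any object
print(y)
```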


0

My code would look like this now:

s=(5,3)   
a=np.zeros(s, dtype=object)  
a[0,0]=(1,2,'abcde')  
a[0,1]=((5,2,'fghij'),(7,9,'klmno'))

I can see/access the entries with:

print(a[0,1])
print(a[0,1][0])
print(a[0,1][1])
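For the spreadsheet layout described in the comments (all records of one cell in a single Excel cell, one per line), the cells could be turned into text along the following lines. This is a sketch under assumptions: cell_to_text is a hypothetical helper, the space-separated format is arbitrary, and actually writing the file is left to whichever Excel library is in use.

```python
import numpy as np

a = np.zeros((5, 3), dtype=object)
a[0, 0] = (1, 2, 'abcde')
a[0, 1] = ((5, 2, 'fghij'), (7, 9, 'klmno'))

def cell_to_text(cell):
    """Hypothetical helper: render one cell as text, one record per line."""
    if not isinstance(cell, tuple) or not cell:
        return ''                      # untouched cells still hold the initial 0
    # A single record is a flat tuple; several records are a tuple of tuples.
    records = cell if isinstance(cell[0], tuple) else (cell,)
    return '\n'.join('{} {} {}'.format(*rec) for rec in records)

# 2D list of strings with the same shape as a.
table = [[cell_to_text(cell) for cell in row] for row in a]
print(table[0][1])
```

Storing a single record as a flat tuple and several records as a tuple of tuples forces the helper to tell the two cases apart; always storing a list of records per cell would avoid that check.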
