I have a NumPy array as follows:

supp = np.array([['A', '5', '0'], ['B', '3', '0'], ['C', '4', '0'], ['D', '1', '0'], ['E', '2', '0']])

Now I want to set row[2] to row[1]/6 for each row. I'm using:

for row in supp: row[2] = row[1].astype(int) / 6

But row[2] remains unchanged:

>>> supp
array([['A', '5', '0'],
       ['B', '3', '0'],
       ['C', '4', '0'],
       ['D', '1', '0'],
       ['E', '2', '0']],
      dtype='<U1')

I'm using Python 3.5.2 and NumPy 1.11.1.

Any help is appreciated. Thanks in advance

  • Hint: Take a look at the result of supp[0,0] = 5/6 Commented Aug 19, 2016 at 2:55
  • Trying to put strings and numbers in the same array is a bad idea. Depending on what you're doing, Pandas might have more suitable tools for your use case, or it might be better to just get rid of the first column and use an array of float dtype. Commented Aug 19, 2016 at 3:02

1 Answer

The problem is that a NumPy array has a single dtype, which here is inferred as a fixed-width string of length 1 (supp.dtype == '<U1' on Python 3, as your output shows; '|S1' on Python 2), because your input contains only strings of length 1. NumPy therefore converts every value you assign into a string and truncates it to one character, which gives '0' in your case. Convert the array to the generic object dtype and it can hold strings, ints, floats, or anything else:

import numpy as np

supp = np.array([['A', '5', '0'], ['B', '3', '0'], ['C', '4', '0'], ['D', '1', '0'], ['E', '2', '0']])
supp = supp.astype(object)

for row in supp:
    row[2] = int(row[1]) / 6

result:

[['A' '5' 0.8333333333333334]
 ['B' '3' 0.5]
 ['C' '4' 0.6666666666666666]
 ['D' '1' 0.16666666666666666]
 ['E' '2' 0.3333333333333333]]
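
To see the truncation on its own, the hint from the comments (assigning a float to a single element) can be tried directly. This is only a minimal sketch, assuming Python 3 and a two-row copy of the array from the question; the name obj is introduced purely for illustration:

```python
import numpy as np

# Two rows of the question's data; every element is a one-character string.
supp = np.array([['A', '5', '0'], ['B', '3', '0']])
print(supp.dtype)  # a fixed-width Unicode dtype of length 1

# Assigning a float converts it to a string and keeps only the first character.
supp[0, 2] = 5 / 6
print(repr(supp[0, 2]))

# With dtype=object the float is stored unchanged.
obj = supp.astype(object)
obj[0, 2] = 5 / 6
print(repr(obj[0, 2]))
```

The first assignment silently stores '0', which is why the loop in the question appears to do nothing.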

Alternatively, you can keep a string dtype but make it wider, using '<Un' with a larger value of n (on Python 2 the equivalent is '|Sn'):

supp = np.array([['A', '5', '0'], ['B', '3', '0'], ['C', '4', '0'], ['D', '1', '0'], ['E', '2', '0']])
supp = supp.astype('<U5')

for row in supp:
    row[2] = int(row[1]) / 6

result:

[['A' '5' '0.833']
 ['B' '3' '0.5']
 ['C' '4' '0.666']
 ['D' '1' '0.166']
 ['E' '2' '0.333']]

In this case the array still contains only strings, if that is what you want. Note that each value is truncated to five characters, so 0.8333333333333334 is stored as '0.833'.
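
If the numbers are what matter, the suggestion from the question's comments (drop the label column and work with a numeric array) might look roughly like the sketch below. The names labels, counts and ratios are hypothetical, and splitting the data into two arrays is an assumption about what fits your use case:

```python
import numpy as np

supp = np.array([['A', '5', '0'], ['B', '3', '0'], ['C', '4', '0'],
                 ['D', '1', '0'], ['E', '2', '0']])

labels = supp[:, 0]              # string labels stay in their own array
counts = supp[:, 1].astype(int)  # parse the second column as integers
ratios = counts / 6              # vectorized division, float64 result

# Print each label next to its computed ratio.
for label, ratio in zip(labels, ratios):
    print(label, ratio)
```

This avoids the Python-level loop over rows and keeps full floating-point precision, at the cost of no longer having everything in one array.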

2 Comments

Thanks. Learned a new thing today. :)
Me too! ;) BTW, I totally second user2357112's comment. You should probably think about it... Using dtype=object might be convenient, but you'll lose a lot of the speedup on numerical computations, which is usually the point of using NumPy.
