5

I have a numpy array where each element looks something like this:

['3' '1' '35' '0' '0' '8.05' '2']
['3' '1' '' '0' '0' '8.4583' '0']
['1' '1' '54' '0' '0' '51.8625' '2']

I would like to replace all empty strings, like the one in the second row above, with some default value such as 0. How can I do this with numpy?

The ultimate goal is to be able to run S.astype(float), but I suspect the empty strings are causing problems in the conversion.

3
  • so these are numpy string arrays? Commented Apr 10, 2013 at 21:28
  • Yes. It's created using np.array Commented Apr 10, 2013 at 21:29
  • Can't the numpy arrays be used with list comprehensions? Commented Apr 10, 2013 at 21:36

3 Answers

19

If your array is t:

t[t=='']='0'

and then convert it.

Explanation:

t=='' creates a boolean array with the same shape as t that is True wherever the corresponding element of t is an empty string. This boolean array is then used as an index, so '0' is assigned only at those positions in the original t.
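As a rough illustration of the masking step described above, here is a minimal sketch using the three rows from the question. The name mask is only for illustration, and the sketch assumes the array holds plain NumPy strings as in the question.

```python
import numpy as np

# Sample rows copied from the question; `t` follows the naming in this answer.
t = np.array([
    ['3', '1', '35', '0', '0', '8.05', '2'],
    ['3', '1', '', '0', '0', '8.4583', '0'],
    ['1', '1', '54', '0', '0', '51.8625', '2'],
])

mask = (t == '')          # boolean array with the same shape as t
print(mask.sum())         # how many empty strings were found
t[mask] = '0'             # assign only where the mask is True (modifies t in place)
values = t.astype(float)  # the conversion no longer hits an empty string
print(values[1])
```

Note that the assignment changes t itself; work on t.copy() if the original strings need to be kept.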


3 Comments

Cool, I didn't remember this construct. I was just about to post an answer with list comprehensions, but this is much shorter!
@user1778770 Use a list comprehension only if you have a list. As long as you have a numpy array, you should take advantage of that.
I think you could make your answer even better by adding a link to a page where the construct you've shown is documented. That's for beginners who will wonder what's going on. I haven't found one.
6

Here is an approach that uses map. Note that it does not produce the same data type as calling .astype():

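def FloatOrZero(value):
    # Return the value as a float, or 0.0 if it cannot be parsed (e.g. '').
    try:
        return float(value)
    except (TypeError, ValueError):
        return 0.0


# list() is needed on Python 3, where map returns a lazy iterator.
print(list(map(FloatOrZero, ['3', '1', '', '0', '0', '8.4583', '0'])))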

Outputs:

[3.0, 1.0, 0.0, 0.0, 0.0, 8.4583, 0.0]

It's possible that this approach will give you more flexibility to clean up data, but it could also be harder to reason about if you want to work with a numpy.array.
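If the result still needs to be a NumPy array, one possible way to combine the per-value cleanup with an array is sketched below. The helper float_or_default and the sample values (including 'n/a' and a whitespace-only string) are made up for illustration and are not from the question.

```python
import numpy as np

def float_or_default(value, default=0.0):
    """Hypothetical helper: parse one string, falling back to `default`."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return default

# Made-up sample with several kinds of unparseable entries.
raw = np.array([['3', '', '8.05'], ['n/a', '54', ' ']])

# Clean element by element, then restore the original 2-D shape.
cleaned = np.array([float_or_default(v) for v in raw.ravel()]).reshape(raw.shape)
print(cleaned)
```

This runs a Python-level loop over every element, so it trades speed for the ability to handle values that a single equality mask would miss.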

1 Comment

are map, reduce faster than @askewchan's answer?
5

Just do this first:

import numpy as np

s = np.array(['1', '0', ''])
s[s==''] = '0'

s.astype(float)
# array([1., 0., 0.])

2 Comments

is this faster than map?
I was curious, so I wrote a toy example using cProfile, and the two are not exactly the same. I suspect the overall work is roughly the same, but calling map creates a new list of Python float values, while calling astype(float) returns a new NumPy array of floats. I now think this answer is better and just as readable, so I updated my answer so people don't blindly copy bad information.
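For anyone who wants to measure this themselves, here is a small sketch using timeit rather than cProfile. The data is synthetic and the helper names with_mask and with_map are hypothetical; actual timings depend on the machine, the NumPy version, and the array size, so none are quoted here.

```python
import timeit
import numpy as np

def float_or_zero(value):
    # Same idea as FloatOrZero in the map-based answer.
    try:
        return float(value)
    except (TypeError, ValueError):
        return 0.0

# Synthetic data (an assumption, not the asker's dataset).
s = np.array(['1', '0', '', '8.4583'] * 2500)

def with_mask():
    t = s.copy()              # copy so the input is unchanged between runs
    t[t == ''] = '0'
    return t.astype(float)    # returns a new float array

def with_map():
    return list(map(float_or_zero, s))  # returns a plain Python list

print('mask + astype:', timeit.timeit(with_mask, number=5))
print('map:          ', timeit.timeit(with_map, number=5))

as_array = with_mask()
as_list = with_map()
print(type(as_array).__name__, type(as_list).__name__)
```

Both paths produce the same numbers; the difference is the container (ndarray versus list) and how much of the work happens inside NumPy.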
