Numpy integer nan [duplicate]

Question

Is there a way to store NaN in a NumPy array of integers? I get:

import numpy as np
a = np.array([1], dtype=np.int64)  # originally dtype=long, which exists only in Python 2
a[0] = np.nan

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: cannot convert float NaN to integer

Pierre GM · Accepted Answer · 2012-10-03 20:22:05Z

65

No, you can't, at least with the current version of NumPy. NaN is a special value available in float arrays only.
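To illustrate the point about float arrays, here is a minimal sketch that converts the integers to a floating-point dtype before storing NaN. The variable names are illustrative only, and the approach assumes the values are small enough for float64 to represent exactly (integers up to 2**53 in magnitude).

```python
import numpy as np

# Integer dtypes have no bit pattern reserved for NaN, so the data is
# converted to a floating-point dtype before a NaN is stored.
ints = np.array([1, 2, 3], dtype=np.int64)
floats = ints.astype(np.float64)
floats[0] = np.nan  # allowed: float64 has a NaN representation

# np.isnan locates the missing entries afterwards.
nan_mask = np.isnan(floats)
```

The obvious cost is that the array is no longer an integer array, so this only helps when a float result is acceptable.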

There has been talk of introducing a special bit that would allow non-float arrays to store what would in practice correspond to a NaN, but so far (2012/10) it remains only talk.

In the meantime, you may want to consider the numpy.ma package: instead of picking an invalid integer like -99999, you could use the special numpy.ma.masked value to represent an invalid value.

>>> import numpy as np
>>> a = np.ma.array([1, 2, 3, 4, 5], dtype=int)
>>> a[1] = np.ma.masked
>>> a
masked_array(data = [1 -- 3 4 5],
             mask = [False  True False False False],
       fill_value = 999999)
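As a rough sketch of how such a masked array behaves downstream, the snippet below uses the same data and a few standard numpy.ma calls. The fill value -1 is an arbitrary choice for illustration.

```python
import numpy as np

a = np.ma.array([1, 2, 3, 4, 5], dtype=int)
a[1] = np.ma.masked  # mark the second element as invalid

total = a.sum()               # masked entries are skipped: 1 + 3 + 4 + 5
count = a.count()             # number of unmasked entries
filled = a.filled(-1)         # plain ndarray, masked slots replaced by -1
mask = np.ma.getmaskarray(a)  # boolean array, True where the value is masked
```

The array keeps its integer dtype throughout; the invalid entries are tracked in the separate boolean mask rather than in the data itself.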

4 Comments

gaborous Over a year ago

But be aware that there is a huge performance cost to using masked arrays, as they are implemented in pure Python!

endolith Over a year ago

@gaborous Whoa, really? I thought they were the recommended way to do such things?

gaborous Over a year ago

@endolith Yes, I found the info a long time ago in one of NumPy's GitHub issues, but I don't have the link anymore. However, since it was a long time ago, this might have been optimized (although I doubt it; one would need to compile to Cython or similar first).

cwharris Over a year ago

Just to be clear, nan and null are not the same thing. Also, while it is not a direct substitute for numpy, cuDF does support nulls.

Julian · Accepted Answer · 2012-10-03 12:47:52Z

12

NaN is a floating-point-only concept; there is no representation of it in the integers, so no :)

Pick an invalid value, like -99999.

5 Comments

christang Over a year ago

Picking a canonical value as invalid wouldn't be a good solution, as it wouldn't replicate the properties of NaN, namely that equality and ordering comparisons between NaN and any other value, including itself, should be false.
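The difference can be sketched in a few lines. SENTINEL here is a hypothetical placeholder (the -99999 suggested in the answer), and the small array is made-up data for illustration.

```python
import numpy as np

nan = np.nan
SENTINEL = -99999  # hypothetical placeholder value from the answer above

# NaN is unequal to everything, itself included, and unordered.
nan_eq_self = nan == nan                 # False
nan_lt_zero = nan < 0                    # False

# A sentinel is an ordinary integer and compares like one.
sentinel_eq_self = SENTINEL == SENTINEL  # True
sentinel_lt_zero = SENTINEL < 0          # True

# Consequence: the sentinel silently takes part in arithmetic unless
# it is filtered out explicitly.
values = np.array([3, SENTINEL, 5])
naive_mean = values.mean()                      # skewed by the sentinel
valid_mean = values[values != SENTINEL].mean()  # mean of 3 and 5
```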

cwharris Over a year ago

Using a sentinel value isn't ideal, but it's sufficient under the condition that you understand your data well enough to know the sentinel will not interfere with your computations. For instance, if you know your values are (not just "should be") always >= 0, then using a negative sentinel is acceptable (unless you're doing an operation where the outcome could have a different sign than the input, such as -1 * -1). If you're writing a framework and end up using sentinels, you should probably allow that value to be chosen by the user on an individual operation basis. Again, not ideal.

Gregory Morse Over a year ago

If your dataset is not going to change, then there are two easy options that come closest to ideal: np.amin()-1 and np.amax()+1. The placeholder value is then guaranteed to be unique, except when np.amin() == np.iinfo(np.int32).min or np.amax() == np.iinfo(np.int32).max. In those cases, you can use np.unique(); if the number of unique values equals the number of values the data type can represent, you must raise an error, as no placeholder is possible. Otherwise, search efficiently for the first value not in np.unique() by taking np.diff() and finding the first place where the difference is greater than 1, etc.
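One possible reading of that procedure is sketched below. find_placeholder is a hypothetical helper, not a NumPy function, and instead of np.diff() it compares neighbouring unique values directly, which avoids integer overflow when both extremes of the dtype are present.

```python
import numpy as np

def find_placeholder(data):
    """Return an integer representable in data's dtype that does not occur in data.

    Hypothetical helper for illustration; assumes an integer dtype.
    """
    info = np.iinfo(data.dtype)
    if data.size == 0:
        return info.min
    lo, hi = int(data.min()), int(data.max())
    if lo > info.min:
        return lo - 1  # room below the minimum
    if hi < info.max:
        return hi + 1  # room above the maximum
    # Both extremes are in use: look for the first gap between sorted
    # unique values. uniq[:-1] excludes the maximum, so adding 1 cannot
    # overflow the dtype.
    uniq = np.unique(data)
    gaps = np.nonzero(uniq[1:] != uniq[:-1] + 1)[0]
    if gaps.size == 0:
        raise ValueError("every value of the dtype is in use; no placeholder exists")
    return int(uniq[gaps[0]]) + 1

placeholder = find_placeholder(np.array([0, 1, 2], dtype=np.int32))
```

As the comment notes, this is only safe if the dataset does not change after the placeholder has been chosen.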

NoName Over a year ago

Sentinel values are actually used in a lot of real-world databases, especially in the healthcare industry, such as Weight of Newborn Babies, where -1 is used to designate an unsuccessful birth.

Closed Limelike Curves Over a year ago

@NoName Yes, and that's bad. If I had a dollar for every bug caused by "Sentinel" values being used where a NaN or missing object should have been...
