numpy beginner array plain python vs. numpy vectors: faulty results

Question

I'm completely new to NumPy and tried some textbook code. Unfortunately, above a certain input size the NumPy results come out wrong. Here's the code:

import sys
from datetime import datetime
import numpy

def pythonsum(n):
    a = range(n)
    b = range(n)
    c = []
    for i in range(len(a)):
        a[i] = i**2
        b[i] = i**3
        c.append(a[i]+b[i])
    return c

def numpysum(n):
    a = numpy.arange(n) ** 2
    b = numpy.arange(n) ** 3
    c = a + b
    return c

size = int(sys.argv[1])
start = datetime.now()
c=pythonsum(size)
delta = datetime.now()-start
print "The last 2 elements of the sum",c[-2:]
print "PythonSum elapsed time in microseconds", delta.microseconds
start = datetime.now()
c=numpysum(size)
delta = datetime.now()-start
print "The last 2 elements of the sum",c[-2:]
print "NumPySum elapsed time in microseconds", delta.microseconds

The results turn negative when size >= 1291. I'm working with Python 2.6, Mac OS X 10.6, and NumPy 1.5.0. Any ideas?

Hi - try to add float( ) around your mathematical computations, so that i**2 becomes float(i**2)
– root-11, Commented Aug 27, 2012 at 8:50
numpy.arange(n) ** 2 is the problem. The Python code is fine. As numpy.arange() creates a vector, I can't use float() around it.
– Doc, Commented Aug 27, 2012 at 8:55
Guess I figured it out: a = numpy.arange(n, dtype=numpy.uint64) does the trick. It was the 32-bit integers that caused the faulty results. But then: why is it a problem in NumPy, but not in native Python 2.6?
– Doc, Commented Aug 27, 2012 at 9:06
Why: It's a platform problem, 64 vs 32 bit. The largest 32-bit integer is roughly 2**31, whereas the largest 64-bit integer is roughly 2**63. You can tick the answer below as correct now :-)
– root-11, Commented Aug 27, 2012 at 9:13

root-11 · Accepted Answer · 2012-08-27 14:57:05Z

1

Beginning Numpy 1.5 ?

The introductory example in "Time for Action - Adding Vectors" only gives correct results on a 64-bit platform, where NumPy's default integers are 64 bits wide. Otherwise it returns erroneous results such as:

The last 2 elements of the sum [-2143491644 -2143487647]

To solve this issue, make the exponent in the power operation a float (2. instead of 2), so that the whole computation is carried out in floating point. Result: NumPy is about a factor of 10 faster than plain Python:

$ python vectorsum.py 1000000
The last 2 elements of the sum [9.99995000008e+17, 9.99998000001e+17]
PythonSum elapsed time in microseconds 3 59013
The last 2 elements of the sum [ 9.99993999e+17 9.99996999e+17]
NumPySum elapsed time in microseconds 0 308598

The corrected example:

import sys
from datetime import datetime
import numpy

def numpysum(n):
    a = numpy.arange(n) ** 2.
    b = numpy.arange(n) ** 3.
    c = a + b
    return c

def pythonsum(n):
    a = range(n)
    b = range(n)
    c = []
    for i in range(len(a)):
        a[i] = i ** 2.     # notice the dot (!)
        b[i] = i ** 3.
        c.append(a[i] + b[i])
    return c

size = int(sys.argv[1])
start = datetime.now()
c = pythonsum(size)
delta = datetime.now() - start
print "The last 2 elements of the sum", c[-2:]
print "PythonSum elapsed time in microseconds", delta.seconds, delta.microseconds
start = datetime.now()
c = numpysum(size)
delta = datetime.now() - start
print "The last 2 elements of the sum", c[-2:]
print "NumPySum elapsed time in microseconds", delta.seconds, delta.microseconds

The code is also available on pastebin: http://paste.ubuntu.com/1169976/
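One caveat with the float-based approach is that a double carries only 53 bits of mantissa, so sums near 1e18 are rounded rather than exact. The sketch below is a plain-Python check, not part of the book's example; it assumes the last index from the run above (i = 999999), and the names exact, approx and error are only for illustration.

```python
# Compare the exact integer result with the float-based one for i = 999999.
i = 999999

exact = i ** 2 + i ** 3        # Python integers grow as needed, so this is exact
approx = i ** 2. + i ** 3.     # a float exponent forces floating-point arithmetic

# Doubles near 1e18 are spaced 128 apart, so the float result is rounded.
error = int(approx) - exact

print(exact)
print(repr(approx))
print(error)
```

The exact value is 999998000001000000, and the float result differs from it by a small nonzero amount. Whether that matters depends on what the numbers are used for.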

edited Aug 27, 2012 at 14:57

answered Aug 27, 2012 at 9:12

root-11


3 Comments

Pierre GM Over a year ago

Would you be so kind as to format the code in your example? Thanks a lot in advance.

Doc Over a year ago

OK, of course it's a 32-bit/64-bit problem. But why is this affecting NumPy and NOT native Python in the same script? I assumed that both Python AND NumPy are 32-bit. Sorry to bother you again... (so... you know the book? :-)

root-11 Over a year ago

@Doc :-) Yep, I am reviewing the book. I think you've got to turn your interpretation of the problem on its head: NumPy is executed in 32-bit Python without checking which Python version you are running; so when you ask NumPy to solve a problem, NumPy will in turn ask Python to solve it based on NumPy's functions. However, unless a developer of the NumPy package explicitly double-checks when to add a floating-point interpretation, Python returns the wrong answer to NumPy. Don't worry, you will get used to spotting these little things :-)

DSM · Accepted Answer · 2012-08-27 15:25:42Z

I think there's some confusion in this thread. The reason that the pure-Python, i.e. non-numpy, code works doesn't have anything to do with 32-bit vs 64-bit. It will work correctly on either: Python ints can be of arbitrary size. [There's a bit of an implementation detail in the background involving whether it calls something an int or a long but you don't have to worry about it, the conversion is seamless. That's why sometimes you'll see L at the end of a number.]

For example:

>>> 2**100
1267650600228229401496703205376L

On the other hand, numpy integer dtypes are fixed-precision, and will always fail for large enough numbers, regardless of how wide:

>>> for kind in numpy.int8, numpy.int16, numpy.int32, numpy.int64:
...     for power in 1, 2, 5, 20:
...         print kind, power, kind(10), kind(10)**power
... 
<type 'numpy.int8'> 1 10 10
<type 'numpy.int8'> 2 10 100
<type 'numpy.int8'> 5 10 100000
<type 'numpy.int8'> 20 10 -2147483648
<type 'numpy.int16'> 1 10 10
<type 'numpy.int16'> 2 10 100
<type 'numpy.int16'> 5 10 100000
<type 'numpy.int16'> 20 10 -2147483648
<type 'numpy.int32'> 1 10 10
<type 'numpy.int32'> 2 10 100
<type 'numpy.int32'> 5 10 100000
<type 'numpy.int32'> 20 10 1661992960
<type 'numpy.int64'> 1 10 10
<type 'numpy.int64'> 2 10 100
<type 'numpy.int64'> 5 10 100000
<type 'numpy.int64'> 20 10 7766279631452241920
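The threshold reported in the question (size >= 1291) can be reproduced with plain Python integers. This is a rough sketch that assumes NumPy's default integer on the asker's platform was a signed 32-bit type, as the comments under the question suggest, and that overflow wraps in two's complement. The helper names first_overflow_index and wrap_int32 are made up for this illustration.

```python
INT32_MAX = 2 ** 31 - 1  # largest value a signed 32-bit integer can hold


def first_overflow_index(limit=INT32_MAX):
    """Return the smallest i for which i**2 + i**3 exceeds limit."""
    i = 0
    while i ** 2 + i ** 3 <= limit:
        i += 1
    return i


def wrap_int32(value):
    """Reduce an exact integer to what two's-complement 32-bit storage keeps."""
    return (value + 2 ** 31) % 2 ** 32 - 2 ** 31


i = first_overflow_index()
exact = i ** 2 + i ** 3
wrapped = wrap_int32(exact)
print(i)
print(exact)
print(wrapped)
```

The loop stops at i = 1290, which is the last index of numpy.arange(1291). The exact sum there is 2148353100, just above 2**31 - 1 = 2147483647, and it wraps to -2146614196 in 32 bits.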

You can get the same results from numpy as from pure Python by telling it to use the Python type, i.e. dtype=object, albeit at a significant performance hit:

>>> import numpy
>>> numpy.array([10])
array([10])
>>> numpy.array([10])**100
__main__:1: RuntimeWarning: invalid value encountered in power
array([-2147483648])
>>> numpy.array([10], dtype=object)
array([10], dtype=object)
>>> numpy.array([10], dtype=object)**100
array([ 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000], dtype=object)
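A cheaper middle ground, along the lines of what Doc found in the comments, is to request a wider integer dtype explicitly. The sketch below assumes numpy.int64 is wide enough for the sizes involved. The function name numpysum64 is hypothetical and the code is written for Python 3.

```python
import numpy


def numpysum64(n):
    """Sum of squares and cubes using an explicit 64-bit integer dtype."""
    base = numpy.arange(n, dtype=numpy.int64)
    return base ** 2 + base ** 3


n = 2000
result = numpysum64(n)
expected = [i ** 2 + i ** 3 for i in range(n)]  # exact Python integers

print(result[-2:])
print(numpy.iinfo(numpy.int64).max)  # upper bound of the chosen dtype
```

This only moves the limit rather than removing it: int64 overflows in the same way once i**3 exceeds 2**63 - 1, which happens for n a little above two million.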
