
I'm completely new to NumPy and tried an example from a textbook. Unfortunately, beyond a certain input size, the NumPy results come out wrong. Here's the code:

import sys
from datetime import datetime
import numpy

def pythonsum(n):
    a = range(n)
    b = range(n)
    c = []
    for i in range(len(a)):
        a[i] = i**2
        b[i] = i**3
        c.append(a[i]+b[i])
    return c

def numpysum(n):
    a = numpy.arange(n) ** 2
    b = numpy.arange(n) ** 3
    c = a + b
    return c

size = int(sys.argv[1])
start = datetime.now()
c=pythonsum(size)
delta = datetime.now()-start
print "The last 2 elements of the sum",c[-2:]
print "PythonSum elapsed time in microseconds", delta.microseconds
start = datetime.now()
c=numpysum(size)
delta = datetime.now()-start
print "The last 2 elements of the sum",c[-2:]
print "NumPySum elapsed time in microseconds", delta.microseconds

The results become negative when size >= 1291. I'm working with Python 2.6, Mac OS X 10.6, NumPy 1.5.0. Any ideas?

  • Hi - try adding float() around your mathematical computations, so that i**2 becomes float(i**2). Commented Aug 27, 2012 at 8:50
  • numpy.arange(n) ** 2 is the problem. The Python code is fine. As numpy.arange() creates a vector, I can't use float() around it. Commented Aug 27, 2012 at 8:55
  • Guess I figured it out... a = numpy.arange(n, dtype=numpy.uint64) does the trick. It's the 32-bit integers that caused the faulty results. But then: why is it a problem in NumPy, but not in native Python 2.6? Commented Aug 27, 2012 at 9:06
  • Why: it's a platform problem, 64 vs 32 bit. The largest signed 32-bit integer is 2**31 - 1, whereas the largest signed 64-bit integer is 2**63 - 1. You can tick the answer below as correct now :-) Commented Aug 27, 2012 at 9:13
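
To see where the 32-bit wraparound described in these comments starts, here is a minimal sketch. It forces dtype=numpy.int32 so that it behaves the same on any platform; numpysum32 and exact_sum are names made up for this illustration and are not part of the original script. It is written to run under both Python 2.6+ and Python 3.

```python
import numpy

def numpysum32(n):
    # Same computation as numpysum, but with an explicit 32-bit dtype
    a = numpy.arange(n, dtype=numpy.int32) ** 2
    b = numpy.arange(n, dtype=numpy.int32) ** 3
    return a + b

def exact_sum(n):
    # Pure-Python ints never overflow, so this is the reference result
    return [i ** 2 + i ** 3 for i in range(n)]

n = 1300
wrapped = numpysum32(n)
exact = exact_sum(n)

# First index where the int32 result disagrees with the exact one
first_bad = next(i for i in range(n) if int(wrapped[i]) != exact[i])

print("first wrong index: %d" % first_bad)
print("exact value:       %d" % exact[first_bad])
print("int32 maximum:     %d" % numpy.iinfo(numpy.int32).max)
print("wrapped value:     %d" % wrapped[first_bad])
```

The first wrong index should be 1290, which is the last element for size = 1291 and so agrees with the threshold reported in the question.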

2 Answers


Beginning Numpy 1.5 ?

The introductory example in "Time for Action - Adding Vectors" only gives correct results where NumPy's default integer type is 64 bits wide (typically on a 64-bit platform). Otherwise it returns erroneous results:

The last 2 elements of the sum [-2143491644 -2143487647]

To solve this issue, make the exponent in the power operation a float, so that the computation is carried out in floating point. Result: a speed-up of about a factor of 10:

$ python vectorsum.py 1000000

The last 2 elements of the sum [9.99995000008e+17, 9.99998000001e+17]

PythonSum elapsed time in microseconds 3 59013

The last 2 elements of the sum [ 9.99993999e+17 9.99996999e+17]

NumPySum elapsed time in microseconds 0 308598

The corrected example:

import sys
from datetime import datetime
import numpy

def numpysum(n):
    a = numpy.arange(n) ** 2.
    b = numpy.arange(n) ** 3.
    c = a + b
    return c

def pythonsum(n):
    a = range(n)
    b = range(n)
    c = []
    for i in range(len(a)):
        a[i] = i ** 2.     # notice the dot (!)
        b[i] = i ** 3.
        c.append(a[i] + b[i])
    return c

size = int(sys.argv[1])

start = datetime.now()
c = pythonsum(size)
delta = datetime.now() - start
print "The last 2 elements of the sum", c[-2:]
print "PythonSum elapsed time in microseconds", delta.seconds, delta.microseconds

start = datetime.now()
c = numpysum(size)
delta = datetime.now() - start
print "The last 2 elements of the sum", c[-2:]
print "NumPySum elapsed time in microseconds", delta.seconds, delta.microseconds

The code is also available on pastebin: http://paste.ubuntu.com/1169976/
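
One caveat with the float approach: float64 carries only 53 bits of precision, so sums near 1e18 get rounded. The following is a hedged sketch, not part of the book's example, that compares the float result with an explicit dtype=numpy.int64, which stays exact here because the values remain below 2**63. The variable names are illustrative only.

```python
import numpy

n = 1000000
i = n - 1

# Reference value from Python's arbitrary-precision ints
exact = i ** 2 + i ** 3

# 64-bit integers: exact as long as the sum stays below 2**63
idx64 = numpy.arange(n, dtype=numpy.int64)
as_int64 = int((idx64 ** 2 + idx64 ** 3)[-1])

# Float exponents: computed in float64, so large values are rounded
idxf = numpy.arange(n)
as_float = int((idxf ** 2. + idxf ** 3.)[-1])

print("exact:         %d" % exact)
print("int64 result:  %d" % as_int64)
print("float result:  %d" % as_float)
print("float error:   %d" % (as_float - exact))
```

If exact integer results matter, an explicit 64-bit integer dtype is probably the safer choice for this input size; the float version trades exactness for range.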


3 Comments

Would you be so kind as to format the code in your example? Thanks a lot in advance.
OK, of course it's a 32-bit/64-bit problem. But why is this affecting NumPy and NOT native Python in the same script? I assumed that both Python AND NumPy are 32-bit. Sorry to bother you again... (So... you know the book? :-)
@Doc :-) Yep, I am reviewing the book. I think you've got to turn your interpretation of the problem on its head: NumPy is executed in 32-bit Python without checking which Python version you are running, so when you ask NumPy to solve a problem, NumPy will in turn ask Python to solve it based on NumPy's functions. However, unless a developer of the NumPy package explicitly double-checks when to add a floating-point interpretation, Python returns the wrong answer to NumPy. Don't worry, you will get used to spotting these little things :-)

I think there's some confusion in this thread. The reason that the pure-Python, i.e. non-numpy, code works doesn't have anything to do with 32-bit vs 64-bit. It will work correctly on either: Python ints can be of arbitrary size. [There's a bit of an implementation detail in the background involving whether it calls something an int or a long but you don't have to worry about it, the conversion is seamless. That's why sometimes you'll see L at the end of a number.]

For example:

>>> 2**100
1267650600228229401496703205376L

On the other hand, numpy integer dtypes are fixed-precision, and will always fail for large enough numbers, regardless of how wide:

>>> for kind in numpy.int8, numpy.int16, numpy.int32, numpy.int64:
...     for power in 1, 2, 5, 20:
...         print kind, power, kind(10), kind(10)**power
... 
<type 'numpy.int8'> 1 10 10
<type 'numpy.int8'> 2 10 100
<type 'numpy.int8'> 5 10 100000
<type 'numpy.int8'> 20 10 -2147483648
<type 'numpy.int16'> 1 10 10
<type 'numpy.int16'> 2 10 100
<type 'numpy.int16'> 5 10 100000
<type 'numpy.int16'> 20 10 -2147483648
<type 'numpy.int32'> 1 10 10
<type 'numpy.int32'> 2 10 100
<type 'numpy.int32'> 5 10 100000
<type 'numpy.int32'> 20 10 1661992960
<type 'numpy.int64'> 1 10 10
<type 'numpy.int64'> 2 10 100
<type 'numpy.int64'> 5 10 100000
<type 'numpy.int64'> 20 10 7766279631452241920
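
Since every fixed-width dtype has a hard limit, one way to avoid surprises is to check the limits up front with numpy.iinfo. A small sketch follows; fits is a hypothetical helper written for this illustration, not a NumPy function.

```python
import numpy

def fits(dtype, value):
    # True if the Python int `value` is representable in the integer dtype
    info = numpy.iinfo(dtype)
    return info.min <= value <= info.max

for kind in (numpy.int8, numpy.int16, numpy.int32, numpy.int64):
    info = numpy.iinfo(kind)
    print("%-6s min=%d max=%d holds 10**5: %s holds 10**20: %s" % (
        kind.__name__, info.min, info.max,
        fits(kind, 10 ** 5), fits(kind, 10 ** 20)))
```

None of the four types can hold 10**20, which is why even int64 returns a wrapped value in the table above.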

You can get the same results from numpy as from pure Python by telling it to use the Python type, i.e. dtype=object, albeit at a significant performance hit:

>>> import numpy
>>> numpy.array([10])
array([10])
>>> numpy.array([10])**100
__main__:1: RuntimeWarning: invalid value encountered in power
array([-2147483648])
>>> numpy.array([10], dtype=object)
array([10], dtype=object)
>>> numpy.array([10], dtype=object)**100
array([ 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000], dtype=object)
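
Applied to the sum from the question, the dtype=object route might look like the sketch below. The three sample indices are chosen for illustration: 1290 overflows int32, and 10**7 overflows even int64 once cubed.

```python
import numpy

# Object arrays store Python ints, so arithmetic never wraps around
idx = numpy.array([1290, 10 ** 6, 10 ** 7], dtype=object)
c = idx ** 2 + idx ** 3

# Reference values computed with plain Python ints
exact = [i ** 2 + i ** 3 for i in (1290, 10 ** 6, 10 ** 7)]

for got, want in zip(c, exact):
    print("%d matches exact: %s" % (got, got == want))
```

The price is that each element operation goes through the Python interpreter, so most of NumPy's speed advantage is lost.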
