
I need to compute the min, max, and mean from a specific list of faces/vertices. I tried to optimize this computation with NumPy, but without success.

Here is my test case:

#!/usr/bin/python
# -*- coding: iso-8859-15 -*-
'''
Module started 22 Feb. 2013
@note: test case comparing numpy vs python
@author: Python4D/damien
'''

import numpy as np
import time


def Fnumpy(vertices):
  # build an ndarray from the nested lists, then reduce each coordinate column
  np_vertices=np.array(vertices)
  _x=np_vertices[:,:,0]
  _y=np_vertices[:,:,1]
  _z=np_vertices[:,:,2]
  _min=[np.min(_x),np.min(_y),np.min(_z)]
  _max=[np.max(_x),np.max(_y),np.max(_z)]
  _mean=[np.mean(_x),np.mean(_y),np.mean(_z)]
  return _mean,_max,_min

def Fpython(vertices):
  # flatten the nested lists per coordinate, then use the built-in reductions
  list_x=[item[0] for sublist in vertices for item in sublist]
  list_y=[item[1] for sublist in vertices for item in sublist]
  list_z=[item[2] for sublist in vertices for item in sublist]
  taille=len(list_x)
  _mean=[sum(list_x)/taille,sum(list_y)/taille,sum(list_z)/taille]
  _max=[max(list_x),max(list_y),max(list_z)]
  _min=[min(list_x),min(list_y),min(list_z)]    
  return _mean,_max,_min

if __name__=="__main__":
  vertices=[[[1.1,2.2,3.3,4.4]]*4]*1000000
  _t=time.clock()
  print ">>NUMPY >>{} for {}s.".format(Fnumpy(vertices),time.clock()-_t)
  _t=time.clock()
  print ">>PYTHON>>{} for {}s.".format(Fpython(vertices),time.clock()-_t)

The results are:

Numpy:

([1.1000000000452519, 2.2000000000905038, 3.3000000001880174], [1.1000000000000001, 2.2000000000000002, 3.2999999999999998], [1.1000000000000001, 2.2000000000000002, 3.2999999999999998]) for 27.327068618s.

Python:

([1.100000000045252, 2.200000000090504, 3.3000000001880174], [1.1, 2.2, 3.3], [1.1, 2.2, 3.3]) for 1.81366938593s.

Pure Python is 15x faster than Numpy!

  • This line is slow: np_vertices=np.array(vertices). You're not really timing the min and max functions, you're timing how long it takes to sort out the nested references. Commented Feb 22, 2013 at 14:36
  • You should edit your question to make what I think is your implied question, "can I make this numpy code faster?", explicit to fend off the close votes. Commented Feb 22, 2013 at 14:38
  • By using numpy constructs only (also for building vertices), you can speed up your code considerably. Commented Feb 24, 2013 at 7:08

2 Answers


The reason your Fnumpy is slower is that it contains an additional step not done by Fpython: the creation of a NumPy array in memory. If you move the line np_vertices=np.array(vertices) outside of Fnumpy and the timed section, your results will be very different:

>>NUMPY >>([1.1000000000452519, 2.2000000000905038, 3.3000000001880174], [1.1000000000000001, 2.2000000000000002, 3.2999999999999998], [1.1000000000000001, 2.2000000000000002, 3.2999999999999998]) for 0.500802s.
>>PYTHON>>([1.100000000045252, 2.200000000090504, 3.3000000001880174], [1.1, 2.2, 3.3], [1.1, 2.2, 3.3]) for 2.182239s.
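For reference, this is roughly what that change looks like, reusing vertices, np, and time from the question's script (a sketch only; Fnumpy_pre is a hypothetical variant of Fnumpy that expects an already-built array):

def Fnumpy_pre(np_vertices):
  # identical to Fnumpy, minus the list-to-array conversion
  _x=np_vertices[:,:,0]
  _y=np_vertices[:,:,1]
  _z=np_vertices[:,:,2]
  _min=[np.min(_x),np.min(_y),np.min(_z)]
  _max=[np.max(_x),np.max(_y),np.max(_z)]
  _mean=[np.mean(_x),np.mean(_y),np.mean(_z)]
  return _mean,_max,_min

np_vertices=np.array(vertices)  # done once, outside the timed section
_t=time.clock()
print ">>NUMPY >>{} for {}s.".format(Fnumpy_pre(np_vertices),time.clock()-_t)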

You can also speed up the allocation step significantly by providing a datatype hint to NumPy when you create the array. If you tell NumPy you have an array of floats, then even if you leave the np.array() call in the timed section, it will beat the pure Python version.

If I change np_vertices=np.array(vertices) to np_vertices=np.array(vertices, dtype=np.float_) and keep it in Fnumpy, the Fnumpy version will beat Fpython even though it has to do a lot more work:

>>NUMPY >>([1.1000000000452519, 2.2000000000905038, 3.3000000001880174], [1.1000000000000001, 2.2000000000000002, 3.2999999999999998], [1.1000000000000001, 2.2000000000000002, 3.2999999999999998]) for 1.586066s.
>>PYTHON>>([1.100000000045252, 2.200000000090504, 3.3000000001880174], [1.1, 2.2, 3.3], [1.1, 2.2, 3.3]) for 2.196787s.
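To see the effect of the dtype hint in isolation, here is a small standalone timing of just the conversion step (a sketch under the same Python 2 / time.clock setup as the question, with a 10x smaller data set):

import numpy as np
import time

vertices=[[[1.1,2.2,3.3,4.4]]*4]*100000

_t=time.clock()
a=np.array(vertices)                    # element type inferred while converting
print "no dtype hint : {}s".format(time.clock()-_t)

_t=time.clock()
b=np.array(vertices,dtype=np.float_)    # element type known up front
print "dtype=np.float_: {}s".format(time.clock()-_t)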

3 Comments

  • I tried np_vertices=np.array(vertices, dtype=np.float_) and np_vertices=np.array(vertices, dtype=np.half), but no improvement... >>NUMPY >>([inf, inf, inf], [1.0996, 2.1992, 3.3008], [1.0996, 2.1992, 3.3008]) for 27.5570968929s. >>PYTHON>>([1.100000000045252, 2.200000000090504, 3.3000000001880174], [1.1, 2.2, 3.3], [1.1, 2.2, 3.3]) for 1.80307082548s.
  • Are you sure? Because as you can see from my results, I saw a huge improvement. Using numpy 1.5.1 with Python 2.7.1, if that matters.
  • The bigger point here, though, is that your numpy arrays should be created/allocated once and reused as much as possible rather than recreated inside of computation functions. Memory allocation takes a long time too and should be considered in any algorithm; it's not just computation that limits program speed.

As already pointed out by others, your problem is the conversion from list to array. By using the appropriate numpy functions for that, you will beat Python. I modified the main part of your program:

if __name__=="__main__":
  _t = time.clock()
  vertices_np = np.resize(np.array([ 1.1, 2.2, 3.3, 4.4 ], dtype=np.float64), 
                          (1000000, 4, 4))
  print "Creating numpy vertices: {}".format(time.clock() - _t)
  _t = time.clock()
  vertices=[[[1.1,2.2,3.3,4.4]]*4]*1000000
  print "Creating python vertices: {}".format(time.clock() - _t)
  _t=time.clock()
  print ">>NUMPY >>{} for {}s.".format(Fnumpy(vertices_np),time.clock()-_t)
  _t=time.clock()
  print ">>PYTHON>>{} for {}s.".format(Fpython(vertices),time.clock()-_t)

Running your code with the modified main part gives, on my machine:

Creating numpy vertices: 0.6
Creating python vertices: 0.01
>>NUMPY >>([1.1000000000452519, 2.2000000000905038, 3.3000000001880174], 
[1.1000000000000001, 2.2000000000000002, 3.2999999999999998], [1.1000000000000001, 
2.2000000000000002, 3.2999999999999998]) for 0.5s.
>>PYTHON>>([1.100000000045252, 2.200000000090504, 3.3000000001880174], [1.1, 2.2, 3.3], 
[1.1, 2.2, 3.3]) for 1.91s.

Although creating the array still takes somewhat longer with NumPy tools than building the nested lists with Python's list multiplication operator (0.6s versus 0.01s), you gain a factor of about 4 for the run-time relevant part of your code. If I replace the line:

np_vertices=np.array(vertices)

with

np_vertices = np.asarray(vertices)

to avoid copying a big array, the running time of the numpy function even goes down to 0.37s on my machine, more than 5 times faster than the pure Python version.
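That difference is easy to check directly: np.array copies its input by default, while np.asarray returns the input unchanged when it is already an ndarray of a suitable dtype. A minimal sanity check (not from the answer):

import numpy as np

a=np.zeros((1000,4,4),dtype=np.float64)
b=np.array(a)    # always makes a copy
c=np.asarray(a)  # no copy: the same object comes back
print b is a     # False
print c is a     # True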

In your real code, if you know the number of vertices in advance, you can preallocate the appropriate array via np.empty(), fill it with your data, and pass it to the NumPy version of your function.
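A sketch of that pattern, assuming the faces arrive one 4x4 block at a time (load_face() is a hypothetical placeholder for however the real code produces each face):

import numpy as np

n_faces=1000000                                        # known in advance
vertices_np=np.empty((n_faces,4,4),dtype=np.float64)   # preallocate once
for i in xrange(n_faces):
  vertices_np[i]=load_face(i)                          # hypothetical data source
_mean,_max,_min=Fnumpy(vertices_np)                    # with the asarray variant, no extra copy is made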

