I've been writing MATLAB code for many years and recently started writing in Python. Let me try to explain the problem I am facing:
Part of my code associates cells of a large array, say an image of size 1080x1400 for the sake of the example, with cells of a smaller array, a grid of size 770x770. The large array may map to the whole grid or only to a section of it, so a large number of cells in the large array can be associated with the same cell in the small array. I have written two versions of the code, one in MATLAB and one in Python.
For some reason, the MATLAB code runs in an average of 41 ms, while the Python code runs in an average of 4.1 s in PyCharm (both averaged over 100 runs). Is there anything I can do to substantially improve NumPy's performance?
Although I always try to write in vectorized form, in this case the code is written with a for loop, which I think is appropriate here.
Thanks
Links to Example Input Data:
Matlab Code:
%%
clear;clc;
InputCoord = readmatrix('InputCoord.csv');
%%
Wx = InputCoord(:,3)' + 1;   % +1 converts the 0-based CSV coordinates to MATLAB's 1-based indexing
Wy = InputCoord(:,4)' + 1;
OutMtx = zeros(770,770);
%%
fp_Row = InputCoord(:,1)' + 1;
fp_Col = InputCoord(:,2)' + 1;
DataMtx = single(imread('DataMtx.tif'))./255;   % normalize 8-bit image to [0,1]
%%
number_of_times = 100;
t_stop = zeros(number_of_times,1);
for jj = 1:number_of_times
    N = 1;
    t_start = tic;
    for ii = 1:size(Wx,2)
        Wx_ind = Wx(ii);
        Wy_ind = Wy(ii);
        fp_Row_ind = fp_Row(ii);
        fp_Col_ind = fp_Col(ii);
        % reset the running-mean counter whenever the target cell changes
        if ii>1 && (Wx(ii)~=Wx(ii-1) || Wy(ii)~=Wy(ii-1))
            N = 1;
        end
        % running mean of all DataMtx samples mapped to the same OutMtx cell
        OutMtx(Wx_ind, Wy_ind) = ((N-1)*OutMtx(Wx_ind, Wy_ind) + DataMtx(fp_Row_ind, fp_Col_ind))/N;
        N = N + 1;
    end
    t_stop(jj) = toc(t_start);
end
Python Code:
import numpy as np
import cv2
import time
InputCoord = np.genfromtxt('InputCoord.csv', delimiter=',')
number_of_coords = np.shape(InputCoord)[0]
# coordinates in the CSV are already 0-based, so no +1 as in the MATLAB version
Wx = InputCoord[:, 2].astype(dtype=np.int32).reshape((1, number_of_coords))
Wy = InputCoord[:, 3].astype(dtype=np.int32).reshape((1, number_of_coords))
OutMtx = np.zeros((770, 770))
fp_Row = InputCoord[:, 0].astype(dtype=np.int32).reshape((1, number_of_coords))
fp_Col = InputCoord[:, 1].astype(dtype=np.int32).reshape((1, number_of_coords))
DataMtx = cv2.imread('DataMtx.tif', -1).astype(dtype=np.float32) / 255  # -1 = IMREAD_UNCHANGED
# print(f' DataMtx flags:{DataMtx.flags}')
DataMtxf = np.asarray(DataMtx, order='F')  # Fortran-order copy, to mimic MATLAB's memory layout
number_of_times = 100
t_stop = np.zeros((1, number_of_times))
for jj in range(number_of_times):
    t_start = time.time()
    N = 1
    for ii in range(number_of_coords):
        Wx_ind = Wx[0, ii]
        Wy_ind = Wy[0, ii]
        fp_Row_ind = fp_Row[0, ii]
        fp_Col_ind = fp_Col[0, ii]
        # reset the running-mean counter whenever the target cell changes;
        # ii > 0 (not ii > 1) because Python indices start at 0, MATLAB's at 1
        if (ii > 0) and ((Wx[0, ii] != Wx[0, ii - 1]) or (Wy[0, ii] != Wy[0, ii - 1])):
            N = 1
        OutMtx[Wx_ind, Wy_ind] = ((N - 1) * OutMtx[Wx_ind, Wy_ind] + DataMtx[fp_Row_ind, fp_Col_ind]) / N
        N = N + 1
    t_stop[0, jj] = time.time() - t_start
print(f'mean update time = {np.mean(t_stop)}')
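For reference, the hot loop above can in principle be replaced by whole-array NumPy operations. The sketch below is an illustration, not the original method: it assumes, as the if test implies, that identical (Wx, Wy) pairs arrive in contiguous runs, in which case the running mean over a run reduces to the plain mean of that run's DataMtx samples (the names vals, new_run, run_id, run_means, and OutMtx2 are introduced here purely for illustration):
vals = DataMtx[fp_Row.ravel(), fp_Col.ravel()]  # gather all samples at once
wx, wy = Wx.ravel(), Wy.ravel()
# flag the start of each contiguous run of identical (wx, wy) pairs
new_run = np.ones(wx.shape, dtype=bool)
new_run[1:] = (wx[1:] != wx[:-1]) | (wy[1:] != wy[:-1])
run_id = np.cumsum(new_run) - 1  # run labels: 0,0,...,1,1,...,2,...
# per-run mean = per-run sum / per-run length
run_means = np.bincount(run_id, weights=vals) / np.bincount(run_id)
# scatter each run's mean into its target cell; with duplicate targets,
# NumPy assignment in practice keeps the last write, like the loop does
OutMtx2 = np.zeros((770, 770))
OutMtx2[wx[new_run], wy[new_run]] = run_means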
Comments:
- "… Wx, Wy, fp_Row, fp_Col, DataMtxf. Does your Python code do what you want?"
- "… the ii iteration. numpy does not do this. Vectorize where possible (like I did in MATLAB years ago), or use numba to create a compiled version."
- "… the ii loop, with minimal reproducible example values. In numpy, things like reshape((1, number_of_coords)) and Wx[0, ii] look like carryovers from MATLAB. They don't hurt performance, but they clutter the code. But the iterative nature of N may be the biggest obstacle to speeding up the code by using whole-array numpy operations ("vectorization"). I don't have a clear sense of what's happening with that."
- "… shared in links - the example data in your minimal reproducible example should be in the question; we should not have to get it from an offsite resource. I concur with @hpaulj's comments."
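Following the numba suggestion in the comments above, here is a minimal sketch of a compiled version of the same loop (it assumes numba is installed; update is a hypothetical helper name introduced here). @numba.njit compiles the function to machine code on its first call, which removes the per-element Python overhead while keeping the loop structure as-is:
import numba

@numba.njit
def update(OutMtx, DataMtx, Wx, Wy, fp_Row, fp_Col):
    # same running-mean loop as above: 1-D index arrays in, OutMtx updated in place
    N = 1
    for ii in range(Wx.shape[0]):
        if ii > 0 and (Wx[ii] != Wx[ii - 1] or Wy[ii] != Wy[ii - 1]):
            N = 1  # a new (Wx, Wy) run starts: reset the counter
        old = OutMtx[Wx[ii], Wy[ii]]
        OutMtx[Wx[ii], Wy[ii]] = ((N - 1) * old + DataMtx[fp_Row[ii], fp_Col[ii]]) / N
        N += 1

# usage with 1-D views of the arrays defined earlier:
# update(OutMtx, DataMtx, Wx.ravel(), Wy.ravel(), fp_Row.ravel(), fp_Col.ravel())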