Vectorization using numpy

Question

I translated this code from matlab.It is part of a larger body of code But I would like to get some advice on how to vectorize this section in order to make it faster. my major concern is with the for loops and if statements. If possible I would like to write it without using an if else statement. (Jax is not able to jit an if conditional). Thanks

import numpy as np

num_rows = 5
num_cols = 20
smf = np.array([np.inf, 0.1, 0.1, 0.1, 0.1])
par_init = np.array([1,2,3,4,5])
lb = np.array([0.1, 0.1, 0.1, 0.1, 0.1])
ub = np.array([10, 10, 10, 10, 10])
par = np.broadcast_to(par_init[:,None],(num_rows,num_cols))

print(par.shape)

par0_col = np.zeros(num_rows*num_cols - (num_cols-1) * np.sum(np.isinf(smf)))
lb_col = np.zeros(num_rows*num_cols - (num_cols-1) * np.sum(np.isinf(smf)))
ub_col = np.zeros(num_rows*num_cols- (num_cols-1) * np.sum(np.isinf(smf)))


# First looping
k = 0
for i in range(num_rows):
    smf_1 = smf.copy()    
    if  smf_1[i] == np.inf:
        
        
        par0_col[k] = par[i, 0]
        lb_col[k] = lb[i]
        ub_col[k] = ub[i]
        k = k+1
        
    else:
        par0_col[k:k+num_cols] = par[i, :num_cols]
        lb_col[k:k+num_cols] = lb[i]
        ub_col[k:k+num_cols] = ub[i]
        k = k+num_cols


arr_1 = np.zeros(shape = (num_rows, num_cols))
arr_2 = np.zeros(shape = (num_rows, num_cols))


par_log = np.log10((par0_col - lb_col) / (1 - par0_col / ub_col))

# second looping
k = 0
for i in range(num_rows):
    smf_1 = smf.copy()
    if np.isinf(smf_1[i]):
        smf_1[i] = 0
        arr_1[i, :] = (par_log[k])
        arr_2[i, :] = 10**par_log[k]
        k = k+1
    
    else:
        arr_1[i, :] = par_log[k:k+num_cols]
    
        arr_2[i, :] = 10**par_log[k:k+num_cols]
        
        
        k = k+num_cols

# print(arr_1)

# [[0.         0.         0.         0.         0.         0.
#   0.         0.         0.         0.         0.         0.
#   0.         0.         0.         0.         0.         0.
#   0.         0.        ]
#  [0.37566361 0.37566361 0.37566361 0.37566361 0.37566361 0.37566361
#   0.37566361 0.37566361 0.37566361 0.37566361 0.37566361 0.37566361
#   0.37566361 0.37566361 0.37566361 0.37566361 0.37566361 0.37566361
#   0.37566361 0.37566361]
#  [0.61729996 0.61729996 0.61729996 0.61729996 0.61729996 0.61729996
#   0.61729996 0.61729996 0.61729996 0.61729996 0.61729996 0.61729996
#   0.61729996 0.61729996 0.61729996 0.61729996 0.61729996 0.61729996
#   0.61729996 0.61729996]
#  [0.81291336 0.81291336 0.81291336 0.81291336 0.81291336 0.81291336
#   0.81291336 0.81291336 0.81291336 0.81291336 0.81291336 0.81291336
#   0.81291336 0.81291336 0.81291336 0.81291336 0.81291336 0.81291336
#   0.81291336 0.81291336]
#  [0.99122608 0.99122608 0.99122608 0.99122608 0.99122608 0.99122608
#   0.99122608 0.99122608 0.99122608 0.99122608 0.99122608 0.99122608
#   0.99122608 0.99122608 0.99122608 0.99122608 0.99122608 0.99122608
#   0.99122608 0.99122608]]

There are many SO numpy questions about 'vectorization' or 'removing for loops'. Old MATLAB code also required this, before you were able to cheat and let the language do that for you. — hpaulj
– hpaulj, Commented Feb 7, 2022 at 1:43
Yea I know there are many questions about vectorization but I couldn't find any specific answers that could help my case. — richinex
– richinex, Commented Feb 7, 2022 at 1:48
That's because there isn't one simple dropin fix. There are various techniques to operate on multiple i or k at once. You/we have to spend time understanding the probkem, and then visualize working with 'whole' arrays. — hpaulj
– hpaulj, Commented Feb 7, 2022 at 2:06
Agree there are many SO posts on vectorization. It would be good to update the title with more specifics on this particular problem — frederick-douglas-pearce
– frederick-douglas-pearce, Commented Feb 7, 2022 at 3:16

frederick-douglas-pearce · Accepted Answer · 2022-02-07 03:14:33Z

1

This is challenging to fully vectorize, but I assume it is possible. Here is a start that removes the if/else logic for the first loop. The trick is precomputing the k values (smf_1 is trivial so I removed it):

num_rows = 5
num_cols = 20
smf = np.array([np.inf, 0.1, 0.1, 0.1, 0.1])
par_init = np.array([1,2,3,4,5])
lb = np.array([0.1, 0.1, 0.1, 0.1, 0.1])
ub = np.array([10, 10, 10, 10, 10])
par = np.broadcast_to(par_init[:,None],(num_rows,num_cols))

kvals = np.where(np.isinf(smf), 1, num_cols)
kvals = np.insert(kvals, 0, 0)
kvals = np.cumsum(kvals)

par0_col = np.zeros(num_rows*num_cols - (num_cols-1) * np.sum(np.isinf(smf)))
lb_col = np.zeros(num_rows*num_cols - (num_cols-1) * np.sum(np.isinf(smf)))
ub_col = np.zeros(num_rows*num_cols- (num_cols-1) * np.sum(np.isinf(smf)))
# First looping
for i in range(num_rows):
    par0_col[kvals[i]:kvals[i+1]] = par[i, :kvals[i+1]-kvals[i]]
    lb_col[kvals[i]:kvals[i+1]] = lb[i]
    ub_col[kvals[i]:kvals[i+1]] = ub[i]

Assume this same pattern should work for the second loop but didn't look too closely. If it is possible to fully vectorize this (i.e. remove the for loop), then this is the path forward as I understand it.

answered Feb 7, 2022 at 3:14

frederick-douglas-pearce

5064 silver badges10 bronze badges

Sign up to request clarification or add additional context in comments.

1 Comment

richinex Over a year ago

Ahh. I get the logic. It works like a charm. I also extended it to the second loop. Gracias.

Collectives™ on Stack Overflow

Vectorization using numpy

1 Answer 1

1 Comment

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

1 Comment

Your Answer

Sign up or log in

Post as a guest

Related