0

I translated this code from matlab.It is part of a larger body of code But I would like to get some advice on how to vectorize this section in order to make it faster. my major concern is with the for loops and if statements. If possible I would like to write it without using an if else statement. (Jax is not able to jit an if conditional). Thanks

import numpy as np

num_rows = 5
num_cols = 20
smf = np.array([np.inf, 0.1, 0.1, 0.1, 0.1])
par_init = np.array([1,2,3,4,5])
lb = np.array([0.1, 0.1, 0.1, 0.1, 0.1])
ub = np.array([10, 10, 10, 10, 10])
par = np.broadcast_to(par_init[:,None],(num_rows,num_cols))

print(par.shape)

par0_col = np.zeros(num_rows*num_cols - (num_cols-1) * np.sum(np.isinf(smf)))
lb_col = np.zeros(num_rows*num_cols - (num_cols-1) * np.sum(np.isinf(smf)))
ub_col = np.zeros(num_rows*num_cols- (num_cols-1) * np.sum(np.isinf(smf)))


# First looping
k = 0
for i in range(num_rows):
    smf_1 = smf.copy()    
    if  smf_1[i] == np.inf:
        
        
        par0_col[k] = par[i, 0]
        lb_col[k] = lb[i]
        ub_col[k] = ub[i]
        k = k+1
        
    else:
        par0_col[k:k+num_cols] = par[i, :num_cols]
        lb_col[k:k+num_cols] = lb[i]
        ub_col[k:k+num_cols] = ub[i]
        k = k+num_cols


arr_1 = np.zeros(shape = (num_rows, num_cols))
arr_2 = np.zeros(shape = (num_rows, num_cols))


par_log = np.log10((par0_col - lb_col) / (1 - par0_col / ub_col))

# second looping
k = 0
for i in range(num_rows):
    smf_1 = smf.copy()
    if np.isinf(smf_1[i]):
        smf_1[i] = 0
        arr_1[i, :] = (par_log[k])
        arr_2[i, :] = 10**par_log[k]
        k = k+1
    
    else:
        arr_1[i, :] = par_log[k:k+num_cols]
    
        arr_2[i, :] = 10**par_log[k:k+num_cols]
        
        
        k = k+num_cols

# print(arr_1)

# [[0.         0.         0.         0.         0.         0.
#   0.         0.         0.         0.         0.         0.
#   0.         0.         0.         0.         0.         0.
#   0.         0.        ]
#  [0.37566361 0.37566361 0.37566361 0.37566361 0.37566361 0.37566361
#   0.37566361 0.37566361 0.37566361 0.37566361 0.37566361 0.37566361
#   0.37566361 0.37566361 0.37566361 0.37566361 0.37566361 0.37566361
#   0.37566361 0.37566361]
#  [0.61729996 0.61729996 0.61729996 0.61729996 0.61729996 0.61729996
#   0.61729996 0.61729996 0.61729996 0.61729996 0.61729996 0.61729996
#   0.61729996 0.61729996 0.61729996 0.61729996 0.61729996 0.61729996
#   0.61729996 0.61729996]
#  [0.81291336 0.81291336 0.81291336 0.81291336 0.81291336 0.81291336
#   0.81291336 0.81291336 0.81291336 0.81291336 0.81291336 0.81291336
#   0.81291336 0.81291336 0.81291336 0.81291336 0.81291336 0.81291336
#   0.81291336 0.81291336]
#  [0.99122608 0.99122608 0.99122608 0.99122608 0.99122608 0.99122608
#   0.99122608 0.99122608 0.99122608 0.99122608 0.99122608 0.99122608
#   0.99122608 0.99122608 0.99122608 0.99122608 0.99122608 0.99122608
#   0.99122608 0.99122608]]

4
  • 1
    There are many SO numpy questions about 'vectorization' or 'removing for loops'. Old MATLAB code also required this, before you were able to cheat and let the language do that for you. Commented Feb 7, 2022 at 1:43
  • Yea I know there are many questions about vectorization but I couldn't find any specific answers that could help my case. Commented Feb 7, 2022 at 1:48
  • That's because there isn't one simple dropin fix. There are various techniques to operate on multiple i or k at once. You/we have to spend time understanding the probkem, and then visualize working with 'whole' arrays. Commented Feb 7, 2022 at 2:06
  • Agree there are many SO posts on vectorization. It would be good to update the title with more specifics on this particular problem Commented Feb 7, 2022 at 3:16

1 Answer 1

1

This is challenging to fully vectorize, but I assume it is possible. Here is a start that removes the if/else logic for the first loop. The trick is precomputing the k values (smf_1 is trivial so I removed it):

num_rows = 5
num_cols = 20
smf = np.array([np.inf, 0.1, 0.1, 0.1, 0.1])
par_init = np.array([1,2,3,4,5])
lb = np.array([0.1, 0.1, 0.1, 0.1, 0.1])
ub = np.array([10, 10, 10, 10, 10])
par = np.broadcast_to(par_init[:,None],(num_rows,num_cols))

kvals = np.where(np.isinf(smf), 1, num_cols)
kvals = np.insert(kvals, 0, 0)
kvals = np.cumsum(kvals)

par0_col = np.zeros(num_rows*num_cols - (num_cols-1) * np.sum(np.isinf(smf)))
lb_col = np.zeros(num_rows*num_cols - (num_cols-1) * np.sum(np.isinf(smf)))
ub_col = np.zeros(num_rows*num_cols- (num_cols-1) * np.sum(np.isinf(smf)))
# First looping
for i in range(num_rows):
    par0_col[kvals[i]:kvals[i+1]] = par[i, :kvals[i+1]-kvals[i]]
    lb_col[kvals[i]:kvals[i+1]] = lb[i]
    ub_col[kvals[i]:kvals[i+1]] = ub[i]

Assume this same pattern should work for the second loop but didn't look too closely. If it is possible to fully vectorize this (i.e. remove the for loop), then this is the path forward as I understand it.

Sign up to request clarification or add additional context in comments.

1 Comment

Ahh. I get the logic. It works like a charm. I also extended it to the second loop. Gracias.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.