I have a function that, for each row, sums the values above it, adding one more value at a time, until the sum reaches a given threshold. Once the threshold is reached, it uses the resulting slice index to return the mean of the same slice of another column.
import numpy as np

# Random data:
values = np.random.uniform(0, 10, 300000)
values2 = np.random.uniform(0, 10, 300000)
output = [0] * len(values)

# Function that operates on one single row and returns the mean
def function(threshold, row):
    slice_sum = 0
    i = 1
    # Grow the slice upwards one value at a time until its sum reaches the threshold
    while slice_sum < threshold:
        slice_sum = values[row-i:row].sum()
        i = i + 1
    # Note: i was incremented after the last sum, so this slice is one
    # element longer than the one that reached the threshold
    mean = values2[row-i:row].mean()
    return mean

# Loop to iterate the function row by row.
# Skip the first 15 values, otherwise the loop might get stuck. This issue is not a priority though.
for i in range(15, len(values)):
    output[i] = function(40, i)
This is a simplified version of the loop. It might not look slow, but on 300,000 rows it is very slow in practice. So I'm wondering if there's a faster way of achieving this without a for loop.
Thanks
output to any certain length. Just use output = [function(40, i) for i in range(15, len(values))].
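One possible way to avoid the Python-level loop, sketched here under some assumptions: because the values are non-negative, their cumulative sum never decreases, so the start of each slice can be found with a binary search (np.searchsorted) instead of growing the slice one element at a time. The function name trailing_threshold_mean is made up for this sketch. It averages exactly the slice that reached the threshold, as described in the question; the posted loop increments i once more before taking the mean, so its slice is one element longer. Rows with no qualifying slice get NaN.

```python
import numpy as np

def trailing_threshold_mean(values, values2, threshold):
    """For each row r, find the shortest slice values[a:r] whose sum is at
    least `threshold`, and return the mean of values2[a:r].
    Rows where no such slice exists get NaN.
    Assumes values >= 0 and threshold > 0 (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    values2 = np.asarray(values2, dtype=float)
    n = len(values)

    # Prefix sums with a leading 0, so sum(x[a:r]) == cs[r] - cs[a]
    cs = np.concatenate(([0.0], np.cumsum(values)))
    cs2 = np.concatenate(([0.0], np.cumsum(values2)))

    rows = np.arange(n)
    # We need the largest a with cs[a] <= cs[r] - threshold
    target = cs[rows] - threshold
    start = np.searchsorted(cs, target, side="right") - 1

    # start == -1 means even the full history above the row is below the threshold
    valid = start >= 0
    out = np.full(n, np.nan)
    r = rows[valid]
    a = start[valid]
    out[r] = (cs2[r] - cs2[a]) / (r - a)
    return out

# Usage with the same kind of random data as in the question
values = np.random.uniform(0, 10, 300000)
values2 = np.random.uniform(0, 10, 300000)
output = trailing_threshold_mean(values, values2, 40)
```

Since the sums come from differences of prefix sums, floating-point rounding could, in rare boundary cases, place a slice start one element away from what a direct slice sum would give. If the data can contain negative values, the cumulative sum is no longer monotonic and this approach does not apply.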