0

Currently I am iterating over one array and for each value in this array I am looking for the closest value at the corresponding point in another array that is within a region surrounding the corresponding point.

In summary: For any point in an array, how far away from a corresponding point in another array do you need to go to get the same value.

The code seems to work well for small arrays, however I am working now with 1024x768 arrays, leading me to wait a long time for each run....

Any help or advice would be greatly appreciated as I have been on this for a while!!

Example matrix in format Im using: np.array[[1,2],[3,4]]

#Distance to agreement

#Used later to define a region of pixels around a corresponding point
#to iterate over:
DTA = 26

#To account for noise in pixels - doesnt have to find the exact value,   
#just one within +/-130 of it.
limit = 130

#Containers for all pixel value matches and also the smallest distance
#to pixel match
Dist = []
Dist_min = []   

#Continer matrix for gamma pass/fail values
Dist_to_agree = np.zeros((i_size,j_size))

#i,j indexes the reference matrix (x), ii,jj indexes the measured     
#matrix(y). Finds a match within the limits, 
#appends the distance to the match into Dist.
#Then find the minimum distance to a match for that pixel and append it 
#to dist_min


for i, k in enumerate(x):
    for j, l in enumerate(k):
#added 10 packing to y matrix, so need to shift it by 10 in i&j
        for ii in range((i+10)-DTA,(i+10)+DTA):  
            for jj in range((j+10)-DTA,(j+10)+DTA):


#If the pixel value is within a range to account for noise, 
#let it be "found"
                if (y[ii,jj]-limit) <= x[i,j] <= (y[ii,jj]+limit):
                    #Calculating distance
                    dist_eu = sqrt(((i)-(ii))**2 + ((j) - (jj))**2)
                    Dist.append(dist_eu)

#If a value cannot be found within the noise range, 
#append 10 = instant fail. 
                else:
                    Dist.append(10)
        try:
            Dist_min.append(min(Dist))
            Dist_to_agree[i,j] = min(Dist)
        except ValueError:
            pass

        #Need to reset container or previous values will also be
        #accounted for when finding minimum
        Dist = []

print Dist_to_agree
2
  • you seem to have an indentation error on the for jj in range(...) loop. Often this happens on Stackoverflow when copy/pasting code, and sometimes, the copy/paste renders that way here because you're mixing spaces and tabs in the source -- You might want to check :-) Commented Mar 23, 2015 at 17:09
  • For a start, try to evaluate the whole ii,jj window with numpy array operations, rather than point by point. Treat ii as vector of values. Commented Mar 23, 2015 at 18:07

4 Answers 4

1

First, you are getting the elements of x in k and l, but then throwing that away and indexing x again. So in place of x[i,j], you could just use l, which would be much faster (although l isn't a very meaningful name, something like xi and xij might be better).

Second, you are recomputing y[ii,jj]-limit and y[ii,jj]+limitevery time. If you have enough memory, you can-precomputer these:ym = y-limitandyp = y+limit`.

Third, appending to a list is slower than creating an array and setting the values for long lists vs. long arrays. You can also skip the entire else clause by pre-setting the default value.

Fourth, you are computing min(dist) twice, and further may be using the python version rather than the numpy version, the latter being faster for arrays (which is another reason to make dist and array).

However, the biggest speedup would be to vectorize the inner two loops. Here is my tests, with x=np.random.random((10,10)) and y=np.random.random((100,100)):

Your version takes 623 ms.

Here is my version, which takes 7.6 ms:

dta = 26
limit = 130

dist_to_agree = np.zeros_like(x)
dist_min = []

ym = y-limit
yp = y+limit
for i, xi in enumerate(x):
    irange = (i-np.arange(i+10-dta, i+10+dta))**2
    if not irange.size:
        continue

    ymi = ym[i+10-dta:i+10+dta, :]
    ypi = yp[i+10-dta:i+10+dta, :]

    for j, xij in enumerate(xi):
        jrange = (j-np.arange(j+10-dta, j+10+dta))**2
        if not jrange.size:
            continue

        ymij = ymi[:, j+10-dta:j+10+dta]
        ypij = ypi[:, j+10-dta:j+10+dta]

        imesh, jmesh = np.meshgrid(irange, jrange, indexing='ij')
        dist = np.sqrt(imesh+jmesh)
        dist[ymij > xij  or xij < ypij]  = 10

        mindist = dist.min()
        dist_min.append(mindist)
        dist_to_agree[i,j] = mindist

print(dist_to_agree)
Sign up to request clarification or add additional context in comments.

5 Comments

Also, I have been struggling and unable to get around this error: "ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()" ...for this line: dist[ymij > xij or xij < ypij] = 10. Would we have different versions of python?
Thanks for the help, I have learned a lot and it does run a lot faster now. However, the answer it gives not quite right, Do i understand correctly that these two lines take (i+/-dta), but the full rest of the line along (j)?: ymi = ym[i+10-dta:i+10+dta, :] ypi = yp[i+10-dta:i+10+dta,:] In a nutshell, i'm trying to take each [i,j] value in x, and look for where the closest value around y[i,j] in 2D the same value occurs. In an adjustable 2D distance from y[i,j]. Any advice on the code would be amazing! Sorry for the trouble in advance!
Yes, those lines do what you said, but these lines then do the same thing along j: ymij = ymi[:, j+10-dta:j+10+dta] ypij = ypi[:, j+10-dta:j+10+dta]. The combined result is the same as doing ymij = ymi[i+10-dta:i+10+dta, j+10-dta:j+10+dta] ypij = ypi[i+10-dta:i+10+dta, j+10-dta:j+10+dta]
I think I found the problem, try it now.
Finally worked through it all and got my version to work the way I wanted. Thanks for the support, I definitely learned a lot!
1

@Ciaran Meshgrid is kinda a vectorized equivalent of two nested loops. Below are two equivalent ways of calculating the dist. One with loops and one with meshgrid+numpy vector operations. The second one is six times faster.

DTA=5
i=100
j=200

 def func1():
        dist1=np.zeros((DTA*2,DTA*2))
        for ii in range((i+10)-DTA,(i+10)+DTA):  
            for jj in range((j+10)-DTA,(j+10)+DTA):
                dist1[ii-((i+10)-DTA),jj-((j+10)-DTA)] =sqrt(((i)-(ii))**2 + ((j) - (jj))**2)
        return dist1

    def func2():
        dist2=np.zeros((DTA*2,DTA*2))
        ii, jj = meshgrid(np.arange((i+10)-DTA,(i+10)+DTA), 
                          np.arange((j+10)-DTA,(j+10)+DTA))
        dist2=np.sqrt((i-ii)**2+(j-jj)**2)
        return dist2

This is how ii and jj matrices look after meshgrid operation

ii=
 [[105 106 107 108 109 110 111 112 113 114]
 [105 106 107 108 109 110 111 112 113 114]
 [105 106 107 108 109 110 111 112 113 114]
 [105 106 107 108 109 110 111 112 113 114]
 [105 106 107 108 109 110 111 112 113 114]
 [105 106 107 108 109 110 111 112 113 114]
 [105 106 107 108 109 110 111 112 113 114]
 [105 106 107 108 109 110 111 112 113 114]
 [105 106 107 108 109 110 111 112 113 114]
 [105 106 107 108 109 110 111 112 113 114]]
jj= 
 [[205 205 205 205 205 205 205 205 205 205]
 [206 206 206 206 206 206 206 206 206 206]
 [207 207 207 207 207 207 207 207 207 207]
 [208 208 208 208 208 208 208 208 208 208]
 [209 209 209 209 209 209 209 209 209 209]
 [210 210 210 210 210 210 210 210 210 210]
 [211 211 211 211 211 211 211 211 211 211]
 [212 212 212 212 212 212 212 212 212 212]
 [213 213 213 213 213 213 213 213 213 213]
 [214 214 214 214 214 214 214 214 214 214]]

1 Comment

That is actually very very useful for what I am doing, thanks for the explanation!
0

for loops are very slow in pure python and you have four nested loops which will be very slow. Cython does wonders to the for loop speed. You can also try vectorization. While I'm not sure I fully understand what you are trying to do, you may try to vectorize at last some of the operations. Especially the last two loops.

So instead of two ii,jj cycles over

y[ii,jj]-limit) <= x[i,j] <= (y[ii,jj]+limit)

you can do something like

ii, jj = meshgrid(np.arange((i+10)-DTA,(i+10)+DTA), np.arange((j+10)-DTA,(j+10)+DTA))
t=(y[(i+10)-DTA,(i+10)+DTA]-limit>=x[i,j]) & (y[(i+10)-DTA,(i+10)+DTA]+limit<=x[i,j])
Dist=np.sqrt((i-ii)**2)+(j-jj)**2))

np.min(Dist[t]) will have your minimum distance for element i,j

2 Comments

Could you break up the long meshgrid expression so it is more readable?
Thanks for the comment, I have been trying to see what your doing with meshgrid, any further breakdown of it would be great!
-1

The numbapro compiler offers gpu Acceleration. Unfortunately it isn't free.

http://docs.continuum.io/numbapro/

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.