Creating a list of nearest neighbors using numpy array

Question

Based on a cutoff distance I want to create a list of points which are neighbors for a given point. I'm stuck how to add the ids of the points to a numpy array.

I read the 'coordinate' in a numpy array like below:

ID x y z

1. 1 1 0.000000 3.078000 0.554000
2. 2 0.000000 3.078000 2.170000
3. 3 2.666000 1.539000 0.554000
4. 4 2.666000 1.539000 2.170000
5. 5 1.774000 0.000000 3.324000
6. 6 2.712000 0.000000 7.207000

and so on...

I want the output list like this

ID neighbor1 neighbor2 ....neighbor5

1. 1 4 5 6 8 
2. 2 1 4 6 n/a
3. 3 5 1 n/a 2
4. etc.

so each position (1,2,3,4...so on) can have a maximum of 5 neighbors, if less than 5 I want to put n/a there. I have the following code so far.

#neighborlist

flist = glob.glob(filename)

for f in flist:
    load = np.loadtxt(f, usecols=(2,3,4))
    coordinate=np.array(load)

s = (len(coordinate),6)

Nlist = np.zeros(s)

rcut = 2.5 #cutoff distance for neighbors

for f in flist:
    load = np.loadtxt(f, usecols=(2,3,4))
    coordinate=np.array(load)

for i in range(len(coordinate)):
    idx = [i,]

    for j in range(len(coordinate)):
        if np.linalg.norm(coordinate[i]-coordinate[j]) < rcut and np.linalg.norm(coordinate[i]-coordinate[j]) > 0.1:
            idx.append(j)            
        else: 
            idx = idx
    while len(idx)<6:
       idx.append('n/a')
    print idx 
    Nlist[i,:]=idx

    print Nlist

for this i'm getting:

ValueError: could not convert string to float: n/a

as I have fixed the Nlist array size to (len(data),6) it cannot copy Nlist[i,:]=idx when the neighbors are less than 5 (the first element is the point itself). In this case is there a way to declare Nlist dynamically or make its size variable? I know its a tough question but there has to be a way around this problem. Thanks in advance.

Fabian Rost · Accepted Answer · 2015-08-14 12:34:52Z

It could be done like that:

import numpy as np
from collections import defaultdict

coordinate = np.array([[0.   ,  3.078,  0.554],
      [ 0.   ,  3.078,  2.17 ],
      [ 2.666,  1.539,  0.554],
      [ 2.666,  1.539,  2.17 ],
      [ 1.774,  0.   ,  3.324],
      [ 2.712,  0.   ,  7.207]])

rcut = 2.5
neighbors = defaultdict(list)

for ID1, ID2 in itertools.permutations(range(len(coordinate)), 2):
    if np.linalg.norm(coordinate[ID1]-coordinate[ID2]) < rcut:
        neighbors[ID1].append(ID2)
    else:
        neighbors[ID1].append('n/a')

for ID in neighbors:
    print ID, neighbors[ID]

Output:

0 [1, 'n/a', 'n/a', 'n/a', 'n/a']
1 [0, 'n/a', 'n/a', 'n/a', 'n/a']
2 ['n/a', 'n/a', 3, 'n/a', 'n/a']
3 ['n/a', 'n/a', 2, 4, 'n/a']
4 ['n/a', 'n/a', 'n/a', 3, 'n/a']
5 ['n/a', 'n/a', 'n/a', 'n/a', 'n/a']

Note that I used a dictionary for the output, which in my opinion makes more sense than a list. As your code does not read the IDs, but takes the line number as ID I also did it like that. As numpy starts numbering at 0, the IDs start at 0.

Please also note that your code would only evaluate the last file in the file list, because you overwrite the coordinates array for each file you read.

Why do you want to store the 'n/a' in your output and not simply have a list of varying length for different IDs?

Also note, that the code I suggested is not very efficient. You can use algorithms like cell lists instead to speed things up.

If you want to find the neighbours efficiently you might also be interested in the scipy.spatial module which can do that:

from scipy import spatial
kdtree = sp.spatial.KDTree(coordinate)
kdtree.query_pairs(rcut)

Output:

{(0, 1), (2, 3), (3, 4)}

This gives you all pairs of points which have a distance less than rcut. This is the essential information you need to build the output list you want.

Collectives™ on Stack Overflow

Creating a list of nearest neighbors using numpy array

1 Answer 1

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Your Answer

Sign up or log in

Post as a guest

Related