0

Based on a cutoff distance I want to create a list of points which are neighbors for a given point. I'm stuck how to add the ids of the points to a numpy array.

I read the 'coordinate' in a numpy array like below:

ID x y z

1. 1 1 0.000000 3.078000 0.554000
2. 2 0.000000 3.078000 2.170000
3. 3 2.666000 1.539000 0.554000
4. 4 2.666000 1.539000 2.170000
5. 5 1.774000 0.000000 3.324000
6. 6 2.712000 0.000000 7.207000

and so on...

I want the output list like this

ID neighbor1 neighbor2 ....neighbor5

1. 1 4 5 6 8 
2. 2 1 4 6 n/a
3. 3 5 1 n/a 2
4. etc.

so each position (1,2,3,4...so on) can have a maximum of 5 neighbors, if less than 5 I want to put n/a there. I have the following code so far.

#neighborlist

flist = glob.glob(filename)

for f in flist:
    load = np.loadtxt(f, usecols=(2,3,4))
    coordinate=np.array(load)

s = (len(coordinate),6)

Nlist = np.zeros(s)

rcut = 2.5 #cutoff distance for neighbors

for f in flist:
    load = np.loadtxt(f, usecols=(2,3,4))
    coordinate=np.array(load)

for i in range(len(coordinate)):
    idx = [i,]

    for j in range(len(coordinate)):
        if np.linalg.norm(coordinate[i]-coordinate[j]) < rcut and np.linalg.norm(coordinate[i]-coordinate[j]) > 0.1:
            idx.append(j)            
        else: 
            idx = idx
    while len(idx)<6:
       idx.append('n/a')
    print idx 
    Nlist[i,:]=idx

    print Nlist

for this i'm getting:

ValueError: could not convert string to float: n/a

as I have fixed the Nlist array size to (len(data),6) it cannot copy Nlist[i,:]=idx when the neighbors are less than 5 (the first element is the point itself). In this case is there a way to declare Nlist dynamically or make its size variable? I know its a tough question but there has to be a way around this problem. Thanks in advance.

1 Answer 1

1

It could be done like that:

import numpy as np
from collections import defaultdict

coordinate = np.array([[0.   ,  3.078,  0.554],
      [ 0.   ,  3.078,  2.17 ],
      [ 2.666,  1.539,  0.554],
      [ 2.666,  1.539,  2.17 ],
      [ 1.774,  0.   ,  3.324],
      [ 2.712,  0.   ,  7.207]])

rcut = 2.5
neighbors = defaultdict(list)

for ID1, ID2 in itertools.permutations(range(len(coordinate)), 2):
    if np.linalg.norm(coordinate[ID1]-coordinate[ID2]) < rcut:
        neighbors[ID1].append(ID2)
    else:
        neighbors[ID1].append('n/a')

for ID in neighbors:
    print ID, neighbors[ID]

Output:

0 [1, 'n/a', 'n/a', 'n/a', 'n/a']
1 [0, 'n/a', 'n/a', 'n/a', 'n/a']
2 ['n/a', 'n/a', 3, 'n/a', 'n/a']
3 ['n/a', 'n/a', 2, 4, 'n/a']
4 ['n/a', 'n/a', 'n/a', 3, 'n/a']
5 ['n/a', 'n/a', 'n/a', 'n/a', 'n/a']

Note that I used a dictionary for the output, which in my opinion makes more sense than a list. As your code does not read the IDs, but takes the line number as ID I also did it like that. As numpy starts numbering at 0, the IDs start at 0.

Please also note that your code would only evaluate the last file in the file list, because you overwrite the coordinates array for each file you read.

Why do you want to store the 'n/a' in your output and not simply have a list of varying length for different IDs?

Also note, that the code I suggested is not very efficient. You can use algorithms like cell lists instead to speed things up.

If you want to find the neighbours efficiently you might also be interested in the scipy.spatial module which can do that:

from scipy import spatial
kdtree = sp.spatial.KDTree(coordinate)
kdtree.query_pairs(rcut)

Output:

{(0, 1), (2, 3), (3, 4)}

This gives you all pairs of points which have a distance less than rcut. This is the essential information you need to build the output list you want.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.