
I would like to compute the average Euclidean distance in a 2D dataset (xCoords, yCoords), but only between neighbouring points.

As an example:

xCoords = [[16.8742 10.7265 30.0538 10.4524 12.6483 15.5349 10.2094 28.6425 9.2882]]

yCoords = [[14.5835  6.0766 12.7006  4.3638  5.0318 14.2657  8.3131 15.8346 6.1746]]

I want to find the Euclidean distance between the points, but only between points that are adjacent to each other. Is there a numpy, scipy or sklearn function (or some other) for this task?

EDIT:

As an illustration:

What I want:

[image not shown]

What I don't want:

[image not shown]

I don't want to compute the Euclidean distance from every single data point to ALL other ones.

  • What do you mean by "adjacent points to each other"? Commented May 9, 2021 at 20:35
  • Sorry, I've edited my post to make it clearer. Commented May 9, 2021 at 21:47
  • In your first picture, what rules out including the point at [0.9, 0.45] as "adjacent" to your base point at [0.8, 0.2]? What are the explicit criteria for saying points are/aren't considered adjacent? Commented May 9, 2021 at 21:57
  • Those are also the things I am struggling to come up with. E.g. if I imagine a 2D grid of equally spaced points, determining the adjacent points of every individual point would be relatively easy to picture and also to implement. However, this idea has to somehow be generalised to the example above, with an arbitrary distribution. Commented May 9, 2021 at 22:27

1 Answer


Sklearn has the euclidean_distances function: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.euclidean_distances.html

>>> from sklearn.metrics.pairwise import euclidean_distances
>>> X = [[0, 1], [1, 1]]
>>> # distance between rows of X
>>> euclidean_distances(X, X)
array([[0., 1.],
       [1., 0.]])
>>> # get distance to origin
>>> euclidean_distances(X, [[0, 0]])
array([[1.        ],
       [1.41421356]])

The function returns a matrix with the Euclidean distance between each pair of coordinates. If that's not exactly what you need, you can filter this matrix with whatever rules you want in order to get those "adjacent distances".
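As a rough sketch of such a filter, here is one possible rule: treat the k nearest points of each point as its "adjacent" points. That rule is an assumption, not something the question defines, and the name mean_knn_distance is made up for illustration. The pairwise matrix is built with plain NumPy broadcasting here; it holds the same values that euclidean_distances(points, points) would return.

```python
import numpy as np

# Coordinates from the question.
x_coords = np.array([16.8742, 10.7265, 30.0538, 10.4524, 12.6483,
                     15.5349, 10.2094, 28.6425, 9.2882])
y_coords = np.array([14.5835, 6.0766, 12.7006, 4.3638, 5.0318,
                     14.2657, 8.3131, 15.8346, 6.1746])
points = np.column_stack([x_coords, y_coords])  # shape (n_points, 2)


def mean_knn_distance(points, k=1):
    """Average distance from each point to its k nearest other points.

    Assumed adjacency rule: "adjacent" means "among the k closest points".
    Requires k < number of points.
    """
    # Full pairwise Euclidean distance matrix, shape (n_points, n_points).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Exclude each point's zero distance to itself.
    np.fill_diagonal(dist, np.inf)
    # Keep only the k smallest distances in every row, then average them.
    nearest = np.sort(dist, axis=1)[:, :k]
    return nearest.mean()


avg = mean_knn_distance(points, k=1)
print(avg)
```

A larger k counts more points as adjacent, so the average can only stay the same or grow. Other rules, such as a fixed distance threshold, could be applied to the same matrix in a similar way.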

Or maybe you need some kind of clustering algorithm (like k-means), which groups the points that are closest to each other into clusters; you can then compute the average distance within each cluster.
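A minimal sketch of the clustering idea, assuming three clusters for the question's data (the number of clusters is a guess and has to be chosen by you) and assuming that "average distance of each cluster" means the mean distance over all pairs of points inside the cluster. The helper name mean_intra_cluster_distance is hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import euclidean_distances

# Coordinates from the question.
points = np.column_stack([
    [16.8742, 10.7265, 30.0538, 10.4524, 12.6483, 15.5349, 10.2094, 28.6425, 9.2882],
    [14.5835, 6.0766, 12.7006, 4.3638, 5.0318, 14.2657, 8.3131, 15.8346, 6.1746],
])


def mean_intra_cluster_distance(points, n_clusters=3, random_state=0):
    """Return {cluster label: mean pairwise distance inside that cluster}."""
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    labels = model.fit_predict(points)
    result = {}
    for label in np.unique(labels):
        members = points[labels == label]
        if len(members) < 2:
            # A single-point cluster has no pairs; report 0.0 by convention.
            result[int(label)] = 0.0
            continue
        dist = euclidean_distances(members, members)
        # Upper triangle (k=1) counts every pair once and skips the diagonal.
        pair_dist = dist[np.triu_indices(len(members), k=1)]
        result[int(label)] = float(pair_dist.mean())
    return result


per_cluster = mean_intra_cluster_distance(points, n_clusters=3)
print(per_cluster)
```

Note that this averages over all pairs within a cluster, not only over direct neighbours, so it is a coarser notion of "adjacent" than a nearest-neighbour rule.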


4 Comments

If you find that k-means may help you, here's how to get the distance from each data point to its closest cluster: datascience.stackexchange.com/a/41125
Thank you very much! I'll see if I can come up with a solution based on your suggestion, but that's definitely something to work with. If you are wondering what I mean by those "adjacent points" I tried to make it clear in the answer above to my original post.
PhilE, the point is that there is no "straight right rule" for those points, as far as I could understand, you know what I mean? So the only idea I could come up with is that you want the distance between the points closest to each other (and that's what k-means does): an unsupervised algorithm that tries to find the best groups of points. Once you have those groups you can get interesting distances, like the distances between the closest/farthest groups, or the distance between points on the border of the groups... Hope this helps you!
Yes, that helps a lot! Thank you!
