I'm merging company data from three or more providers. I'm exploring an entity resolution approach that uses separate embeddings for name, location, and domain, each stored in a vector index. I'm considering Google Spanner as the database, but I'm not sure whether this is even possible there. I know individual searches work: given a name vector, return the 10 closest names. What I actually want is a batch job over the whole dataset: start with, say, 400 million firm records, run a resolution algorithm, and end up with something like 100 million resolved firms.

Can this be done on Google Spanner Graph?

I've tried similarity search as described in https://cloud.google.com/spanner/docs/find-k-nearest-neighbors, but that only gets me the neighbours of a single query vector. I need some form of KNN plus clustering that finds the nearest neighbours of every entity and groups them.
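To make the goal concrete, here is a minimal in-memory sketch of the kind of algorithm I mean. It is plain Python, not Spanner code. The field weights, the 0.9 threshold, and the names `resolve`, `cosine`, and `FIELDS` are all assumptions chosen for illustration. It scores every pair of records by a weighted cosine similarity over the three embeddings, links pairs above the threshold, and uses union-find to turn the links into clusters.

```python
import math
from itertools import combinations

# Hypothetical field names: one embedding vector per field per record.
FIELDS = ("name", "location", "domain")


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def resolve(records, weights=(0.5, 0.2, 0.3), threshold=0.9):
    """Return one cluster label per record.

    records: list of dicts mapping each name in FIELDS to a vector.
    weights and threshold are illustrative assumptions, not tuned values.
    """
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the cluster root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All-pairs comparison: O(n^2), so only viable for small inputs.
    for i, j in combinations(range(len(records)), 2):
        score = sum(
            w * cosine(records[i][f], records[j][f])
            for w, f in zip(weights, FIELDS)
        )
        if score >= threshold:
            parent[find(j)] = find(i)  # merge the two clusters

    return [find(i) for i in range(len(records))]
```

Records that share a label are treated as the same firm. The all-pairs loop obviously cannot run on 400 million rows, so at that scale I assume the candidate pairs would have to come from per-entity nearest-neighbour lookups against the vector indexes instead, with the linking and clustering step running over those candidates.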

  • Please edit the question to limit it to a specific problem with enough detail to identify an adequate answer. Commented Jan 15 at 23:29
