I ingested all my documents and created a collection with embeddings using Chroma. The data is persisted to a local directory, db, which contains chroma-collections.parquet and chroma-embeddings.parquet. Neither file is empty. Opening chroma-collections.parquet shows a collection name, a UUID, and null metadata.
When I load the collection later using LangChain, it comes back empty.
from chromadb.config import Settings
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma

# embeddings_model_name is defined elsewhere in my script
embeddings = HuggingFaceEmbeddings(model_name=embeddings_model_name)
CHROMA_SETTINGS = Settings(
    chroma_db_impl='duckdb+parquet',
    persist_directory='db',
    anonymized_telemetry=False,
)
db = Chroma(persist_directory='db', embedding_function=embeddings, client_settings=CHROMA_SETTINGS)
db.get() returns {'ids': [], 'embeddings': None, 'documents': [], 'metadatas': []}
I've also tried several other approaches found online, for example:
import chromadb
from chromadb.config import Settings

client = chromadb.Client(Settings(chroma_db_impl="duckdb+parquet",
                                  persist_directory='./db'))
coll = client.get_or_create_collection("langchain", embedding_function=embeddings)
coll.count() returns 0
I'm expecting all the docs and embeddings to be available. What am I missing?
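One thing that could be ruled out first is a path mix-up: 'db' and './db' are relative, so they resolve against whatever the current working directory is when the script runs. The following is only a minimal sketch using the standard library, not part of Chroma or LangChain; the helper name describe_persist_dir is hypothetical. It reports the absolute path a persist directory resolves to and the size of each parquet file found there.

```python
from pathlib import Path


def describe_persist_dir(persist_directory):
    """Return (absolute_path, {parquet_file_name: size_in_bytes}).

    Hypothetical helper for illustration only; it does not read or
    validate the contents of the parquet files.
    """
    root = Path(persist_directory).resolve()
    if not root.is_dir():
        # The relative path does not point at an existing directory
        # from the current working directory.
        return root, {}
    sizes = {p.name: p.stat().st_size for p in sorted(root.glob("*.parquet"))}
    return root, sizes
```

Calling describe_persist_dir('db') from the loading script and comparing the result with the directory written at ingestion time would show whether both scripts point at the same files. An empty dictionary would suggest the loader is looking at a different or freshly created directory.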