148 questions
Best practices
0 votes · 1 reply · 103 views
RAG for telephony with Deepgram
I'm building a voice-based calling system where users can create AI agents that make outbound phone calls.
The agent uses Deepgram for real-time transcription and ElevenLabs/Cartesia for speech ...
0 votes · 0 answers · 22 views
How to exclude metadata from embedding?
I'm using LlamaIndex 0.14.7. I would like to embed document text without concatenating metadata, because I store long text in the metadata. Here's my code:
table_vec_store: SimpleVectorStore = ...
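LlamaIndex documents do expose `excluded_embed_metadata_keys` (paired with `MetadataMode.EMBED`) for exactly this case. The idea behind that setting can be sketched library-free; the helper below is illustrative, not LlamaIndex's implementation:

```python
def build_embed_text(text, metadata, excluded_keys):
    """Concatenate metadata into the embedding text, skipping excluded keys.

    Loosely mirrors what LlamaIndex does when a key is listed in
    `excluded_embed_metadata_keys`; this helper is a sketch, not library code.
    """
    kept = {k: v for k, v in metadata.items() if k not in excluded_keys}
    meta_block = "\n".join(f"{k}: {v}" for k, v in kept.items())
    return f"{meta_block}\n\n{text}" if meta_block else text

doc_text = "Quarterly revenue grew 12%."
metadata = {"source": "report.pdf", "full_table": "very long text ..."}
# Only `source` reaches the embedding model; `full_table` is excluded.
print(build_embed_text(doc_text, metadata, excluded_keys={"full_table"}))
```

In LlamaIndex itself the equivalent is setting `doc.excluded_embed_metadata_keys = ["full_table"]` before indexing.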
1 vote · 1 answer · 115 views
Why does answer_relevancy return NaN when evaluating RAG with Ragas?
I’m trying to evaluate my Retrieval-Augmented Generation (RAG) pipeline using Ragas.
Here’s a complete version of my code:
"""# RAG Evaluation"""
from datasets import ...
1 vote · 0 answers · 50 views
Why does my LangChain RAG chatbot sometimes miss relevant chunks in semantic search?
I built a RAG chatbot using LangChain + ChromaDB + OpenAI embeddings. The pipeline works, but sometimes the chatbot doesn’t return the most relevant PDF content, even though it exists in the vector DB....
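A common cause of misses like this is a chunking/embedding mismatch rather than missing data; inspecting raw scores (LangChain vector stores expose `similarity_search_with_score`) and widening `k` usually reveals it. A minimal, library-free sketch of the ranking step, with toy vectors standing in for real embeddings:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy vectors standing in for query/chunk embeddings.
query = [1.0, 0.0, 1.0]
chunks = {"chunk_a": [1.0, 0.0, 0.9], "chunk_b": [0.0, 1.0, 0.1]}

ranked = sorted(chunks, key=lambda c: cosine(query, chunks[c]), reverse=True)
print(ranked)  # chunk_a ranks first
```

Printing the scores this way for a failing query shows whether the relevant chunk is genuinely ranked low (an embedding/chunking problem) or just outside the retriever's `k`.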
0 votes · 0 answers · 24 views
RAG Pipeline Memory Leak - Vector Embeddings Not Releasing After Context Switch in Memo AI
Question:
I'm building a memory-augmented AI system using RAG with persistent vector storage, but I'm facing memory leaks and context contamination between sessions.
Problem:
Vector embeddings aren't ...
0 votes · 0 answers · 44 views
Why does LanceDB's full-text-search fail to find matches where the exact text is present?
I am trying to use LanceDB to perform FTS, but I am getting spurious results.
Here is a minimal example:
# Data generation
import lancedb
import polars as pl
from string import ascii_lowercase
words = [...
0 votes · 0 answers · 54 views
How to send extra headers from RAGFlow Agent to a Spring Boot MCP server tool call?
I am using RAGFlow connected to a Spring Boot MCP server.
My agent flow is simple:
Begin node → collects inputs (auth_token, tenant_id, x_request_status)
Agent (gpt-4o) → connected to MCP Tool (server)...
0 votes · 0 answers · 75 views
How to accelerate embedding my corpus into ChromaDB
I have a corpus.jsonl file of about 6.5 GB. I am using a single H100 GPU to embed the corpus into ChromaDB, but it is very slow. I want to find out how to accelerate the process (GPU, CPU, I/O)....
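Throughput in this setup usually comes from batching: one GPU encode call and one DB insert per batch, instead of one per line. Below is the batching half in plain Python; the commented `model.encode` / `collection.add` calls are assumptions about sentence-transformers- and ChromaDB-style APIs, not tested against them here:

```python
import json
from itertools import islice

def batched(iterable, size):
    """Yield lists of up to `size` items from any iterable."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch

def read_jsonl(path):
    """Stream records from a JSONL file without loading 6.5 GB into RAM."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            yield json.loads(line)

# Sketch (assumed APIs): one encode call and one add call per batch.
# for batch in batched(read_jsonl("corpus.jsonl"), 1024):
#     texts = [rec["text"] for rec in batch]
#     embs = model.encode(texts, batch_size=256)   # sentence-transformers style
#     collection.add(ids=[rec["id"] for rec in batch],
#                    documents=texts, embeddings=embs.tolist())
```

Streaming the file and sizing `batch_size` to keep the GPU saturated typically addresses all three bottlenecks the question lists (GPU, CPU, I/O).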
2 votes · 1 answer · 164 views
Why is FAISS document retrieval slow and inconsistent on EC2 t3.micro instance?
I'm building a document Q&A system using FAISS for vector search on an AWS EC2 t3.micro instance (1 vCPU, 1GB RAM). My FAISS index is relatively small (8.4MB .faiss + 1.4MB .pkl files), but I'm ...
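On an instance this small, inconsistent latency often comes from reloading the index (and its pickle) on every request, on top of t3 CPU-credit throttling. One mitigation is loading once per process; a sketch where `load_faiss_index` is a hypothetical stand-in for something like `faiss.read_index("index.faiss")`:

```python
from functools import lru_cache

LOAD_CALLS = 0  # instrumentation to show the load happens once

@lru_cache(maxsize=1)
def get_index():
    """Load the FAISS index once per process and reuse it on every request."""
    global LOAD_CALLS
    LOAD_CALLS += 1
    return load_faiss_index()

def load_faiss_index():
    # Placeholder for the real faiss.read_index(...) call.
    return object()

a, b = get_index(), get_index()
print(LOAD_CALLS)  # 1 — the disk load ran only once
```

With 1 GB of RAM an 8.4 MB index fits comfortably, so keeping it resident removes the disk round-trip from the request path.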
0 votes · 1 answer · 118 views
How do I prevent duplicate messages in the context window when using RAG and memory?
When using RAG and memory, multiple identical copies of the same information are sent to the AI when asking related questions.
I have
import java.util.ArrayList;
import java.util.List;
import dev....
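Whatever the memory backend, one guard is deduplicating retrieved and remembered snippets by normalized content before assembling the prompt. A library-free sketch (the question's own code is Java/langchain4j; the same idea applies there):

```python
import hashlib

def dedupe_context(snippets):
    """Keep the first occurrence of each snippet, comparing normalized text."""
    seen, unique = set(), []
    for s in snippets:
        # Normalize whitespace and case so near-identical copies collapse.
        key = hashlib.sha256(" ".join(s.split()).lower().encode()).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(s)
    return unique

ctx = ["Paris is the capital.", "Paris  is the capital.", "Berlin is the capital."]
print(dedupe_context(ctx))  # the whitespace variant is dropped
```

Running this over the combined memory + retrieval list just before prompt assembly keeps each fact in the context window once.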
-1 votes · 1 answer · 250 views
ImportError: cannot import name 'Client' from 'pinecone' (unknown location)
The problem with this piece of code is that I am unable to import Client from the pinecone library. I tried uninstalling and reinstalling different versions; none of them worked. I also tried it ...
1 vote · 0 answers · 185 views
How to handle follow-up confirmations in Spring AI 1.0.0 without losing context during tool selection using RAG?
I'm building a web application using Spring Boot 3.4.5 and Spring AI 1.0.0 with Llama 3.2 (Ollama) model integration. I've implemented tool calling, and because I have many tools in the application, I'm ...
0 votes · 1 answer · 558 views
AttributeError: 'LlmAgent' object has no attribute 'invoke'
I am trying to call a Flask API that is already running on port 5000 on my system. I am designing agentic AI code that will invoke GET and then POST based on some condition, using google-adk. I ...
1 vote · 0 answers · 55 views
Scaling RAG QA with Large Docs, Tables, and 30K+ Chunks (No LangChain)
I'm building a RAG-based document QA system using Python (no LangChain), LLaMA (50K context), PostgreSQL with pgvector, and Docling for parsing. Users can upload up to 10 large documents (300+ pages ...
0 votes · 0 answers · 56 views
Using llama-index with the deployed LLM
I wanted to make a web app that uses llama-index to answer queries over specific documents using RAG. I have locally set up the Llama3.2-1B-Instruct LLM and am using it locally to create indexes of the ...
0 votes · 0 answers · 169 views
Llamaindex returns "Empty Response"
I have a RAG system using llamaindex. I am upgrading the library from 0.10.44 to 0.12.33.
I see different behaviour now.
Before, when there were no results from the vector store, it seems it called the LLM ...
1 vote · 1 answer · 134 views
Embedding model `all-mpnet-base-v2` not able to classify user prompt properly
I am using this model to embed a product catalog for RAG. In the product catalog there are no red shirts for men, but there are red shirts for women. How can I make sure the model doesn't output ...
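Embeddings alone cannot encode the *absence* of red shirts for men: the women's red shirt will still be the nearest neighbor. A hard metadata filter on the retrieved candidates prevents such near-misses from leaking into the answer. A sketch with a toy catalog (the field names are illustrative):

```python
catalog = [
    {"name": "Red shirt", "gender": "women", "color": "red"},
    {"name": "Blue shirt", "gender": "men", "color": "blue"},
]

def filter_candidates(candidates, **required):
    """Drop retrieved items whose metadata contradicts the query constraints."""
    return [c for c in candidates
            if all(c.get(k) == v for k, v in required.items())]

hits = filter_candidates(catalog, gender="men", color="red")
print(hits)  # [] — nothing matches, so the answer should say "not available"
```

An empty result after filtering is the signal to have the LLM answer "we don't carry that" instead of describing the closest embedding match.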
0 votes · 2 answers · 74 views
SitemapLoader(sitemap_url).load() hangs
from langchain_community.document_loaders import SitemapLoader

def crawl(self):
    print("Starting crawler...")
    sitemap_url = "https://gringo.co.il/sitemap.xml"
...
1 vote · 0 answers · 44 views
How to deal with evolving information in RAG?
I'm trying to index a series of articles for a RAG knowledge base, but I cannot find any documented best practice or recommendation about dealing with information that changes or evolves over time.
...
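A common pattern for evolving sources is to version chunks with metadata and retrieve only the latest version per source, keeping older versions available for "as of" queries. A minimal sketch (the `doc_id`/`version` fields are illustrative, not a standard schema):

```python
def latest_versions(chunks):
    """Keep only the newest chunk per doc_id, using a `version` field."""
    best = {}
    for c in chunks:
        cur = best.get(c["doc_id"])
        if cur is None or c["version"] > cur["version"]:
            best[c["doc_id"]] = c
    return list(best.values())

chunks = [
    {"doc_id": "pricing", "version": 1, "text": "Plan costs $10."},
    {"doc_id": "pricing", "version": 2, "text": "Plan costs $12."},
    {"doc_id": "faq",     "version": 1, "text": "Support via email."},
]
print(latest_versions(chunks))  # stale $10 pricing chunk is dropped
```

In practice the same filter is usually pushed into the vector store as a metadata filter (e.g. `is_latest = true`) at index-update time, so stale chunks never reach the retriever.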
0 votes · 0 answers · 70 views
Firebase Genkit onCall function provide context from client app
I'm following along with the Firebase Genkit docs covering context. From reading the docs it seems as though I should be able to pass context to the flow from where I call the function in my client ...
-4 votes · 1 answer · 141 views
How to locate files uploaded through the RAGFlow system on Windows
I use Ollama and RAGFlow to manage my own knowledge files. I uploaded some files to a knowledge base, and they work well in the system. I start RAGFlow with Docker commands.
Can anyone help me find the ...
1 vote · 1 answer · 257 views
RegexTextSplitter does not exist in langchain_text_splitters?
Trying to import RegexTextSplitter using
from langchain.text_splitter import RegexTextSplitter ,RecursiveCharacterTextSplitter
And I get the error
from langchain.text_splitter import RegexTextSplitter ...
0 votes · 1 answer · 223 views
What to include in context precision/recall for RAG LLM evaluation
I am evaluating my RAG LLM application using ragas. I pass the prompt instruction describing some rules, the retrieved content from my retriever, and the chat history together for the LLM to ...
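Ragas computes its context metrics with an LLM judge, but the underlying set definitions are simple, and a plain version is a useful sanity check when deciding what should count as "context" (retrieved chunks only, not the prompt instruction or chat history). A library-free sketch:

```python
def context_precision(retrieved, relevant):
    """Fraction of retrieved contexts that are actually relevant."""
    if not retrieved:
        return 0.0
    rel = set(relevant)
    return sum(1 for c in retrieved if c in rel) / len(retrieved)

def context_recall(retrieved, relevant):
    """Fraction of relevant contexts that were retrieved."""
    if not relevant:
        return 0.0
    got = set(retrieved)
    return sum(1 for c in relevant if c in got) / len(relevant)

retrieved = ["c1", "c2", "c3"]   # what the retriever returned
relevant = ["c1", "c4"]          # ground-truth contexts
print(context_precision(retrieved, relevant))  # 1/3
print(context_recall(retrieved, relevant))     # 1/2
```

Keeping only retriever output in the `contexts` field (and the rules/history in the question or prompt) is what makes these two numbers attribute failures to retrieval rather than to prompt design.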
0 votes · 0 answers · 40 views
I have a pipeline for a RAG application for CSV query search
from llama_index.core.query_pipeline import (
QueryPipeline as QP,
Link,
InputComponent,
)
from llama_index.experimental.query_engine.pandas import (
PandasInstructionParser,
)
from ...
1 vote · 1 answer · 516 views
Python request to LM Studio model fails but curl succeeds
I tried to request a local model using Python with the code below:
import requests
import json
url = 'http://localhost:1234/v1/chat/completions'
headers = {
'Content-Type': 'application/json'
}
...