0 votes
1 answer
324 views

I have been trying to install llama-cpp-python on Windows 11 with GPU support for a while, and it just doesn't work no matter what I try. I installed the necessary Visual Studio toolkit packages, ...
asked by MiszS
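For Windows GPU builds like the one above, the commonly reported pattern is to pass the backend flag to CMake through the environment before pip builds the wheel. A sketch in PowerShell, assuming the CUDA Toolkit and the Visual Studio C++ build tools are already installed (the `-DLLAMA_CUBLAS` flag matches the llama-cpp-python releases referenced elsewhere on this page; newer releases renamed it):

```shell
# PowerShell: rebuild llama-cpp-python against cuBLAS (CUDA).
# Requires the CUDA Toolkit and Visual Studio C++ build tools.
$env:CMAKE_ARGS = "-DLLAMA_CUBLAS=on"
$env:FORCE_CMAKE = "1"
pip install --no-cache-dir --force-reinstall llama-cpp-python
```

`--no-cache-dir --force-reinstall` matters here: without it, pip may silently reuse a previously built CPU-only wheel.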
1 vote
0 answers
187 views

I am trying to set up local, high-speed NLP but am failing to install the arm64 version of llama-cpp-python. Even when I run CMAKE_ARGS="-DLLAMA_METAL=on -DLLAMA_METAL_EMBED_LIBRARY=on" \ ...
asked by Dennis Losett
0 votes
0 answers
99 views

I am attempting to bundle a RAG agent into a .exe. However, on use of the .exe I keep running into the same two problems. The first problem is with locating llama-cpp, which I have fixed. The ...
asked by Arnab Mandal
0 votes
0 answers
90 views

I want a dataset of common n-grams and their log likelihoods. Normally I would download the Google Books Ngram Exports, but I wonder if I can generate a better dataset using a large language model. ...
asked by evashort
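Independent of whether an LLM can do better, a baseline n-gram log-likelihood table can be computed from any corpus with plain counting. A minimal maximum-likelihood sketch (the function name and whitespace tokenization are illustrative, not from the question):

```python
import math
from collections import Counter

def ngram_log_likelihoods(tokens, n=2):
    """Return log P(ngram) for every n-gram in a token sequence,
    estimated by maximum likelihood (count / total n-gram count)."""
    grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    counts = Counter(grams)
    total = sum(counts.values())
    return {g: math.log(c / total) for g, c in counts.items()}
```

An LLM-generated table could then be sanity-checked against these corpus-derived estimates.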
0 votes
0 answers
258 views

I'm experiencing significant performance and output quality issues when running the LLaMA 13B model using the llama_cpp library on my laptop. The same setup works efficiently with the LLaMA 7B model. ...
asked by Farzand Ali
2 votes
1 answer
1k views

I want my LLM chatbot to remember previous conversations even after restarting the program. It is made with llama-cpp-python and LangChain; it has conversation memory of the present chat, but obviously ...
asked by QUARKS
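One common approach to the persistence question above is to serialize the conversation turns yourself and replay them into the chain's memory on startup. A minimal JSON sketch (the file name and the `(role, text)` schema are assumptions, not from the question):

```python
import json
from pathlib import Path

HISTORY_FILE = Path("chat_history.json")  # illustrative location

def load_history():
    """Load prior (role, text) turns from disk, or start fresh."""
    if HISTORY_FILE.exists():
        return json.loads(HISTORY_FILE.read_text(encoding="utf-8"))
    return []

def save_history(history):
    """Persist the full turn list after each exchange."""
    HISTORY_FILE.write_text(
        json.dumps(history, ensure_ascii=False, indent=2), encoding="utf-8"
    )
```

On startup, the loaded turns would be fed back into the LangChain memory object (e.g. via its message-adding API) before the first user prompt.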
0 votes
0 answers
180 views

I start llama cpp Python server with the command: python -m llama_cpp.server --model D:\Mistral-7B-Instruct-v0.3.Q4_K_M.gguf --n_ctx 8192 --chat_format functionary Then I run my Python script which ...
asked by Jengi829
0 votes
2 answers
856 views

code: from langchain_community.vectorstores import FAISS from langchain_community.embeddings import HuggingFaceEmbeddings from langchain import PromptTemplate from langchain_community.llms import ...
asked by Ashish Sawant
0 votes
1 answer
596 views

I'm trying to create a service using the llama3-70b model by combining langchain and llama-cpp-python on a server workstation. While the model works well with short prompts (question1, question2), it ...
asked by bibiibibin
0 votes
1 answer
642 views

I am using the Mistral 7B-instruct model with llama-index and load the model using LlamaCpp, and when I try to run multiple inputs or prompts (open 2 websites and send 2 prompts), it gives me ...
asked by HelloALive
0 votes
1 answer
244 views

I am trying to install llama-cpp-python on Windows 11. I have installed and set up the CMAKE_ARGS environment variable to point to the MinGW gcc.exe and g++.exe to compile C and C++, but am struggling ...
asked by Leo Turoff
1 vote
0 answers
866 views

I want to use Llama 3 with llama-cpp-python and get a direct answer to user questions, as I did with Llama 2. But the answers generated by Llama 3 are not direct answers like Llama 2's. Output: Hey! 👋 What can I help you ...
asked by Dalipboy M
3 votes
1 answer
4k views

I'm reaching out to the community for some assistance with an issue I'm encountering in llama.cpp. Previously, the program was successfully utilizing the GPU for execution. However, recently, it seems ...
asked by Montassar Jaziri
1 vote
0 answers
65 views

I created embeddings for only one document so far. But when I ask questions which might be in the context but are definitely not part of this single document, I would expect an answer like "I ...
asked by m1ch4
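A common way to get the refusal behaviour described above is to gate generation on retrieval similarity instead of relying on the prompt alone. A minimal sketch (the threshold value and refusal message are assumptions; the vectors would come from whatever embedding model the pipeline uses):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def answer_or_refuse(query_vec, doc_vecs, threshold=0.3):
    """Refuse when no retrieved chunk is similar enough to the query;
    otherwise return None so the normal RAG generation can proceed."""
    best = max((cosine(query_vec, d) for d in doc_vecs), default=0.0)
    if best < threshold:
        return "I don't know based on the provided document."
    return None
```

The right threshold depends on the embedding model and should be tuned against a few known in-context and out-of-context questions.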
0 votes
0 answers
842 views

I am following the instructions from the official documentation on how to install llama-cpp with GPU support in Apple silicon Mac. Here is my Dockerfile: FROM python:3.11-slim WORKDIR /code RUN pip ...
asked by Kristada673
1 vote
2 answers
7k views

I followed the instruction on https://llama-cpp-python.readthedocs.io/en/latest/install/macos/. My macOS version is Sonoma 14.4, and xcode-select is already installed (version: 15.3.0.0.1.1708646388). ...
asked by ooyeon
0 votes
2 answers
4k views

I am trying to load embeddings like this. I changed the code to reflect the current version change in LlamaIndex, but it shows an attribute error. from llama_index.embeddings.huggingface import ...
asked by Rahul_51
1 vote
0 answers
340 views

I have built a RAG app with Llamacpp and Langserve and it generally works. However I can't find a way to stream my responses, which would be very important for the application. Here is my code: from ...
asked by Maxl Gemeinderat
0 votes
2 answers
751 views

I'm currently taking the DeepAI's Finetuning Coursera course and encountered a bug while trying to run one of their demonstrations locally in a Jupyter notebook. Environment: Python version: 3.11 ...
asked by Hofbr
1 vote
3 answers
6k views

I struggled a lot while enabling GPU on my 32 GB Windows 10 machine with a 4 GB Nvidia P100 GPU during Python programming. My LLMs did not use the GPU of my machine while inferencing. After spending a few ...
asked by Umaima Tinwala
0 votes
2 answers
2k views

I have put my application into a Docker container and therefore have created a requirements.txt file. Now I need to install llama-cpp-python for Mac, as I am loading my LLM with from langchain.llms import ...
asked by Maxl Gemeinderat
1 vote
0 answers
1k views

I've been building a RAG pipeline using the llama-cpp-python OpenAI compatible server functionality and have been working my way up from running on just a laptop to running this on a dedicated ...
asked by jhthompson12
1 vote
0 answers
347 views

I am running a quantized llama-2 model from here. I am using 2 different machines. 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80 GHz, 16.0 GB RAM (15.8 GB usable). Inference time on this machine is ...
asked by Muhammad Burhan
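When comparing inference speed across machines as above, raw wall-clock time is misleading if the two runs produce different numbers of tokens; tokens per second is the fairer metric. A minimal timing-harness sketch (the `generate` callable and its return-the-token-count contract are illustrative assumptions):

```python
import time

def tokens_per_second(generate, prompt):
    """Time one generation call and report throughput.
    `generate(prompt)` is assumed to return the number of tokens produced."""
    start = time.perf_counter()
    n_tokens = generate(prompt)
    elapsed = time.perf_counter() - start
    return n_tokens / elapsed if elapsed > 0 else float("inf")
```

With llama-cpp-python, the completion dict's `usage` field can supply the real token count for `n_tokens`.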
4 votes
1 answer
3k views

I can install llama cpp with cuBLAS using pip as below: CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python However, I don't know how to install it with cuBLAS when ...
asked by KimuGenie
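Since `CMAKE_ARGS` is read from the environment at build time, one workable approach to the requirements.txt question above (a sketch, not the only option) is to keep the requirements file plain and let the variables ride in around the pip invocation:

```shell
# requirements.txt stays a plain pin, e.g.:
#   llama-cpp-python==0.2.*
# The CMake flags are supplied via the environment when installing:
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 \
  pip install --no-cache-dir -r requirements.txt
```

The environment variables only affect packages that are actually compiled, so the other pinned dependencies install unchanged.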
1 vote
1 answer
1k views

I have the following code. I am trying to use the local llama2-chat-13B model. The instructions appear to be good but the final output is erroring out. import logging import sys from IPython.display ...
asked by Birender Singh