
Local RAG with Python and Flask

This application answers user queries with a language model and a vector database. To improve retrieval, it generates multiple versions of each query, retrieves documents relevant to any of them, and answers based on the combined retrieved context.
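
The multi-query retrieval step can be pictured with a short sketch. This is an illustration, not the repository's code; it assumes a LangChain-style stack (ChatOllama, Chroma, MultiQueryRetriever) consistent with the models and paths configured later in this README.

from langchain_community.chat_models import ChatOllama
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma
from langchain.retrievers.multi_query import MultiQueryRetriever

# Vector store backed by the local Ollama embedding model.
db = Chroma(
    collection_name="local-rag",
    persist_directory="chroma",
    embedding_function=OllamaEmbeddings(model="nomic-embed-text"),
)

llm = ChatOllama(model="mistral")

# The LLM rewrites the user's question into several variants; documents
# matching any variant are retrieved and de-duplicated.
retriever = MultiQueryRetriever.from_llm(retriever=db.as_retriever(), llm=llm)
docs = retriever.invoke("What is the document about?")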

Prerequisites

  1. Python 3: Ensure you have Python 3.x installed.
  2. Ollama: This app requires Ollama to be installed and running locally. Follow the Ollama installation guide to set it up.

Setup

  1. Clone the repository:
$ git clone https://github.com/firstpersoncode/local-rag.git
$ cd local-rag
  2. Create and activate a virtual environment:
$ python -m venv venv
$ source venv/bin/activate

# For Windows users
# venv\Scripts\activate
  3. Install dependencies:
$ pip install -r requirements.txt
  4. Run Ollama: Ensure Ollama is installed and running locally. Refer to the Ollama documentation for setup instructions.
  • Start Ollama:
$ ollama serve
  • Pull the LLM model:
$ ollama pull mistral
  • Pull the text embedding model:
$ ollama pull nomic-embed-text
  5. Set up environment variables: Copy the .env.sample file to .env and edit the values as needed:
TEMP_FOLDER='./_temp'
CHROMA_PATH='chroma'
COLLECTION_NAME='local-rag'
LLM_MODEL='mistral' # replace with the model you want to use
TEXT_EMBEDDING_MODEL='nomic-embed-text'
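
For orientation, here is a minimal sketch of how the app might read this configuration, assuming python-dotenv; the variable names come from the .env above, but the loading code itself is illustrative, not the repository's.

import os
from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # reads key=value pairs from .env into the environment

TEMP_FOLDER = os.getenv("TEMP_FOLDER", "./_temp")
CHROMA_PATH = os.getenv("CHROMA_PATH", "chroma")
COLLECTION_NAME = os.getenv("COLLECTION_NAME", "local-rag")
LLM_MODEL = os.getenv("LLM_MODEL", "mistral")
TEXT_EMBEDDING_MODEL = os.getenv("TEXT_EMBEDDING_MODEL", "nomic-embed-text")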

Running the App

$ python app.py
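
The app serves the two endpoints used below on port 8080. As a rough sketch only (the handlers here are stubs; the actual embedding and retrieval logic lives in the repository), the Flask entry point has this shape:

from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route('/embed', methods=['POST'])
def embed():
    # Receive an uploaded PDF and index it into the vector database.
    file = request.files.get('file')
    if file is None:
        return jsonify({"error": "No file part"}), 400
    # ... save to TEMP_FOLDER, split into chunks, embed into Chroma ...
    return jsonify({"message": "File embedded successfully"})

@app.route('/query', methods=['POST'])
def query():
    # Answer a question using the retriever and the LLM.
    question = request.get_json().get('query')
    if not question:
        return jsonify({"error": "No query provided"}), 400
    # ... retrieve context and generate an answer ...
    return jsonify({"message": "The document is about..."})

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8080)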

Load the documents into the database.

$ curl --request POST \
  --url http://localhost:8080/embed \
  --header 'Content-Type: multipart/form-data' \
  --form file=@/path/to/pdf/document.pdf

# Response
{
  "message": "File embedded successfully"
}
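
The same upload can be done from Python; a small sketch assuming the requests library (the file path is a placeholder):

import requests

# Upload a PDF to the /embed endpoint as multipart/form-data.
with open("/path/to/pdf/document.pdf", "rb") as f:
    resp = requests.post("http://localhost:8080/embed", files={"file": f})
print(resp.json())  # {'message': 'File embedded successfully'}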

Ask questions about the documents.

$ curl --request POST \
  --url http://localhost:8080/query \
  --header 'Content-Type: application/json' \
  --data '{ "query": "What is the document about?" }'

# Response
{
  "message": "The document is about...",
}
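
And the equivalent query from Python, again assuming the requests library:

import requests

# Ask a question about the embedded documents via the /query endpoint.
resp = requests.post(
    "http://localhost:8080/query",
    json={"query": "What is the document about?"},
)
print(resp.json())  # {'message': 'The document is about...'}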

Conclusion

This app combines a local language model with a vector database to answer questions about your own documents. Make sure Ollama is running locally and follow the setup instructions above to get started.