RAG Knowledge Base

Local Retrieval-Augmented Generation chatbot · Python (FastAPI) + React 19 / TypeScript · v1.0.0

A local RAG chatbot: documents are ingested into a persistent ChromaDB vector store, and users ask natural-language questions that are answered exclusively from the ingested content, with source citations. The backend calls a local Ollama daemon for both embeddings (nomic-embed-text) and chat generation (llama3.2), so there is no cloud LLM dependency.
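The query path described above (retrieve relevant chunks, then answer only from them with citations) can be sketched in a few lines. This is a minimal illustration, not the project's code: the in-memory list stands in for the ChromaDB store, the bag-of-words `embed` function stands in for the Ollama embedding call, and the names `Chunk`, `retrieve` and `build_prompt` are hypothetical.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    source: str  # label used for the citation, e.g. file name and page


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question: str, store: list[Chunk], k: int = 2) -> list[Chunk]:
    """Return up to k chunks with non-zero similarity, best first."""
    q = embed(question)
    scored = [(cosine(q, embed(c.text)), c) for c in store]
    scored = [(s, c) for s, c in scored if s > 0]
    scored.sort(key=lambda sc: sc[0], reverse=True)
    return [c for _, c in scored[:k]]


def build_prompt(question: str, chunks: list[Chunk]) -> str:
    """Build a prompt that restricts the model to the retrieved context."""
    context = "\n".join(
        f"[{i}] ({c.source}) {c.text}" for i, c in enumerate(chunks, 1)
    )
    return (
        "Answer using only the context below and cite sources as [n]. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Hypothetical ingested content, for illustration only.
store = [
    Chunk("Uvicorn serves the FastAPI application.", "notes.pdf p.1"),
    Chunk("ChromaDB persists document embeddings on disk.", "notes.pdf p.2"),
    Chunk("Tailwind handles styling in the frontend.", "ui.md"),
]
question = "Where are embeddings persisted?"
hits = retrieve(question, store)
prompt = build_prompt(question, hits)
print(prompt)
```

In the real system the prompt would go to the chat model and the `source` labels of the retrieved chunks would be returned alongside the answer. Numbering the chunks in the context is one simple way to let the model cite them.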

Generated 4 Jun 2026 · 20 files · 9 components · 5 flows

File Architecture

The full source tree as a layered graph — every file with its role, imports, exports and reverse dependencies.

System Design

Runtime topology across the five zones — client, edge, application, data and external services.

Flow Graph

The five most significant application flows, step by step — startup, auth, write, read and error recovery.

Technology

Component                | Technology          | Version   | Source of Detection
-------------------------|---------------------|-----------|--------------------------------------
Web framework (API)      | FastAPI             | >=0.111.0 | backend/requirements.txt
ASGI server              | uvicorn[standard]   | >=0.30.1  | requirements.txt; main.py uvicorn.run
RAG orchestration        | LangChain           | >=0.2.10  | requirements.txt; ingest.py imports
Vector store             | ChromaDB            | >=0.5.0   | requirements.txt; langchain Chroma
LLM / embeddings runtime | Ollama (ChatOllama) | >=0.2.0   | requirements.txt; main.py get_llm
Chat model               | llama3.2            | -         | main.py OLLAMA_MODEL
Embedding model          | nomic-embed-text    | -         | ingest.py EMBEDDING_MODEL
PDF loading              | pypdf               | >=4.2.0   | requirements.txt; PyPDFLoader
Schema / validation      | pydantic            | >=2.7.4   | requirements.txt; main.py BaseModel
UI library               | React               | ^19.2.6   | frontend/package.json
Language (frontend)      | TypeScript          | ~6.0.2    | package.json devDependencies
Build tool / dev server  | Vite                | ^8.0.12   | package.json; vite.config.ts
Styling                  | Tailwind CSS v4     | ^4.3.0    | @tailwindcss/vite; src/style.css
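
The two model names in the table come from separate constants, OLLAMA_MODEL in main.py and EMBEDDING_MODEL in ingest.py. As a rough sketch of how such settings might be gathered in one place, the snippet below reads them from environment variables with the table's values as defaults. Whether the project reads the environment at all is an assumption, and the `Settings` class, `load_settings` function and `OLLAMA_BASE_URL` variable are hypothetical; only the port, 11434, is Ollama's documented default.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    ollama_model: str     # chat model (OLLAMA_MODEL in main.py)
    embedding_model: str  # embedding model (EMBEDDING_MODEL in ingest.py)
    ollama_base_url: str  # assumption: the local Ollama daemon address


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from env, falling back to the values in the table."""
    return Settings(
        ollama_model=env.get("OLLAMA_MODEL", "llama3.2"),
        embedding_model=env.get("EMBEDDING_MODEL", "nomic-embed-text"),
        # OLLAMA_BASE_URL is a hypothetical variable name for this sketch.
        ollama_base_url=env.get("OLLAMA_BASE_URL", "http://localhost:11434"),
    )


settings = load_settings({})  # empty mapping, so all defaults apply
print(settings)
```

Keeping the chat and embedding models as separate fields makes it explicit that changing one does not change the other. Changing the embedding model would normally also mean re-ingesting documents, since stored vectors are tied to the model that produced them.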