Knowledge RAG
🐍 🏠 🍎 🪟 🐧 - Local RAG system for Claude Code with hybrid search (BM25 + semantic), cross-en
What is Knowledge RAG?
Knowledge RAG is a Model Context Protocol (MCP) server that gives AI assistants such as Claude, Cursor, and VS Code access to a local RAG system for Claude Code with hybrid search (BM25 + semantic) and cross-encoder reranking.
This server falls under the Knowledge & Memory category on MCPgee, the world's largest MCP server directory with 33,000+ servers.
Features
- MCP protocol support
Installation
Manual Installation
npx knowledge-rag
How to Set Up and Use Knowledge RAG
Knowledge RAG is a local, privacy-preserving RAG (Retrieval-Augmented Generation) system for Claude Code and other MCP-compatible clients that combines BM25 keyword search with semantic vector embeddings and cross-encoder reranking to deliver high-quality hybrid document retrieval. It runs entirely on your machine using ChromaDB as the vector store and FastEmbed for embeddings — no cloud APIs or external services required. With 12 MCP tools covering document ingestion, search, CRUD operations, URL fetching, and retrieval quality evaluation, it lets Claude search your personal knowledge base of notes, PDFs, code files, and documentation to answer questions grounded in your own content.
Prerequisites
- Python 3.9 or later installed
- pip or pipx for installation
- An MCP-compatible client such as Claude Code (claude mcp add command) or Claude Desktop
- Sufficient disk space for the ChromaDB vector store and downloaded embedding models (FastEmbed downloads models on first run)
Install knowledge-rag via pip
Install the knowledge-rag package from PyPI. This pulls in ChromaDB, FastEmbed, and all other dependencies.
pip install knowledge-rag
Initialize the knowledge base
Run the init command to set up the ChromaDB database and download the embedding model on first use.
knowledge-rag init
Add the server to Claude Code
Use the claude mcp add command to register the knowledge-rag server as a user-level MCP server in Claude Code.
claude mcp add knowledge-rag -s user -- npx -y knowledge-rag
Or configure Claude Desktop manually
For Claude Desktop or other MCP clients, add the server to your claude_desktop_config.json using the Python executable from your virtual environment.
{
  "mcpServers": {
    "knowledge-rag": {
      "command": "/path/to/venv/bin/python",
      "args": ["-m", "mcp_server.server"]
    }
  }
}
Index your documents
Ask Claude to index a directory or add specific documents to the knowledge base using the add_document or reindex_documents tools. Supported formats include Markdown, PDF, DOCX, and most code file types.
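Before asking Claude to index a directory, it can help to preview which files fall into the formats listed above. The following is a minimal sketch of such a preview. The extension set and the function name find_indexable_files are assumptions made for illustration; the formats Knowledge RAG actually parses are decided by the server, not by this script.

```python
from pathlib import Path

# Assumed extension list for illustration only; the server decides what it can parse.
SUPPORTED_EXTENSIONS = {".md", ".pdf", ".docx", ".py", ".js", ".ts"}


def find_indexable_files(root, extensions=SUPPORTED_EXTENSIONS):
    """Return the sorted paths under root whose suffix is in extensions."""
    root_path = Path(root)
    return sorted(
        path
        for path in root_path.rglob("*")
        if path.is_file() and path.suffix.lower() in extensions
    )


if __name__ == "__main__":
    # Preview the current directory; replace "." with your notes folder.
    for path in find_indexable_files("."):
        print(path)
```

The suffix comparison is case-insensitive, so files such as REPORT.PDF are included in the preview as well.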
Search your knowledge base
Use the search_knowledge tool through Claude to query your indexed documents. Adjust hybrid_alpha between 0.0 (pure BM25 keyword) and 1.0 (pure semantic) to tune retrieval style for different query types.
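To make the effect of hybrid_alpha concrete, here is a minimal sketch of one common way to blend keyword and semantic scores: min-max normalize each score set, then take a weighted sum. This is an assumption for illustration, not the server's confirmed fusion method, and the function names blend_scores and min_max_normalize are hypothetical.

```python
def min_max_normalize(scores):
    """Scale scores to [0, 1]; an all-equal score set maps to 0.0."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: (s - low) / (high - low) for doc_id, s in scores.items()}


def blend_scores(bm25_scores, semantic_scores, hybrid_alpha=0.5):
    """Blend the two score sets: 0.0 is pure BM25, 1.0 is pure semantic."""
    if not 0.0 <= hybrid_alpha <= 1.0:
        raise ValueError("hybrid_alpha must be between 0.0 and 1.0")
    bm25 = min_max_normalize(bm25_scores)
    semantic = min_max_normalize(semantic_scores)
    blended = {
        doc_id: hybrid_alpha * semantic.get(doc_id, 0.0)
        + (1.0 - hybrid_alpha) * bm25.get(doc_id, 0.0)
        for doc_id in set(bm25) | set(semantic)
    }
    # Highest score first; ties are broken by document id for stable output.
    return sorted(blended.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Made-up scores for two hypothetical documents.
    bm25 = {"notes.md": 7.2, "api.py": 1.1}
    semantic = {"notes.md": 0.31, "api.py": 0.84}
    print(blend_scores(bm25, semantic, hybrid_alpha=0.0)[0][0])  # notes.md
    print(blend_scores(bm25, semantic, hybrid_alpha=1.0)[0][0])  # api.py
```

With these made-up scores, the keyword-only setting ranks the document with the strongest term match first, while the semantic-only setting ranks the document with the highest embedding similarity first.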
Knowledge RAG Examples
Client configuration
Claude Desktop configuration for knowledge-rag using a Python virtual environment.
{
  "mcpServers": {
    "knowledge-rag": {
      "command": "/Users/you/.venv/bin/python",
      "args": ["-m", "mcp_server.server"],
      "env": {
        "KNOWLEDGE_RAG_DIR": "/Users/you/.knowledge-rag"
      }
    }
  }
}
Prompts to try
Example prompts that use the 12 knowledge-rag MCP tools through Claude.
- "Search my knowledge base for anything related to SQL injection techniques"
- "Add the document at https://owasp.org/www-project-top-ten/ to my knowledge base"
- "List all documents in the 'security' category of my knowledge base"
- "Find documents similar to the one about privilege escalation techniques"
- "Reindex all documents in my knowledge base to pick up recent changes"
- "What does my knowledge base say about setting up ChromaDB with persistent storage?"
Troubleshooting Knowledge RAG
Embedding model download fails or times out on first run
FastEmbed downloads the embedding model on first use, which requires internet access and may take several minutes. If it fails, check your internet connection and run 'knowledge-rag init' again. The model is cached locally after the first download.
Second instance exits with code 75
When KNOWLEDGE_RAG_SINGLE_INSTANCE=1 is set, only one server can run against a given data directory. Stop any existing knowledge-rag processes before starting a new instance, or set KNOWLEDGE_RAG_SINGLE_INSTANCE=0 to allow multiple instances.
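How the server enforces this is not described here. One common pattern for a per-directory single-instance guard is an atomically created lock file, sketched below. The lock file name server.lock and the function names are hypothetical, and the real implementation may work differently (for example with OS-level file locks).

```python
import os
from pathlib import Path

EXIT_ALREADY_RUNNING = 75  # exit code mentioned in the troubleshooting entry above


def acquire_instance_lock(data_dir, lock_name="server.lock"):
    """Atomically create a lock file; return its path, or None if it already exists."""
    lock_path = Path(data_dir) / lock_name
    try:
        # O_EXCL makes creation fail if another process already made the file.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return None
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))
    return lock_path


def release_instance_lock(lock_path):
    """Remove the lock file so a later instance can start."""
    Path(lock_path).unlink(missing_ok=True)
```

In this sketch a second process that gets None back would call sys.exit(EXIT_ALREADY_RUNNING). A plain lock file can be left behind after a crash, which is one reason a real server might prefer a different mechanism.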
search_knowledge returns irrelevant results
Adjust the hybrid_alpha parameter: use 0.0 for exact keyword matching (good for code or specific identifiers), 0.5 for balanced retrieval, or 1.0 for conceptual/semantic similarity. Also run reindex_documents if documents were added outside of the MCP tools.
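The reason a low hybrid_alpha suits identifiers is that BM25 only rewards exact term matches. The following is a minimal sketch of Okapi BM25 over pre-tokenized documents, using the common defaults k1=1.5 and b=0.75. It is an illustration of the scoring formula, not the server's own implementation, and its tokenization and parameters are assumptions.

```python
import math
from collections import Counter


def bm25_scores(query_terms, documents, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    doc_count = len(documents)
    if doc_count == 0:
        return []
    avg_len = sum(len(doc) for doc in documents) / doc_count
    # Number of documents that contain each term at least once.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    scores = []
    for doc in documents:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_terms:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue  # a term that is absent contributes nothing
            n = doc_freq[term]
            idf = math.log((doc_count - n + 0.5) / (n + 0.5) + 1.0)
            length_norm = k1 * (1.0 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1.0) / (tf + length_norm)
        scores.append(score)
    return scores


if __name__ == "__main__":
    docs = [
        ["call", "reindex_documents", "after", "manual", "edits"],
        ["refresh", "the", "index", "after", "manual", "edits"],
    ]
    # Only the first document contains the exact identifier.
    print(bm25_scores(["reindex_documents"], docs))
```

The second document describes the same idea in different words and scores zero here, which is the gap that semantic search (a higher hybrid_alpha) is meant to close.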
Frequently Asked Questions about Knowledge RAG
What is Knowledge RAG?
Knowledge RAG is a Model Context Protocol (MCP) server that provides a local RAG system for Claude Code with hybrid search (BM25 + semantic) and cross-encoder reranking. It connects AI assistants to external tools and data sources through a standardized interface.
How do I install Knowledge RAG?
Install the knowledge-rag package from PyPI with pip, run knowledge-rag init to set up the database and download the embedding model, then add the server config to your AI client. The Knowledge RAG GitHub repository has the full installation instructions.
Which AI clients work with Knowledge RAG?
Knowledge RAG works with all major MCP-compatible AI clients including Claude Desktop, Claude Code, Cursor, VS Code (GitHub Copilot), Windsurf, and Cline.
Is Knowledge RAG free to use?
Yes, Knowledge RAG is open source and available under the MIT license. You can use it freely in both personal and commercial projects.
Knowledge RAG Alternatives — Similar Knowledge & Memory Servers
Looking for alternatives to Knowledge RAG? Here are other popular knowledge & memory servers you can use with Claude, Cursor, and VS Code.
MemPalace
★ 52.6k
A local AI memory system that stores all conversations verbatim and organizes them into navigable structures. It provides 19 MCP tools for AI assistants to search and retrieve past decisions, debugging sessions, and architecture debates automatically
Kratos
★ 25.7k
🏛️ Memory System for AI Coding Tools - Never explain your codebase again. MCP server with perfect project isolation, 95.8% context accuracy, and the Four Pillars Framework.
Context Mode
★ 15.4k
An MCP server that preserves LLM context by intercepting large data outputs and returning only concise summaries or relevant sections. It enables efficient sandboxed code execution, file processing, and documentation indexing across multiple programm
Memu
★ 13.7k
Memory for 24/7 proactive agents like OpenClaw.
MemOS
★ 9.3k
MemOS (Memory Operating System) is a memory management operating system designed for AI applications. Its goal is: to enable your AI system to have long-term memory like a human, not only remembering what users have said but also actively invoking, u
Everos
★ 5.4k
Build, evaluate, and integrate long-term memory for self-evolving agents.