LLM Search

v1.0.0 · Search & Data Extraction · stable

Querying local documents, powered by LLM

chatbot, chroma, hyde, langchain-python, large-language-models
657 Stars · 0 Downloads · 0 Weekly · 0/5

What is LLM Search?

LLM Search is a Model Context Protocol (MCP) server that allows AI assistants like Claude, Cursor, and VS Code to query local documents using an LLM.

This server falls under the Search & Data Extraction category on MCPgee, the world's largest MCP server directory with 33,000+ servers.

Features

  • Querying local documents, powered by LLM

Use Cases

  • Query local documents with LLM
  • Chroma vector store support
  • Hybrid search with HyDE

Maintainer: snexus

License: MIT
Language: Jupyter Notebook
Version: v1.0.0
Updated: May 19, 2026
Status: healthy
Maintenance: active

Works with

Claude, OpenAI, Windows, macOS, Linux

Installation

Manual Installation

pip install pyllmsearch

Configuration

Configuration Details

Config File: claude_desktop_config.json

Performance

Response Metrics

Response Time: < 200ms
Throughput: Medium

Resource Usage

Memory Usage: Low
CPU Usage: Low

How to Set Up and Use LLM Search

pyLLMSearch (llm-search) is a local RAG (Retrieval-Augmented Generation) system that exposes an MCP server, allowing AI assistants in Cursor, Windsurf, VS Code, or Claude to query your private document collections using natural language. It supports PDF, Markdown, and DOCX files and combines hybrid dense/sparse search, HyDE hypothetical document embeddings, cross-encoder re-ranking, and optional multi-querying to deliver highly relevant answers without sending your documents to the cloud. Developers and researchers use it to build a private knowledge base that their AI coding assistants can query directly.
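
To make the idea of hybrid dense/sparse search more concrete, the sketch below fuses a dense (embedding-based) ranking and a sparse (keyword-based) ranking with reciprocal rank fusion (RRF). This illustrates the general technique only and is not necessarily how pyLLMSearch combines results; the function name, the constant k=60, and the document IDs are assumptions made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a common default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from a dense retriever and a sparse retriever.
dense = ["doc_a", "doc_b", "doc_c"]
sparse = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([dense, sparse])
print(fused)
```

In this example doc_b comes out first because it sits near the top of both lists, while documents found by only one retriever fall to the bottom.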

Prerequisites

  • Python 3.10 or higher
  • pip or uv package manager
  • An LLM API key (OpenAI API key, or a local model via Ollama/LiteLLM)
  • A local folder of documents to index (PDF, Markdown, or DOCX files)
  • An MCP-compatible client such as Cursor, Windsurf, or VS Code with Copilot

1. Install pyLLMSearch

Install the package from PyPI using pip, preferably inside a virtual environment to avoid dependency conflicts.

python -m venv .venv
source .venv/bin/activate
pip install pyllmsearch

2. Create a documents configuration YAML

Create a YAML configuration file that tells pyLLMSearch where your documents are, which embedding model to use, and how to configure search. Use the sample templates from the repository as a starting point.

# docs_config.yaml
documents:
  - path: /path/to/your/documents
    extensions: [".pdf", ".md", ".docx"]

embeddings:
  model_name: sentence-transformers/all-MiniLM-L6-v2
  persist_dir: /path/to/embeddings-store

search:
  hyde_enabled: true
  reranking_enabled: true

3. Create an LLM model configuration YAML

Create a second YAML file that specifies which LLM to use for answering questions. You can use OpenAI, Azure OpenAI, or a local model via LiteLLM and Ollama.

# llm_config.yaml
model:
  type: openai
  model_name: gpt-4o
  api_key: sk-your-openai-api-key

4. Generate embeddings from your documents

Run the embedding generation step. This processes your documents and stores vector embeddings in the persist directory you configured. Re-run this step when you add new documents.

llmsearch index create --config docs_config.yaml
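
Because the index has to be refreshed when documents are added, it can help to know whether anything actually changed before re-running the step. The sketch below is a standalone helper, not part of pyLLMSearch: it hashes the files in a documents folder and compares them with a saved snapshot. The function names and the index_state.json file name are hypothetical.

```python
import hashlib
import json
import tempfile
from pathlib import Path

EXTENSIONS = {".pdf", ".md", ".docx"}  # same extensions as in docs_config.yaml


def snapshot(folder):
    """Map each document's relative path to the SHA-256 of its contents."""
    root = Path(folder)
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file() and p.suffix.lower() in EXTENSIONS
    }


def changed_files(folder, state_file):
    """Return paths that are new or modified since the last saved snapshot."""
    state_path = Path(state_file)
    old = json.loads(state_path.read_text()) if state_path.exists() else {}
    new = snapshot(folder)
    state_path.write_text(json.dumps(new, indent=2))  # remember for next run
    return sorted(path for path, digest in new.items() if old.get(path) != digest)


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    docs = Path(tmp) / "docs"
    docs.mkdir()
    (docs / "guide.md").write_text("# Guide")
    state = Path(tmp) / "index_state.json"
    first = changed_files(docs, state)   # everything is new
    second = changed_files(docs, state)  # nothing changed
    (docs / "notes.md").write_text("new notes")
    third = changed_files(docs, state)   # only the added file

print(first, second, third)
```

If the returned list is non-empty, re-run the indexing command above.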

5. Start the MCP server

Launch pyLLMSearch as an SSE-based MCP server. Clients like Cursor, Windsurf, and VS Code can connect to it for RAG-powered document queries.

llmsearch app mcp --docs-config docs_config.yaml --llm-config llm_config.yaml

6. Configure your MCP client

Add the MCP server endpoint to your editor's MCP configuration. The server listens on localhost:8080 by default over SSE.

{
  "mcpServers": {
    "llm-search": {
      "url": "http://localhost:8080/sse"
    }
  }
}
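
If your editor already has other MCP servers configured, editing the JSON by hand risks overwriting them. The following sketch shows one way to merge the entry above into an existing file; the helper name add_mcp_server is hypothetical, and the .cursor/mcp.json path is only an example location, so check where your client stores its configuration.

```python
import json
import tempfile
from pathlib import Path


def add_mcp_server(config_path, name, url):
    """Add or update one entry under "mcpServers", keeping other entries."""
    path = Path(config_path)
    config = json.loads(path.read_text()) if path.exists() else {}
    config.setdefault("mcpServers", {})[name] = {"url": url}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demonstration against a temporary file that already lists another server.
with tempfile.TemporaryDirectory() as tmp:
    cfg_path = Path(tmp) / ".cursor" / "mcp.json"
    cfg_path.parent.mkdir()
    existing = {"mcpServers": {"other": {"url": "http://localhost:9000/sse"}}}
    cfg_path.write_text(json.dumps(existing))
    result = add_mcp_server(cfg_path, "llm-search", "http://localhost:8080/sse")

print(json.dumps(result, indent=2))
```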

LLM Search Examples

Client configuration

MCP client configuration for connecting to a running pyLLMSearch SSE server. The server must be started separately before the client connects.

{
  "mcpServers": {
    "llm-search": {
      "url": "http://localhost:8080/sse"
    }
  }
}

Prompts to try

Example prompts to use once pyLLMSearch is connected as an MCP server in your editor.

- "Search my documents for information about the authentication flow"
- "What does the onboarding guide say about setting up a new developer account?"
- "Find all mentions of the deprecation policy in my internal documentation"
- "Summarize the key points from my architecture decision records about the database choice"
- "What are the troubleshooting steps documented for connection timeout errors?"

Troubleshooting LLM Search

Embedding generation is slow or runs out of memory

Use a smaller embedding model such as sentence-transformers/all-MiniLM-L6-v2 instead of larger models. You can also reduce the chunk size in the documents configuration YAML. For large document sets, run the indexing step on a machine with more RAM or use the incremental update command to index new files only.
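
To see how chunk size affects the amount of work, the sketch below splits text into fixed-size character chunks with optional overlap and counts the results. It is only an illustration: pyLLMSearch may split by tokens or document structure rather than characters, and the function name and sizes here are assumptions.

```python
def chunk_text(text, chunk_size, overlap=0):
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop early enough that the last chunk is not pure overlap.
    last_start = max(len(text) - overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]


sample = "a" * 1000  # stand-in for a 1000-character document
large = chunk_text(sample, 400)
small = chunk_text(sample, 200)
overlapped = chunk_text(sample, 200, overlap=50)
print(len(large), len(small), len(overlapped))
```

Smaller chunks mean each embedding call handles less text, but there are more chunks to embed and store, so the best value depends on your documents and hardware.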

MCP server starts but the client cannot connect

Verify the server is listening by running `curl http://localhost:8080/sse`. Check that no firewall or port conflict is blocking port 8080. Ensure you are using the SSE endpoint URL (ending in /sse) and not the base URL in your client configuration.
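
If curl is not available, a plain TCP check can tell you whether anything is listening on the port at all. This is a minimal sketch using only the Python standard library; the function name port_is_open is hypothetical, and a successful connection only shows that the port is open, not that the process behind it is the MCP server.

```python
import socket


def port_is_open(host="localhost", port=8080, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


print("port 8080 open:", port_is_open())
```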

Search results are irrelevant or the LLM gives wrong answers

Enable HyDE (hyde_enabled: true) and re-ranking (reranking_enabled: true) in your docs_config.yaml. After any configuration change, re-index your documents by running `llmsearch index create` again. Also check that the LLM specified in llm_config.yaml is reachable, either through its hosted API or as a locally running model.

Frequently Asked Questions about LLM Search

What is LLM Search?

LLM Search is a Model Context Protocol (MCP) server for querying local documents using an LLM. It connects AI assistants to external tools and data sources through a standardized interface.

How do I install LLM Search?

Follow the installation instructions on the LLM Search GitHub repository. Clone the repo, install dependencies, and add the server config to your AI client.

Which AI clients work with LLM Search?

LLM Search works with all major MCP-compatible AI clients including Claude Desktop, Claude Code, Cursor, VS Code (GitHub Copilot), Windsurf, and Cline.

Is LLM Search free to use?

Yes, LLM Search is open source and available under the MIT license. You can use it freely in both personal and commercial projects.

Quick Config Preview

{ "mcpServers": { "llm-search": { "url": "http://localhost:8080/sse" } } }

Add this to your claude_desktop_config.json or .cursor/mcp.json
