Lumen

v1.0.0 · Coding Agents · stable

Reduce Claude Code, Codex, OpenCode wall clock and token use by 50% with open source, local semantic search. Works for small and large codebases and monorepos! Enterprise-ready and fully compliant via Ollama and SQLite-vec.

Tags: agentic-coding, claude, claude-ai, claude-code, claude-pl
Stars: 197
Downloads: 0
Weekly: 0
0/5

What is Lumen?

Lumen is a Model Context Protocol (MCP) server that gives AI coding agents such as Claude Code, Codex, and OpenCode open source, local semantic code search, with the stated goal of cutting their wall-clock time and token use by 50%. It works for small and large codebases and monorepos, and is described as enterprise-ready and fully compliant via Ollama and SQLite-vec.


This server falls under the Coding Agents and Developer Tools categories on MCPgee, the world's largest MCP server directory with 33,000+ servers.

Features

  • Reduces wall-clock time and token use for Claude Code, Codex, and OpenCode

Use Cases

Reduce token usage by 50% with local semantic code search.
Enable efficient codebase navigation for large monorepos.
Speed up Claude Code and IDE AI assistants with faster context retrieval.

Maintainer: ory

License: NOASSERTION
Language: Go
Version: v1.0.0
Updated: May 21, 2026
Status: healthy
Maintenance: active

Works with

Claude, OpenAI, Windows, macOS, Linux

Installation

Manual Installation

npx lumen

Configuration

Configuration Details

Config File

claude_desktop_config.json

Performance

Response Metrics

Response Time: < 200ms
Throughput: Medium

Resource Usage

Memory Usage: Low
CPU Usage: Low

How to Set Up and Use Lumen

Lumen is a local semantic code search MCP server that integrates with Claude Code, Cursor, Codex, and OpenCode to dramatically reduce token usage and wall-clock time. It builds a local vector index of your codebase using Ollama or LM Studio for embeddings, then exposes a semantic_search tool so AI assistants retrieve precisely relevant code without scanning entire files. Benchmarks show an average 26% cost reduction, 28% faster runs, and 37% fewer output tokens while maintaining patch quality — making it especially valuable for large codebases and monorepos.
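To make the retrieval idea concrete, here is a minimal sketch of vector search by cosine similarity. It is not Lumen's implementation: the file paths and the 3-dimensional vectors are hypothetical stand-ins, whereas a real embedding model produces vectors with hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, index, k=2):
    """Return the k (path, score) pairs most similar to query_vec."""
    scored = [(path, cosine_similarity(query_vec, vec)) for path, vec in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Hypothetical toy index: path -> embedding of that chunk.
TOY_INDEX = {
    "auth/handler.go": [0.9, 0.1, 0.0],
    "db/pool.go": [0.0, 0.2, 0.9],
    "middleware/ratelimit.go": [0.3, 0.9, 0.1],
}

if __name__ == "__main__":
    query = [1.0, 0.2, 0.0]  # stands in for the embedding of "authentication handler"
    for path, score in top_k(query, TOY_INDEX):
        print(f"{score:.3f}  {path}")
```

The agent only receives the few highest-scoring chunks, which is where the token savings described above would come from.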

Prerequisites

  • Ollama or LM Studio installed and running locally
  • Embedding model pulled: run `ollama pull ordis/jina-embeddings-v2-base-code`
  • Claude Code, Cursor, Codex, or OpenCode as the AI agent/MCP client
  • Supported language codebase (Go, Python, TypeScript, JavaScript, Rust, Java, C#, and more)
  • Linux, macOS, or Windows operating system

1. Install Ollama and pull the embedding model

Lumen relies on a local embedding server. Install Ollama from ollama.com, then pull the recommended code embedding model. This model runs entirely on your machine — no API keys required.

ollama pull ordis/jina-embeddings-v2-base-code

2. Install Lumen in Claude Code

For Claude Code users, install via the plugin marketplace. This registers Lumen as an MCP skill and makes its tools available in your coding sessions.

/plugin marketplace add ory/claude-plugins
/plugin install lumen@ory

3. Verify connectivity with the doctor command

After installation, run the health check to confirm Lumen can reach your Ollama server and that the embedding model is available. Fix any issues reported before indexing.

/lumen:doctor
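If you want to check the same two conditions by hand, the sketch below queries Ollama's `/api/tags` endpoint, which lists locally installed models. It illustrates the kind of check involved and does not reproduce what `/lumen:doctor` actually runs. The function names are hypothetical.

```python
import json
import urllib.request


def model_is_available(tags_payload, model):
    """Check a parsed /api/tags response for a model.

    Ollama lists models with an explicit tag, so a name without one
    is compared as "<name>:latest".
    """
    wanted = model if ":" in model else model + ":latest"
    names = [entry.get("name", "") for entry in tags_payload.get("models", [])]
    return wanted in names


def fetch_tags(host="http://localhost:11434", timeout=5):
    """Fetch the list of locally installed models from an Ollama server."""
    url = host.rstrip("/") + "/api/tags"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    model = "ordis/jina-embeddings-v2-base-code"
    try:
        payload = fetch_tags()
    except OSError as err:  # urllib's URLError is a subclass of OSError
        print(f"embedding server unreachable: {err}")
    else:
        print("model available" if model_is_available(payload, model) else "model missing")
```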

4. Index your codebase

Trigger an initial index of your project. Lumen chunks your source files semantically (by language-aware AST boundaries) and stores vector embeddings in a local SQLite-vec database. For large monorepos this may take a few minutes.

/lumen:reindex
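As an illustration of what chunking on AST boundaries means, the sketch below splits Python source into one chunk per top-level function or class using the standard `ast` module. Lumen is written in Go and its chunker is not shown on this page, so treat this as an analogy. The `chunk_by_definitions` name and the `SAMPLE` source are hypothetical.

```python
import ast


def chunk_by_definitions(source):
    """Split Python source into one chunk per top-level function or class."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # lineno is 1-based and end_lineno is inclusive (Python 3.8+).
            text = "\n".join(lines[node.lineno - 1:node.end_lineno])
            chunks.append({
                "name": node.name,
                "start": node.lineno,
                "end": node.end_lineno,
                "text": text,
            })
    return chunks


SAMPLE = '''import os

def login(user):
    return user.check()

class Session:
    def refresh(self):
        pass
'''

if __name__ == "__main__":
    for chunk in chunk_by_definitions(SAMPLE):
        print(chunk["name"], chunk["start"], chunk["end"])
```

Each chunk would then be embedded and stored alongside its file path and line range, so a search hit can point the agent at an exact location instead of a whole file.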

5. Configure environment variables for custom models

If you want to use a higher-quality embedding model or point to a non-default server, set these environment variables in your shell profile or MCP client config.

export LUMEN_EMBED_MODEL=qwen3-embedding:8b
export LUMEN_BACKEND=ollama
export OLLAMA_HOST=http://localhost:11434
export LUMEN_MAX_CHUNK_TOKENS=512
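The sketch below shows one way a tool could resolve these four variables with fallbacks. The variable names come from this guide. The `LumenSettings` type is hypothetical, and the fallback values for the model and the chunk size are assumptions chosen for illustration, not Lumen's documented defaults.

```python
import os
from dataclasses import dataclass


@dataclass
class LumenSettings:
    embed_model: str
    backend: str
    ollama_host: str
    max_chunk_tokens: int


def load_settings(env=None):
    """Build settings from an environment mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return LumenSettings(
        # Assumed fallback: the model this guide recommends pulling.
        embed_model=env.get("LUMEN_EMBED_MODEL", "ordis/jina-embeddings-v2-base-code"),
        backend=env.get("LUMEN_BACKEND", "ollama"),
        ollama_host=env.get("OLLAMA_HOST", "http://localhost:11434"),
        # Assumed fallback: the example value used in this guide.
        max_chunk_tokens=int(env.get("LUMEN_MAX_CHUNK_TOKENS", "512")),
    )


if __name__ == "__main__":
    print(load_settings())
```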

6. Use semantic search from the CLI (optional)

You can also search directly from the terminal to test that indexing is working correctly before relying on it inside your AI assistant.

lumen search --model ordis/jina-embeddings-v2-base-code "authentication handler"

Lumen Examples

Client configuration (Cursor)

Add Lumen to Cursor's MCP config. The server runs via the Ollama backend by default — no API keys needed.

{
  "mcpServers": {
    "lumen": {
      "command": "npx",
      "args": ["lumen"],
      "env": {
        "LUMEN_EMBED_MODEL": "ordis/jina-embeddings-v2-base-code",
        "LUMEN_BACKEND": "ollama",
        "OLLAMA_HOST": "http://localhost:11434"
      }
    }
  }
}
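If the config file already lists other servers, editing it by hand risks clobbering them. A minimal sketch of merging the entry above into an existing config follows. The `add_lumen_server` helper is hypothetical, and the command and arguments simply mirror the JSON shown above.

```python
import json


def add_lumen_server(config, env=None):
    """Return a copy of an MCP client config with a 'lumen' server entry added."""
    merged = json.loads(json.dumps(config))  # deep copy of JSON-compatible data
    servers = merged.setdefault("mcpServers", {})
    servers["lumen"] = {
        "command": "npx",
        "args": ["lumen"],
        "env": dict(env or {}),
    }
    return merged


if __name__ == "__main__":
    existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
    updated = add_lumen_server(
        existing,
        env={"LUMEN_BACKEND": "ollama", "OLLAMA_HOST": "http://localhost:11434"},
    )
    print(json.dumps(updated, indent=2))
```

In practice you would load the existing file with `json.load`, pass it through the helper, and write the result back with `json.dump`.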

Prompts to try

Once Lumen is running, it automatically enhances code search in your AI assistant. Try these natural-language queries instead of grepping files manually.

- "Find all places where authentication tokens are validated"
- "Show me the database connection pooling logic"
- "Where is the rate limiting middleware defined?"
- "Find code related to user session management"
- "Search for error handling in the payment processing flow"

Troubleshooting Lumen

semantic_search returns empty results or /lumen:doctor shows embedding server unreachable

Ensure Ollama is running (`ollama serve`) and the embedding model is downloaded (`ollama list`). Check OLLAMA_HOST is correct (default: http://localhost:11434). Run `/lumen:reindex` to rebuild the index after fixing connectivity.

Indexing is slow or stalls on a large monorepo

Increase LUMEN_MAX_CHUNK_TOKENS to reduce the number of chunks, or switch to a smaller embedding model like `ordis/jina-embeddings-v2-base-code` (768 dims). You can also scope indexing to a subdirectory: `lumen index ./src`.
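A rough back-of-the-envelope estimate shows why a larger chunk limit means fewer embeddings to compute. The sketch assumes each file is split into fixed-size pieces, which is a simplification: Lumen chunks on AST boundaries, so real counts will differ. The token counts are hypothetical.

```python
import math


def estimated_chunks(file_token_counts, max_chunk_tokens):
    """Estimate chunk count if every file were split into fixed-size pieces."""
    return sum(math.ceil(tokens / max_chunk_tokens)
               for tokens in file_token_counts if tokens > 0)


if __name__ == "__main__":
    files = [300, 1200, 5000]  # hypothetical token counts for three files
    print(estimated_chunks(files, 512))   # 1 + 3 + 10 = 14
    print(estimated_chunks(files, 1024))  # 1 + 2 + 5 = 8
```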

Token savings not noticeable after installation

Confirm the semantic_search tool is being called by the agent (check Claude Code logs). Run `/lumen:doctor` to verify health. If using Cursor, ensure the MCP server entry is active and not disabled in settings.

Frequently Asked Questions about Lumen

What is Lumen?

Lumen is a Model Context Protocol (MCP) server that aims to reduce Claude Code, Codex, and OpenCode wall-clock time and token use by 50% with open source, local semantic search. It works for small and large codebases and monorepos, and is described as enterprise-ready and fully compliant via Ollama and SQLite-vec. Like other MCP servers, it connects AI assistants to external tools and data sources through a standardized interface.

How do I install Lumen?

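In Claude Code, install Lumen through the plugin marketplace: run `/plugin marketplace add ory/claude-plugins`, then `/plugin install lumen@ory`. For other clients, add the server entry to the client's MCP config and follow the installation instructions in the Lumen GitHub repository.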

Which AI clients work with Lumen?

Lumen works with all major MCP-compatible AI clients including Claude Desktop, Claude Code, Cursor, VS Code (GitHub Copilot), Windsurf, and Cline.

Is Lumen free to use?

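Lumen is described as open source, and it runs locally with no API keys required. The directory lists its license as NOASSERTION, which is the SPDX placeholder for "no license determined" and not a license in itself, so check the license file in the repository before using it in commercial projects.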


Quick Config Preview

{ "mcpServers": { "lumen": { "command": "npx", "args": ["-y", "lumen"] } } }

Add this to your claude_desktop_config.json or .cursor/mcp.json
