Headroom

Name: Headroom MCP Server
Author: chopratejas

v1.0.0 • Knowledge & Memory • stable

Compress tool outputs, logs, files, and RAG chunks before they reach the LLM. 60-95% fewer tokens, same answers. Library, proxy, MCP server.

Tags: agent, ai, anthropic, claude-code, compression

Stars: 1,935

What is Headroom?

Headroom is a Model Context Protocol (MCP) server that lets AI assistants such as Claude, Cursor, and VS Code compress tool outputs, logs, files, and RAG chunks before they reach the LLM. It reports 60-95% fewer tokens with the same answers, and is also available as a library and a proxy.

This server falls under the Knowledge & Memory category on MCPgee, the world's largest MCP server directory with 33,000+ servers.

Features

Compress tool outputs, logs, files, and RAG chunks before they reach the LLM

Use Cases

Compress tool outputs and logs for LLMs

60-95% token reduction with same quality answers

Maintainer: chopratejas
License: Apache-2.0
Language: Python
Version: v1.0.0
Updated: May 22, 2026
Status: healthy
Maintenance: active

Works with

Claude, OpenAI, Windows, macOS, Linux

Installation

Manual Installation

pip install "headroom-ai[all]"

Configuration

Configuration Details

Config file: claude_desktop_config.json

Performance

Response Metrics

Response Time: < 200ms
Throughput: Medium

Resource Usage

Memory Usage: Low
CPU Usage: Low

How to Set Up and Use Headroom

Headroom is a context compression library, proxy, and MCP server that reduces the token count of tool outputs, logs, files, and RAG chunks by 60-95% before they reach an LLM, while preserving answer quality. It works in three deployment modes: as a Python/TypeScript library you call inline, as a drop-in HTTP proxy that intercepts requests to any LLM API, or as an MCP server exposing headroom_compress, headroom_retrieve, and headroom_stats tools. Engineering teams and AI agent developers use Headroom to reduce costs, fit more context into limited windows, and speed up inference on large codebases or log-heavy workflows.
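To make the 60-95% figure concrete, the arithmetic below works out how many tokens remain at each end of that range. The 100,000-token input is an illustrative number chosen for this sketch, not a measurement from Headroom.

```python
def tokens_after_compression(original_tokens: int, reduction_pct: int) -> int:
    """Return the token count left after removing reduction_pct percent."""
    if not 0 <= reduction_pct <= 100:
        raise ValueError("reduction_pct must be between 0 and 100")
    # Integer arithmetic avoids floating-point rounding surprises.
    return original_tokens * (100 - reduction_pct) // 100


# Hypothetical 100,000-token tool output at both ends of the quoted range.
for pct in (60, 95):
    remaining = tokens_after_compression(100_000, pct)
    print(f"{pct}% reduction: {remaining:,} tokens remain")
```

At the low end of the range the payload shrinks to 40,000 tokens, and at the high end to 5,000, which is the difference between overflowing a context window and fitting comfortably inside it.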

Prerequisites

Python 3.10+ for the Python library or MCP server mode
Node.js 18+ for the TypeScript/npm package
Optional: Apple Silicon Mac for GPU-accelerated memory embedder (pytorch_mps)
Optional: HuggingFace model access for local embedding models (ONNX Runtime)
An MCP client such as Claude Desktop or Claude Code to use the MCP server mode
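The two version requirements in the list above can be checked before installing anything. This is a minimal sketch using only the Python standard library; the helper names are made up for illustration and are not part of Headroom.

```python
import shutil
import subprocess
import sys
from typing import Optional

MIN_PYTHON = (3, 10)
MIN_NODE_MAJOR = 18


def python_ok(version_info=sys.version_info) -> bool:
    """True when the interpreter meets the Python 3.10+ requirement."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def parse_node_major(version_text: str) -> int:
    """Extract the major version from `node --version` output such as 'v18.19.0'."""
    return int(version_text.strip().lstrip("v").split(".")[0])


def node_ok() -> Optional[bool]:
    """True/False for the Node.js 18+ requirement, or None if node is not on PATH."""
    node = shutil.which("node")
    if node is None:
        return None
    out = subprocess.run([node, "--version"], capture_output=True, text=True, check=True)
    return parse_node_major(out.stdout) >= MIN_NODE_MAJOR


print("Python 3.10+:", python_ok())
print("Node.js 18+:", node_ok())
```

A result of None for Node.js only matters if you plan to use the TypeScript/npm package; the Python library and MCP server mode do not need it.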

Install Headroom

Install the Python package with all optional extras for full MCP and compression functionality. Requires Python 3.10 or higher. For TypeScript projects, install the npm package instead.

# Python (full install)
pip install "headroom-ai[all]"

# Python with pipx
pipx install --python python3.13 "headroom-ai[all]"

# TypeScript/Node
npm install headroom-ai
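After a pip install, a quick sanity check is to ask Python's packaging metadata for the distribution and look for the CLI on PATH. The sketch below assumes the distribution name headroom-ai from the command above and that the package installs a headroom executable, as the later steps in this guide imply.

```python
import shutil
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "headroom-ai") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
cli_path = shutil.which("headroom")  # assumes the package puts `headroom` on PATH
print("headroom-ai version:", version or "not installed in this environment")
print("headroom CLI:", cli_path or "not found on PATH")
```

With a pipx install the package lives in its own virtual environment, so the version lookup will likely report it as missing from your current interpreter even though the CLI is found on PATH.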

Quick start: wrap your AI CLI tool

The fastest way to see Headroom working is to wrap an existing AI CLI tool. Headroom intercepts the context, compresses it, and forwards it transparently.

headroom wrap claude
headroom perf  # view compression savings after a session

Install the MCP server

Run the built-in install command to register Headroom as an MCP server with your configured MCP clients. This exposes the headroom_compress, headroom_retrieve, and headroom_stats tools.

headroom mcp install
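One way to confirm the registration worked is to compare the tools a client sees against the three names listed above. MCP clients discover tools with a JSON-RPC 2.0 tools/list request; the sketch below builds that request and checks a response for the expected names. The sample response is hand-written for illustration and was not captured from Headroom, and a real session also needs the MCP initialize handshake first.

```python
import json

EXPECTED_TOOLS = {"headroom_compress", "headroom_retrieve", "headroom_stats"}


def tools_list_request(request_id: int = 1) -> str:
    """Build the JSON-RPC 2.0 message an MCP client sends to list a server's tools."""
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "method": "tools/list"})


def missing_tools(response_text: str) -> set:
    """Return the expected tool names that are absent from a tools/list response."""
    response = json.loads(response_text)
    advertised = {tool["name"] for tool in response["result"]["tools"]}
    return EXPECTED_TOOLS - advertised


# Hand-written sample response for illustration; not captured from Headroom.
sample = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"tools": [{"name": "headroom_compress"}, {"name": "headroom_stats"}]},
})
print(tools_list_request())
print("missing:", sorted(missing_tools(sample)))
```

In practice your MCP client performs this exchange for you; the check is mainly useful when a tool fails to show up in the client's tool list.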

Configure your MCP client manually (alternative)

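If you prefer manual configuration, add the Headroom MCP server directly to your Claude Desktop or other MCP client config file. The env block is optional: pytorch_mps selects the GPU-accelerated memory embedder on Apple Silicon Macs, so omit it on other platforms.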

{
  "mcpServers": {
    "headroom": {
      "command": "headroom",
      "args": ["mcp"],
      "env": {
        "HEADROOM_EMBEDDER_RUNTIME": "pytorch_mps"
      }
    }
  }
}
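Editing the config by hand risks overwriting servers that are already registered. The sketch below merges the entry above into an existing file while leaving other entries alone. The function name and the "other-server" entry are hypothetical, and the demo runs against a throwaway file rather than a real client config.

```python
import json
import tempfile
from pathlib import Path


def add_headroom_server(config_path: Path, env=None) -> dict:
    """Add a `headroom` entry to an MCP client config, keeping existing servers."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    entry = {"command": "headroom", "args": ["mcp"]}
    if env:
        entry["env"] = dict(env)  # e.g. the Apple Silicon embedder setting
    config.setdefault("mcpServers", {})["headroom"] = entry
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demonstrate against a throwaway file rather than a real client config.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "claude_desktop_config.json"
    demo.write_text(json.dumps({"mcpServers": {"other": {"command": "other-server"}}}))
    merged = add_headroom_server(demo, env={"HEADROOM_EMBEDDER_RUNTIME": "pytorch_mps"})
    print(sorted(merged["mcpServers"]))
```

To apply it for real, pass the path of your client's config file. For Claude Desktop on macOS that is commonly ~/Library/Application Support/Claude/claude_desktop_config.json, but check your client's documentation, back the file up first, and restart the client afterwards.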

Use Headroom as a proxy for zero-code integration

Start Headroom in proxy mode to intercept and compress requests from any application that targets an LLM API. Point your app at the proxy port instead of the real API endpoint.

headroom proxy --port 8787
# Then point your app at http://localhost:8787 instead of api.anthropic.com
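"Pointing your app at the proxy" amounts to swapping the scheme and host of the API URL while leaving the rest of the request alone. The sketch below shows that rewrite with the standard library. It assumes the proxy mirrors the upstream API's paths, which this page does not state explicitly, and via_proxy is a name made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def via_proxy(upstream_url: str, proxy_base: str = "http://localhost:8787") -> str:
    """Swap the scheme and host of an upstream URL for the proxy's, keeping the path."""
    upstream = urlsplit(upstream_url)
    proxy = urlsplit(proxy_base)
    return urlunsplit(
        (proxy.scheme, proxy.netloc, upstream.path, upstream.query, upstream.fragment)
    )


print(via_proxy("https://api.anthropic.com/v1/messages"))
```

Most LLM SDKs expose a base URL setting that does this for you, so in a real application you would normally set that option to the proxy address rather than rewriting URLs yourself.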

Use Headroom in Python code

For library-mode integration, import compress() and call it on your message list before sending to an LLM. It returns a compressed version of the messages array.

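from anthropic import Anthropic
from headroom import compress

async def main(messages):
    client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    # Compress messages before sending to Claude
    compressed = await compress(messages, model='claude-3-5-sonnet')
    # max_tokens is required by the Anthropic Messages API
    return client.messages.create(model='claude-3-5-sonnet', max_tokens=1024, messages=compressed)
# Run with: asyncio.run(main(messages))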

Headroom Examples

Prompts to try

Example requests to make once the Headroom MCP server is connected to your AI client.

- "Compress this 50,000 line log file before analyzing it for errors"
- "Use headroom_compress to reduce this JSON tool output and then summarize it"
- "Show me the headroom_stats for this session — how many tokens have been saved?"
- "Retrieve the most relevant sections from this large document about the error I'm seeing"

Troubleshooting Headroom

Installation fails with Python version error

Headroom requires Python 3.10 or higher. Check your version with 'python3 --version' and upgrade if needed, or use pyenv to manage multiple versions.

ONNX Runtime not found when using embedding-based compression

Set ORT_STRATEGY=system and ORT_LIB_LOCATION to the path of your onnxruntime shared library. Alternatively, install with 'pip install "headroom-ai[ml]"' (quoted so the shell does not expand the brackets), which pulls in the correct ONNX Runtime wheel for your platform.
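If you are unsure where the shared library lives, a small search can locate it and produce the two variables named above. This is a sketch under assumptions: the file name patterns are the conventional ones for Linux, macOS, and Windows builds of ONNX Runtime, the helper names are hypothetical, and the demo uses a fake file in a temporary directory.

```python
import tempfile
from pathlib import Path
from typing import Optional

# Conventional shared-library names on Linux, macOS, and Windows (assumed).
LIB_PATTERNS = ("libonnxruntime*.so*", "libonnxruntime*.dylib", "onnxruntime*.dll")


def find_onnxruntime_lib(search_dir: Path) -> Optional[Path]:
    """Return the first file under search_dir that looks like the ONNX Runtime library."""
    for pattern in LIB_PATTERNS:
        matches = sorted(search_dir.rglob(pattern))
        if matches:
            return matches[0]
    return None


def ort_env(lib_path: Path) -> dict:
    """Build the environment variables described in the troubleshooting step."""
    return {"ORT_STRATEGY": "system", "ORT_LIB_LOCATION": str(lib_path)}


# Demonstrate with a fake library file in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    fake = Path(tmp) / "libonnxruntime.so.1.0.0"
    fake.touch()
    found = find_onnxruntime_lib(Path(tmp))
    env = ort_env(found)
    print(env["ORT_STRATEGY"], Path(env["ORT_LIB_LOCATION"]).name)
```

Point find_onnxruntime_lib at the directory where ONNX Runtime is installed, then export the resulting values in the shell or in the env block of your MCP client config.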

Proxy mode does not intercept requests from my application

Ensure your application's API base URL is set to http://localhost:8787 (or the custom port you specified). The proxy expects standard OpenAI-compatible API request format.
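Before debugging the application itself, it helps to confirm that something is actually listening on the proxy port. The check below is a plain TCP connection test using the standard library; it verifies that the port is open, not that the listener is Headroom.

```python
import socket


def proxy_is_listening(host: str = "localhost", port: int = 8787, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print("proxy reachable:", proxy_is_listening())
```

If this prints False, start the proxy or check the port number; if it prints True and requests still bypass compression, the application is probably still using its default API base URL.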

Frequently Asked Questions about Headroom

What is Headroom?

Headroom is a Model Context Protocol (MCP) server that compresses tool outputs, logs, files, and RAG chunks before they reach the LLM, using 60-95% fewer tokens for the same answers. It is also available as a library and a proxy. It connects AI assistants to external tools and data sources through a standardized interface.

How do I install Headroom?

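Install the Python package with pip install "headroom-ai[all]" (or npm install headroom-ai for TypeScript projects), then run headroom mcp install or add the server config to your AI client manually. See the Headroom GitHub repository for full instructions.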

Which AI clients work with Headroom?

Headroom works with all major MCP-compatible AI clients including Claude Desktop, Claude Code, Cursor, VS Code (GitHub Copilot), Windsurf, and Cline.

Is Headroom free to use?

Yes, Headroom is open source and available under the Apache-2.0 license. You can use it freely in both personal and commercial projects.

Headroom Alternatives — Similar Knowledge & Memory Servers

Looking for alternatives to Headroom? Here are other popular knowledge & memory servers you can use with Claude, Cursor, and VS Code.

MemPalace

★ 52.6k

A local AI memory system that stores all conversations verbatim and organizes them into navigable structures. It provides 19 MCP tools for AI assistants to search and retrieve past decisions, debugging sessions, and architecture debates automatically

Kratos

★ 25.7k

🏛️ Memory System for AI Coding Tools - Never explain your codebase again. MCP server with perfect project isolation, 95.8% context accuracy, and the Four Pillars Framework.

Context Mode

★ 15.4k

An MCP server that preserves LLM context by intercepting large data outputs and returning only concise summaries or relevant sections. It enables efficient sandboxed code execution, file processing, and documentation indexing across multiple programm

Memu

★ 13.7k

Memory for 24/7 proactive agents like OpenClaw.

MemOS

★ 9.3k

MemOS (Memory Operating System) is a memory management operating system designed for AI applications. Its goal is: to enable your AI system to have long-term memory like a human, not only remembering what users have said but also actively invoking, u

Everos

★ 5.4k

Build, evaluate, and integrate long-term memory for self-evolving agents.

Quick Config Preview

{
  "mcpServers": {
    "headroom": {
      "command": "headroom",
      "args": ["mcp"]
    }
  }
}

Add this to your claude_desktop_config.json or .cursor/mcp.json. It assumes the headroom command from the Python package is on your PATH.
