How do I install Houtini MCP Server?

Follow the setup instructions on the Houtini GitHub repository, then add the server configuration to your AI client.

What category is Houtini MCP Server?

Houtini is categorized under Coding Agents. Browse more servers in these categories on MCPgee.

Houtini

Name: Houtini-lm
Author: houtini-ai

v1.0.0•Coding Agents•stable

MCP server that saves Claude Code tokens by delegating bounded tasks to local or cloud LLMs. Works with LM Studio, Ollama, vLLM, DeepSeek, Groq, Cerebras.

ai-agentsclaudeclaude-mcpcode-generationdeveloper-tool

Stars

Downloads

Weekly

0/5

View on GitHub

What is Houtini?

Houtini is a Model Context Protocol (MCP) server that allows AI assistants like Claude, Cursor, and VS Code to mcp server that saves claude code tokens by delegating bounded tasks to local or cloud llms. works with lm studio, ollama, vllm, deepseek, groq, cerebras.

MCP server that saves Claude Code tokens by delegating bounded tasks to local or cloud LLMs. Works with LM Studio, Ollama, vLLM, DeepSeek, Groq, Cerebras.

This server falls under the Coding Agents category on MCPgee, the world's largest MCP server directory with 33,000+ servers.

Features

MCP server that saves Claude Code tokens by delegating bound

Use Cases

Delegate bounded coding tasks to local or cloud LLMs to save tokens.

Integrate with LM Studio, Ollama, vLLM, or Groq for inference.

Optimize token usage by offloading work to specialized models.

houtini-ai

Maintainer

LicenseMIT License

Languagejavascript

Versionv1.0.0

UpdatedMay 21, 2026

Statushealthy

Maintenanceactive

Works with

ClaudeOpenAIwindowsmacoslinux

View Source Browse All Servers

Installation

Manual Installation

npx houtini-lm

Configuration

Configuration Details

Config File

claude_desktop_config.json

Performance

Response Metrics

Response Time< 200ms

ThroughputMedium

Resource Usage

Memory UsageLow

CPU UsageLow

How to Set Up and Use Houtini

Houtini LM is an MCP server that reduces Claude Code token consumption by delegating bounded, repetitive tasks to local or cloud LLMs running on LM Studio, Ollama, vLLM, llama.cpp, OpenRouter, DeepSeek, Groq, or Cerebras. It exposes tools for chat, code analysis, multi-file code review, text embeddings, and provider discovery — all communicating with any OpenAI-compatible API endpoint. Developers use it to offload boilerplate generation, code review, format conversion, and other grunt work to cheaper or faster local models while keeping Claude focused on architecture and complex reasoning.

Prerequisites

Node.js 18+ installed
A running OpenAI-compatible LLM backend: LM Studio, Ollama, vLLM, llama.cpp (local), or an API key for Groq, DeepSeek, Cerebras, or OpenRouter (cloud)
npx available (comes with Node.js)
Claude Code or another MCP-compatible client

Set up your LLM backend

Start your local LLM server (e.g. LM Studio on port 1234, Ollama on port 11434) or obtain an API key from a cloud provider like Groq, DeepSeek, or OpenRouter.

# Example: start LM Studio server on default port
# (done via LM Studio GUI: Local Server > Start Server)

# Example: start Ollama
ollama serve

Add Houtini LM to Claude Code

Use the claude mcp add command to register the server. This is the recommended installation method for Claude Code.

claude mcp add houtini-lm -- npx -y @houtini/lm

Configure environment variables for your provider

Set HOUTINI_LM_ENDPOINT_URL to point at your LLM backend. For cloud providers that require authentication, set HOUTINI_LM_API_KEY. Optionally set HOUTINI_LM_MODEL to pin a specific model.

# For LM Studio (default, often no key needed)
export HOUTINI_LM_ENDPOINT_URL="http://localhost:1234"

# For Groq
export HOUTINI_LM_ENDPOINT_URL="https://api.groq.com/openai/v1"
export HOUTINI_LM_API_KEY="gsk_your_groq_key"
export HOUTINI_LM_MODEL="llama-3.1-8b-instant"

# For Ollama
export HOUTINI_LM_ENDPOINT_URL="http://localhost:11434/v1"

Add to MCP client configuration (alternative method)

If not using claude mcp add, register Houtini LM in your claude_desktop_config.json manually.

{
  "mcpServers": {
    "houtini-lm": {
      "command": "npx",
      "args": ["-y", "@houtini/lm"],
      "env": {
        "HOUTINI_LM_ENDPOINT_URL": "http://localhost:1234",
        "HOUTINI_LM_API_KEY": "",
        "HOUTINI_LM_MODEL": ""
      }
    }
  }
}

Verify the connection with the discover tool

Ask your AI assistant to call the discover tool, which performs a health check and reports the connected model's capabilities and measured performance.

Houtini Examples

Client configuration

Example claude_desktop_config.json for Houtini LM pointing at a local LM Studio instance.

{
  "mcpServers": {
    "houtini-lm": {
      "command": "npx",
      "args": ["-y", "@houtini/lm"],
      "env": {
        "HOUTINI_LM_ENDPOINT_URL": "http://localhost:1234",
        "HOUTINI_LM_MODEL": "lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF"
      }
    }
  }
}

Prompts to try

Example prompts for delegating tasks to a local or cloud LLM via Houtini.

- "Use the local LLM to generate boilerplate CRUD functions for a User model in TypeScript"
- "Delegate a code review of this file to the local model and summarize the findings"
- "Use Houtini to convert this Python function to Go"
- "Run the discover tool to check what model is connected and its performance stats"
- "List all models currently available on the local LLM server"
- "Generate embeddings for this text using the local embedding model"

Troubleshooting Houtini

Connection refused to HOUTINI_LM_ENDPOINT_URL

Ensure your LLM backend is running and listening on the configured port. For LM Studio, start the Local Server from the GUI. For Ollama, run 'ollama serve'. Verify with: curl http://localhost:1234/v1/models

Model not found or empty model list

Run the list_models tool to see what is available. For LM Studio, load a model in the GUI before starting the server. For Ollama, pull a model first: ollama pull llama3.1. If HOUTINI_LM_MODEL is set to a model that is not loaded, clear the variable to let Houtini auto-detect.

API key errors with cloud providers (Groq, DeepSeek, OpenRouter)

Set HOUTINI_LM_API_KEY to your provider's API key and HOUTINI_LM_ENDPOINT_URL to the provider's OpenAI-compatible base URL. For Groq: https://api.groq.com/openai/v1. For OpenRouter: https://openrouter.ai/api/v1. For DeepSeek: https://api.deepseek.com/v1.

Frequently Asked Questions about Houtini

What is Houtini?

Houtini is a Model Context Protocol (MCP) server that mcp server that saves claude code tokens by delegating bounded tasks to local or cloud llms. works with lm studio, ollama, vllm, deepseek, groq, cerebras. It connects AI assistants to external tools and data sources through a standardized interface.

How do I install Houtini?

Follow the installation instructions on the Houtini GitHub repository. Clone the repo, install dependencies, and add the server config to your AI client.

Which AI clients work with Houtini?

Houtini works with all major MCP-compatible AI clients including Claude Desktop, Claude Code, Cursor, VS Code (GitHub Copilot), Windsurf, and Cline.

Is Houtini free to use?

Yes, Houtini is open source and available under the MIT License license. You can use it freely in both personal and commercial projects.

Learn More About MCP Servers

Getting Started with MCP

Set up your first MCP server in minutes

MCP Setup Guide

Configure MCP in Claude, Cursor & VS Code

All MCP Tutorials

18+ hands-on guides for developers

MCP FAQ

40+ answers about Model Context Protocol

Houtini Alternatives — Similar Coding Agents Servers

Looking for alternatives to Houtini? Here are other popular coding agents servers you can use with Claude, Cursor, and VS Code.

Dify

★ 142.2k

Production-ready platform for agentic workflow development.

Ruflo

★ 54.0k

🌊 The leading agent orchestration platform for Claude. Deploy intelligent multi-agent swarms, coordinate autonomous workflows, and build conversational AI systems. Features enterprise-grade architecture, self-learning swarm intelligence, RAG integrat

Goose

★ 45.7k

an open source, extensible AI agent that goes beyond code suggestions - install, execute, edit, and test with any LLM

Antigravity Awesome Skills

★ 38.3k

Installable GitHub library of 1,400+ agentic skills for Claude Code, Cursor, Codex CLI, Gemini CLI, Antigravity, and more. Includes installer CLI, bundles, workflows, and official/community skill collections.

AgentScope

★ 25.5k

Build and run agents you can see, understand and trust.

Serena

★ 24.5k

A coding agent toolkit that provides IDE-like semantic code retrieval and editing tools, enabling LLMs to efficiently navigate and modify codebases using symbol-level operations instead of basic file reading and string replacements.

Browse More Coding Agents MCP Servers

Explore all coding agents servers available in the MCPgee directory. Each server includes setup guides for Claude, Cursor, and VS Code.

Coding Agents Browse All Servers

Set Up Houtini in Your Editor

Choose your AI client for step-by-step setup instructions.

🖥️

Claude Desktop

macOS & Windows app

⌨️

Claude Code

CLI & terminal

📝

Cursor

AI-first code editor

💻

VS Code

GitHub Copilot MCP

🏄

Windsurf

Codeium AI editor

🔌

Cline

VS Code extension

Quick Config Preview

{
  "mcpServers": {
    "houtini-lm": {
      "command": "npx",
      "args": ["-y", "houtini-lm"]
    }
  }
}

Add this to your claude_desktop_config.json or .cursor/mcp.json

Read the full setup guide →

Ready to use Houtini?

Browse our complete directory of 33,000+ MCP servers, read setup guides for your editor, and start building with the Model Context Protocol.

33,000+ ServersFree & Open SourceStep-by-Step Guides

Explore All Servers Read Our Guides