Late CLI

Name: Late CLI MCP Server
Author: mlhher

v1.0.0 • Coding Agents • stable

Orchestrate an entire AI dev team on 5GB VRAM. Ephemeral subagents, exact-match diffs. Single static binary, any model. Zero config, zero context bloat.

Tags: agent, ai-agent, ai-coding-assistant, autonomous-agents, claude

Stars: 312

What is Late CLI?

Late CLI is a Model Context Protocol (MCP) server that lets AI assistants such as Claude, Cursor, and VS Code orchestrate an entire AI dev team on 5GB of VRAM. It uses ephemeral subagents and exact-match diffs, ships as a single static binary that works with any model, and needs zero configuration while avoiding context bloat.

This server falls under the Coding Agents category on MCPgee, the world's largest MCP server directory with 33,000+ servers.

Use Cases

Orchestrate AI dev team on low VRAM

Ephemeral subagent spawning

Exact-match code diffs

Maintainer: mlhher
License: NOASSERTION
Language: Go
Version: v1.0.0
Updated: May 21, 2026
Status: healthy
Maintenance: active

Works with: Claude, OpenAI, Windows, macOS, Linux

Installation

Manual Installation

npx late-cli

Configuration

Config file: claude_desktop_config.json

Performance

Response Metrics

Response Time: < 200ms
Throughput: Medium

Resource Usage

Memory Usage: Low
CPU Usage: Low

How to Set Up and Use Late CLI

Late is a Go-based AI coding CLI that orchestrates an entire AI development team using ephemeral subagents on as little as 5GB VRAM, making it practical for local LLM workflows without cloud APIs. It uses exact-match search/replace diffs with autonomous self-healing, supports hybrid model routing (a reasoning model for orchestration, a fast local model for execution), and integrates natively with MCP servers via standard I/O. Late works with Claude, DeepSeek, Qwen, Gemma, and any OpenAI-compatible API, shipping as a single static binary with zero configuration required for local llama.cpp models.

Prerequisites

macOS, Linux, or Windows (WSL) operating system
For local models: llama.cpp server running on port 8080 (no API key needed)
For cloud models: OPENAI_API_KEY or compatible API key and OPENAI_BASE_URL pointing to your provider
An MCP-compatible server if you want to extend Late with external tools (optional)
Homebrew installed for the recommended macOS/Linux installation method

Install Late

Install using Homebrew on macOS or Linux. Alternative methods include the universal install script for other systems.

brew tap mlhher/late && brew install late

Alternative: universal install script

If Homebrew is not available, use the universal installer which supports Linux, macOS, and Windows WSL.

curl -sfL https://raw.githubusercontent.com/mlhher/late-cli/main/install.sh | bash

Configure environment variables for cloud models

For cloud-hosted models, set the API credentials. For local llama.cpp running on port 8080, no configuration is needed — Late connects automatically.

export OPENAI_BASE_URL=https://api.anthropic.com/v1
export OPENAI_API_KEY=your_api_key_here
export OPENAI_MODEL=claude-3-5-sonnet-20241022
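
The rule described in this step (a local llama.cpp server on port 8080 needs no configuration, while a cloud provider needs a base URL and an API key) can be sketched in Python. This illustrates the documented behavior only and is not Late's actual code. The function name `resolve_endpoint` and the `/v1` suffix on the local default are assumptions made for the example.

```python
from typing import Mapping, Optional, Tuple

# Assumed local default, based on the llama.cpp port mentioned in this guide.
LOCAL_DEFAULT = "http://localhost:8080/v1"


def resolve_endpoint(env: Mapping[str, str]) -> Tuple[str, Optional[str]]:
    """Return (base_url, api_key) for the given environment mapping.

    With no OPENAI_BASE_URL set, fall back to the local llama.cpp server,
    which needs no API key. A non-local base URL without a key is an error.
    """
    base_url = env.get("OPENAI_BASE_URL", "").strip() or LOCAL_DEFAULT
    api_key = env.get("OPENAI_API_KEY", "").strip() or None
    is_local = base_url.startswith(("http://localhost", "http://127.0.0.1"))
    if not is_local and api_key is None:
        raise ValueError("OPENAI_API_KEY is required for " + base_url)
    return base_url, api_key
```

Calling `resolve_endpoint(os.environ)` before launching a session is one way to catch a missing key early instead of at the first model request.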

Start a coding session

Run Late in your project directory. It will read your codebase context and be ready to spawn subagents for individual tasks.

late

Integrate an MCP server

Add an external MCP server to Late by pointing it to the server's stdio command. This maps MCP tools directly into Late's agent capabilities.

late --mcp-server "npx -y some-mcp-server"

Use hybrid model routing

Configure Late to use a powerful reasoning model for orchestration while using a fast local model for code execution subagents, reducing cost and latency.

late --orchestrator-model claude-3-5-sonnet-20241022 \
     --executor-model qwen2.5-coder:7b
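
The idea behind hybrid routing can be sketched as a small lookup: planning work goes to the reasoning model, and each subagent task goes to the fast local model. The following Python is a conceptual sketch only. The `Routing` class, the role names, and `assign` are hypothetical and do not describe Late's internals.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Routing:
    """Pair of model names: one for orchestration, one for execution."""
    orchestrator_model: str
    executor_model: str

    def model_for(self, role: str) -> str:
        # Hypothetical role names, chosen for this example.
        if role == "orchestrator":
            return self.orchestrator_model
        if role == "executor":
            return self.executor_model
        raise ValueError(f"unknown role: {role}")


def assign(routing: Routing, steps: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Map (role, task) pairs to (task, model) pairs."""
    return [(task, routing.model_for(role)) for role, task in steps]
```

With this split, only the planning step pays for the larger model, and the many small edit steps run on the cheaper one.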

Late CLI Examples

Client configuration

MCP client configuration to connect Claude Desktop to Late CLI as an MCP server.

{
  "mcpServers": {
    "late-cli": {
      "command": "npx",
      "args": ["late-cli"],
      "env": {
        "OPENAI_API_KEY": "your_api_key_here",
        "OPENAI_BASE_URL": "https://api.openai.com/v1",
        "OPENAI_MODEL": "gpt-4o"
      }
    }
  }
}
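
A malformed config file is a common reason a client fails to start a server, so it can help to check the file before restarting the client. The Python sketch below checks only the shape used in the example above (`mcpServers`, `command`, `args`, `env`). It is not a formal schema, and `check_mcp_config` is a name invented for this example.

```python
import json
from typing import List


def check_mcp_config(text: str) -> List[str]:
    """Return a list of problems found in a client config; empty means OK."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg} (line {exc.lineno})"]
    if not isinstance(config, dict):
        return ["top level must be a JSON object"]
    servers = config.get("mcpServers")
    if not isinstance(servers, dict) or not servers:
        return ["missing or empty 'mcpServers' object"]
    problems = []
    for name, entry in servers.items():
        if not isinstance(entry, dict):
            problems.append(f"{name}: entry must be an object")
            continue
        if not isinstance(entry.get("command"), str):
            problems.append(f"{name}: 'command' must be a string")
        args = entry.get("args", [])
        if not (isinstance(args, list) and all(isinstance(a, str) for a in args)):
            problems.append(f"{name}: 'args' must be a list of strings")
        env = entry.get("env", {})
        if not (isinstance(env, dict) and all(isinstance(v, str) for v in env.values())):
            problems.append(f"{name}: 'env' values must be strings")
    return problems
```

An empty list means the file matches the expected shape. Each string in a non-empty list names the server entry and the field that needs fixing.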

Prompts to try

Example prompts for AI-assisted coding tasks using Late CLI.

- "Refactor the authentication module to use JWT instead of session cookies"
- "Add unit tests for all public methods in src/utils.go"
- "Find and fix all TypeScript type errors in the frontend directory"
- "Implement the TODO items in api/handlers.go and write corresponding tests"

Troubleshooting Late CLI

Late cannot connect to a local model

Ensure your llama.cpp server is running on port 8080 with 'llama-server -m your-model.gguf --port 8080'. Late defaults to localhost:8080 for local models. If you changed the port, set OPENAI_BASE_URL=http://localhost:YOUR_PORT/v1.
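
To confirm the server is reachable before starting Late, you can probe it directly. The Python sketch below requests the `/health` endpoint that llama.cpp's `llama-server` exposes. The helper names `health_url` and `server_is_up` are made up for this example, and the port default simply mirrors the one named above.

```python
import http.client
import urllib.request


def health_url(port: int = 8080, host: str = "localhost") -> str:
    """Build the URL of the llama-server health endpoint."""
    return f"http://{host}:{port}/health"


def server_is_up(port: int = 8080, host: str = "localhost", timeout: float = 2.0) -> bool:
    """Return True only if the health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(health_url(port, host), timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, and HTTP error statuses
        # (urllib raises HTTPError, an OSError subclass, for non-2xx replies).
        return False
```

If `server_is_up()` returns False while the process is running, the model may still be loading, or the server may be listening on a different port.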

Exact-match diffs fail and self-healing loops indefinitely

Self-healing triggers when the search string does not match current file content, often due to trailing whitespace or line-ending differences. Run 'late --dry-run' to preview proposed changes before applying. If the issue persists, ensure your editor is not reformatting files in ways that diverge from the model's view.
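
The failure mode described above is easy to reproduce. The Python sketch below applies a search/replace edit only when the search text matches exactly, then shows how normalizing line endings and trailing whitespace lets the same edit succeed. It is a simplified illustration, not Late's implementation. The exactly-one-match rule and both function names are assumptions for the example.

```python
def apply_exact_edit(content: str, search: str, replace: str) -> str:
    """Replace `search` with `replace` only if it occurs exactly once."""
    count = content.count(search)
    if count != 1:
        raise ValueError(f"expected exactly 1 match, found {count}")
    return content.replace(search, replace, 1)


def normalize(text: str) -> str:
    """Convert CRLF to LF and strip trailing whitespace from each line."""
    lines = text.replace("\r\n", "\n").split("\n")
    return "\n".join(line.rstrip() for line in lines)


# File on disk has trailing spaces and CRLF endings; the model's view does not.
on_disk = "x = 1  \r\ny = 2\r\n"
search = "x = 1\ny = 2"

try:
    apply_exact_edit(on_disk, search, "x = 10\ny = 2")
except ValueError as err:
    print("edit rejected:", err)

print(repr(apply_exact_edit(normalize(on_disk), search, "x = 10\ny = 2")))
```

The first call is rejected because the search text does not occur in the raw file. The second succeeds once both sides agree on whitespace, which is why consistent editor formatting settings matter here.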

Homebrew tap not found or install fails

Try the universal install script as an alternative: 'curl -sfL https://raw.githubusercontent.com/mlhher/late-cli/main/install.sh | bash'. On Arch Linux, use 'yay -S late-cli-bin'. You can also download binaries directly from the GitHub releases page.

Frequently Asked Questions about Late CLI

What is Late CLI?

Late CLI is a Model Context Protocol (MCP) server that orchestrates an entire AI dev team on 5GB of VRAM using ephemeral subagents and exact-match diffs. It ships as a single static binary, works with any model, and requires zero configuration with zero context bloat. It connects AI assistants to external tools and data sources through a standardized interface.

How do I install Late CLI?

Install Late with Homebrew ('brew tap mlhher/late && brew install late') or with the universal install script, then add the server config to your AI client. The Late CLI GitHub repository has the full installation instructions.

Which AI clients work with Late CLI?

Late CLI works with all major MCP-compatible AI clients including Claude Desktop, Claude Code, Cursor, VS Code (GitHub Copilot), Windsurf, and Cline.

Is Late CLI free to use?

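Late CLI's source code is publicly available on GitHub. Its license is listed as NOASSERTION, the SPDX placeholder used when no standard license could be determined, so check the license file in the repository before using it in commercial projects.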

Late CLI Alternatives — Similar Coding Agents Servers

Looking for alternatives to Late CLI? Here are other popular coding agents servers you can use with Claude, Cursor, and VS Code.

Dify

★ 142.2k

Production-ready platform for agentic workflow development.

Ruflo

★ 54.0k

🌊 The leading agent orchestration platform for Claude. Deploy intelligent multi-agent swarms, coordinate autonomous workflows, and build conversational AI systems. Features enterprise-grade architecture, self-learning swarm intelligence, RAG integrat

Goose

★ 45.7k

An open source, extensible AI agent that goes beyond code suggestions - install, execute, edit, and test with any LLM

Antigravity Awesome Skills

★ 38.3k

Installable GitHub library of 1,400+ agentic skills for Claude Code, Cursor, Codex CLI, Gemini CLI, Antigravity, and more. Includes installer CLI, bundles, workflows, and official/community skill collections.

AgentScope

★ 25.5k

Build and run agents you can see, understand and trust.

Serena

★ 24.5k

A coding agent toolkit that provides IDE-like semantic code retrieval and editing tools, enabling LLMs to efficiently navigate and modify codebases using symbol-level operations instead of basic file reading and string replacements.

Set Up Late CLI in Your Editor

Late CLI can be set up in the following AI clients:

- Claude Desktop: macOS & Windows app
- Claude Code: CLI & terminal
- Cursor: AI-first code editor
- VS Code: GitHub Copilot MCP
- Windsurf: Codeium AI editor
- Cline: VS Code extension

Quick Config Preview

{
  "mcpServers": {
    "late-cli": {
      "command": "npx",
      "args": ["-y", "late-cli"]
    }
  }
}

Add this to your claude_desktop_config.json or .cursor/mcp.json
