Klee
A native macOS AI chat app powered by MLX. 100% local inference on Apple Silicon, no cloud required. Built with ShipSwift.
What is Klee?
Klee is a native macOS AI chat app powered by MLX. It runs inference entirely on-device on Apple Silicon, with no cloud required, and is built with ShipSwift. MCPgee lists it as a Model Context Protocol (MCP) server for AI assistants such as Claude, Cursor, and VS Code.
This server falls under the Developer Tools category on MCPgee, the world's largest MCP server directory with 33,000+ servers.
Features
- A native macOS AI chat app powered by MLX. 100% local inference on Apple Silicon, no cloud required.
Installation
Manual Installation
npx klee
How to Set Up and Use Klee
Klee is a native macOS AI agent application built with SwiftUI that runs large language models entirely on Apple Silicon hardware using the MLX framework, requiring no internet connection, cloud account, or subscription. It provides a chat interface with built-in tool calling for file operations, web search (via optional Jina API), and shell command execution, supporting eight quantized model sizes from 4.3 GB to 70 GB depending on available RAM. Privacy-conscious developers and researchers use Klee when they want the convenience of a polished AI chat app without sending any data to external servers.
Prerequisites
- macOS 14 (Sonoma) or later on an Apple Silicon Mac (M1, M2, M3, or M4)
- At least 8 GB of unified memory (16 GB recommended for 7B+ models; 64 GB+ for the 70B model)
- Sufficient disk space for model downloads (4.3 GB minimum per model, up to 40+ GB for large models)
- Optional: a free Jina AI API key (jina.ai) for web search tool access
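The hardware and OS requirements above can be checked with a short script before downloading anything. This is a minimal sketch, not part of Klee: the function names are hypothetical, and the thresholds are taken directly from the list above.

```python
import os
import platform


def parse_major(version: str) -> int:
    """Return the major component of a dotted version string, or 0 if absent."""
    head = version.split(".")[0]
    return int(head) if head.isdigit() else 0


def check_prerequisites(system: str, machine: str, os_version: str,
                        memory_gb: float) -> list[str]:
    """Return a list of unmet prerequisites; an empty list means all checks pass."""
    problems = []
    if system != "Darwin":
        problems.append("not macOS")
    if machine != "arm64":
        problems.append("not Apple Silicon (arm64)")
    if parse_major(os_version) < 14:
        problems.append("macOS older than 14 (Sonoma)")
    if memory_gb < 8:
        problems.append("less than 8 GB of unified memory")
    return problems


def local_memory_gb() -> float:
    """Total physical memory in GiB via POSIX sysconf (0.0 if unavailable)."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1024**3
    except (ValueError, OSError, AttributeError):
        return 0.0


if __name__ == "__main__":
    issues = check_prerequisites(
        platform.system(), platform.machine(), platform.mac_ver()[0], local_memory_gb()
    )
    print("OK" if not issues else "Unmet: " + ", ".join(issues))
```

Note that a Python interpreter running under Rosetta reports `x86_64` even on Apple Silicon, so run the script with a native arm64 Python.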
Download Klee
Go to github.com/signerlabs/Klee/releases and download the latest .dmg file for macOS.
Install Klee
Open the downloaded .dmg file and drag the Klee app to your Applications folder. On first launch, macOS Gatekeeper may show a warning—dismiss it via System Settings → Privacy & Security → Open Anyway.
Download a language model
Launch Klee. The app auto-detects your Mac's unified memory and shows compatible quantized models for download. Select a model appropriate for your RAM (e.g., a 4-bit quantized 3B or 4B model for 8 GB, a 7B model for 16 GB). Models are cached at ~/.klee/models/.
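Because models are large, it can help to see how much disk space the cache is using. The sketch below sums file sizes under ~/.klee/models/. It assumes each top-level entry in that directory corresponds to one model, which the source does not confirm, and the function names are hypothetical.

```python
from pathlib import Path


def directory_size_bytes(path: Path) -> int:
    """Sum the sizes of all regular files under path."""
    return sum(f.stat().st_size for f in path.rglob("*") if f.is_file())


def cached_models(models_dir: Path) -> dict[str, float]:
    """Map each top-level entry in models_dir to its size in GB (10^9 bytes)."""
    if not models_dir.is_dir():
        return {}
    sizes = {}
    for entry in sorted(models_dir.iterdir()):
        size = directory_size_bytes(entry) if entry.is_dir() else entry.stat().st_size
        sizes[entry.name] = size / 1e9
    return sizes


if __name__ == "__main__":
    # Cache location taken from the setup step above.
    for name, gb in cached_models(Path.home() / ".klee" / "models").items():
        print(f"{gb:8.2f} GB  {name}")
```

If the cache uses symlinks internally, files reached through more than one link would be counted more than once, so treat the totals as an upper bound.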
Enable optional web search
To enable the built-in web search tool, open Klee's sidebar settings, toggle Web Search on, and paste your Jina AI API key. A free tier is available at jina.ai with no credit card required.
Start chatting with local inference
Type a message in the chat input. Klee runs inference entirely on-device via MLX, streaming tokens in real time. The model can call built-in tools (file read/write/list/delete, web search, shell execution) autonomously based on your request.
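To make the idea of tool calling concrete, here is a generic sketch of how a chat app can route a model's tool request to a local function. It is not Klee's implementation: the registry, the tool names, and the JSON shape `{"name": ..., "arguments": {...}}` are all assumptions chosen for illustration.

```python
import json
from pathlib import Path
from typing import Callable

# Hypothetical tool registry; names and signatures are illustrative only.
TOOLS: dict[str, Callable[..., str]] = {}


def tool(name: str):
    """Register a function under a tool name the model can request."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register


@tool("file_list")
def file_list(path: str) -> str:
    """Return the sorted entry names in a directory, one per line."""
    return "\n".join(sorted(p.name for p in Path(path).expanduser().iterdir()))


@tool("file_read")
def file_read(path: str) -> str:
    """Return the text content of a file."""
    return Path(path).expanduser().read_text()


def dispatch(tool_call_json: str) -> str:
    """Run one tool call of the assumed form {"name": ..., "arguments": {...}}."""
    call = json.loads(tool_call_json)
    fn = TOOLS.get(call.get("name"))
    if fn is None:
        return f"error: unknown tool {call.get('name')!r}"
    try:
        return fn(**call.get("arguments", {}))
    except (OSError, TypeError) as exc:
        # Report failures as text so the model can react to them.
        return f"error: {exc}"
```

In a loop like this, the string returned by `dispatch` would be fed back to the model as the tool result. Destructive tools such as file deletion or shell execution would normally warrant a confirmation step before running.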
Attach images for vision tasks
Drag and drop an image into the chat input or click the attachment button to use Klee's vision model support. The model will analyze the image as part of its response.
Klee Examples
Example chat session configuration
Klee does not use a text config file; all settings are managed via the GUI. The MCP client configuration below illustrates how a Klee-like local model could be registered in MCP-aware tools, if needed.
{
  "mcpServers": {
    "klee-local": {
      "command": "klee",
      "args": ["--mcp-server"],
      "env": {
        "KLEE_MODEL": "mlx-community/Llama-3.2-3B-Instruct-4bit"
      }
    }
  }
}
Prompts to try
With Klee running locally on your Mac, these prompts exercise its built-in tools and vision capabilities.
- "Read the file ~/Desktop/report.pdf and give me a three-bullet summary"
- "List all Python files in ~/projects/myapp and find any files over 500 lines"
- "Run 'git log --oneline -20' in ~/projects/myapp and explain the recent changes"
- "Search the web for the latest MLX release notes and summarize what's new"
- "Look at this screenshot and describe what errors are shown in the terminal output"
Troubleshooting Klee
Klee crashes or is very slow after loading a model
The selected model likely exceeds your Mac's unified memory. Open Activity Monitor and check memory pressure. Try a smaller quantized model—for 8 GB RAM, use a 4-bit quantized 3B or 4B model rather than a 7B model.
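A rough estimate shows why a 7B model is tight on an 8 GB Mac. Quantized weights take about (parameters × bits per weight) / 8 bytes, so a 4-bit 7B model needs roughly 3.5 GB before the KV cache, the app, and macOS itself are counted. The sketch below encodes that arithmetic. The 5 GB reserve is an assumption picked so the results line up with the guidance above; it is not a figure documented by Klee, and real quantized files carry some extra overhead.

```python
def weight_size_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate size of quantized weights in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8


def likely_fits(memory_gb: float, params_billion: float,
                bits_per_weight: int = 4, reserve_gb: float = 5.0) -> bool:
    """Heuristic: weights must fit in unified memory minus an assumed reserve.

    reserve_gb is an assumed allowance for macOS, the app, and the KV cache.
    """
    return weight_size_gb(params_billion, bits_per_weight) <= memory_gb - reserve_gb


if __name__ == "__main__":
    for memory_gb, params in [(8, 4), (8, 7), (16, 7), (64, 70)]:
        verdict = "fits" if likely_fits(memory_gb, params) else "too large"
        print(f"{params}B at 4-bit on {memory_gb} GB: {verdict}")
```

Treat the output as a first filter only; Activity Monitor's memory pressure graph remains the better indicator once a model is loaded.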
macOS blocks Klee from opening with 'unidentified developer' warning
Go to System Settings → Privacy & Security, scroll to the Security section, and click 'Open Anyway' next to the Klee entry. You only need to do this once after first download.
Web search tool returns no results or fails
Ensure you have entered a valid Jina AI API key in Klee's sidebar settings. Check that the key has remaining quota at jina.ai. Web search requires an internet connection even though model inference is fully local.
Frequently Asked Questions about Klee
What is Klee?
Klee is a native macOS AI chat app powered by MLX, offering 100% local inference on Apple Silicon with no cloud required. It is built with ShipSwift. As an MCP server, it connects AI assistants to external tools and data sources through a standardized interface.
How do I install Klee?
Download the latest .dmg file from github.com/signerlabs/Klee/releases, open it, and drag the Klee app to your Applications folder. On first launch, download a model that fits your Mac's unified memory.
Which AI clients work with Klee?
Klee works with all major MCP-compatible AI clients including Claude Desktop, Claude Code, Cursor, VS Code (GitHub Copilot), Windsurf, and Cline.
Is Klee free to use?
Yes, Klee is open source and available under the MIT license. You can use it freely in both personal and commercial projects.
Klee Alternatives — Similar Developer Tools Servers
Looking for alternatives to Klee? Here are other popular developer tools servers you can use with Claude, Cursor, and VS Code.
Ecc
★ 188.2k · The agent harness performance optimization system. Skills, instincts, memory, security, and research-first development for Claude Code, Codex, Opencode, Cursor and beyond.
Javaguide
★ 155.8k · A guide to Java interviews and general backend interviews, covering computer science fundamentals, databases, distributed systems, high concurrency, system design, and AI application development.
Gemini CLI
★ 104.5k · A secure MCP server that wraps the Google Gemini CLI, allowing clients to query Gemini models using local OAuth sessions without requiring an API key. It provides tools for model interaction and diagnostics with built-in protection against command in
Awesome MCP Servers
★ 87.3k · Curated list of Model Context Protocol (MCP) servers: tools that extend Claude Desktop, Cursor, Windsurf, and other MCP clients with custom capabilities.
MCP Servers
★ 86.0k · Model Context Protocol Servers
CC Switch
★ 77.5k · A cross-platform desktop All-in-One assistant for Claude Code, Codex, OpenCode, OpenClaw, Gemini CLI & Hermes Agent. Only official website: ccswitch.io