How do I install Lemonade MCP Server?

Follow the setup instructions on the Lemonade GitHub repository, then add the server configuration to your AI client.

What category is Lemonade MCP Server?

Lemonade is categorized under Cloud Services. Browse more servers in this category on MCPgee.

Lemonade

Name: Lemonade MCP Server
Author: lemonade-sdk

v1.0.0 • Cloud Services • stable

Lemonade helps users discover and run local AI apps by serving optimized LLMs right from their own GPUs and NPUs. Join our discord: https://discord.gg/5xXzkMu8Zk

Tags: ai, amd, genai, gpu, llama

Stars: 4,084

What is Lemonade?

Lemonade is a Model Context Protocol (MCP) server that lets AI assistants like Claude, Cursor, and VS Code discover and run local AI apps by serving optimized LLMs from the user's own GPUs and NPUs.

This server falls under the Cloud Services category on MCPgee, the world's largest MCP server directory with 33,000+ servers.

Features

Lemonade helps users discover and run local AI apps by serving optimized LLMs right from their own GPUs and NPUs.

Use Cases

Discover and run local LLMs on GPUs and NPUs

Serve optimized LLMs from personal hardware

Maintainer: lemonade-sdk
License: Apache-2.0
Language: C++
Version: v1.0.0
Updated: May 22, 2026
Status: healthy
Maintenance: active

Works with: Claude, OpenAI, Windows, macOS, Linux

Installation

Manual Installation

npx lemonade

Configuration

Config file: claude_desktop_config.json

Performance

Response Metrics

Response Time: < 200ms
Throughput: Medium

Resource Usage

Memory Usage: Low
CPU Usage: Low

How to Set Up and Use Lemonade

Lemonade is a local AI server that runs optimized LLMs, image generation models, and speech models directly on your own GPU or NPU. It exposes OpenAI- and Anthropic-compatible API endpoints at http://localhost:13305, so any tool that calls a cloud AI API can use your local hardware instead. It supports AMD Ryzen AI NPUs (XDNA2), AMD Radeon GPUs, NVIDIA GPUs (Turing and newer), Apple Silicon, and x86_64/ARM64 CPUs, with a library of 100+ models spanning chat, coding, image generation (SDXL-Turbo), and speech (Whisper, kokoro). For developers who want free, private, offline inference without changing their existing tooling, Lemonade is a drop-in local replacement for cloud API calls.

Prerequisites

Windows 11, macOS, or a supported Linux distro (Ubuntu 24.04+, Fedora 43+, Arch, Debian Trixie+, or Snap-supported distro)
A supported GPU or NPU: AMD Radeon, AMD Ryzen AI (XDNA2), NVIDIA Turing or newer, or Apple Silicon
At least 8 GB RAM (16+ GB recommended for larger models)
An MCP-compatible client such as Claude Desktop
Claude Code CLI installed if using the `lemonade launch claude` integration

Download and install Lemonade Server

Visit the Lemonade download page and pick the installer for your platform. On Windows install the .msi package; on macOS install the .pkg; on Linux use the distro-specific package or Snap.

# Visit for platform-specific installers:
https://lemonade-server.ai/install_options.html

Start the Lemonade daemon

After installation, start the background server daemon. It will listen on port 13305 and expose OpenAI-compatible endpoints.

lemond

Browse and pull a model

Use the lemonade CLI to list available models and download one. Models are stored locally and served from your hardware.

lemonade list
lemonade pull Qwen3.5-35B-A3B-GGUF
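
The models available to the server can also be checked over HTTP. The sketch below assumes Lemonade follows the OpenAI convention of a models listing at /api/v1/models that returns a JSON object with a "data" array of entries carrying an "id" field. That path and response shape are assumptions based on the OpenAI-compatible claim, not something this page confirms, and the helper names are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint: the OpenAI-style models listing under the same base path
# as the chat completions endpoint used elsewhere on this page.
MODELS_URL = "http://localhost:13305/api/v1/models"


def model_ids(response_body: dict) -> list[str]:
    """Extract model ids from an OpenAI-style models listing."""
    return [entry["id"] for entry in response_body.get("data", [])]


def fetch_model_ids(url: str = MODELS_URL, timeout: float = 10.0) -> list[str]:
    """Query the running daemon and return the ids it reports."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return model_ids(json.load(resp))
```

With the daemon running, `fetch_model_ids()` should return a list of names that can be passed as the "model" field in API requests.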

Test the local API endpoint

Confirm the server is running by sending a chat completion request to the OpenAI-compatible endpoint. Replace the model name with one you have pulled.

curl http://localhost:13305/api/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model": "Qwen3.5-35B-A3B-GGUF", "messages": [{"role": "user", "content": "Hello"}]}'
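
The same request can be sent from Python. This is a minimal sketch using only the standard library. It assumes the daemon is listening on the default port, that the named model has been pulled, and that the response follows the OpenAI chat completion shape. The helper names `build_chat_request` and `chat` are hypothetical and not part of Lemonade.

```python
import json
import urllib.request

# Base URL taken from the curl example; adjust it if the port was changed.
BASE_URL = "http://localhost:13305/api/v1"


def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(model: str, prompt: str, timeout: float = 60.0) -> str:
    """Send the request and return the text of the first reply.

    Assumes the response contains choices[0].message.content,
    as in the OpenAI chat completion format.
    """
    request = build_chat_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Calling `chat("Qwen3.5-35B-A3B-GGUF", "Hello")` mirrors the curl command. A connection error at this point usually means the daemon is not running.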

Launch Claude Code against your local model

Lemonade provides a `launch claude` command that automatically configures Claude Code CLI to use your local Lemonade server instead of Anthropic's cloud.

lemonade launch claude --model Qwen3.5-35B-A3B-GGUF

Add Lemonade as an MCP server in Claude Desktop

Point Claude Desktop at the local Lemonade server by configuring it as an MCP server in claude_desktop_config.json.

{
  "mcpServers": {
    "lemonade": {
      "command": "npx",
      "args": ["lemonade"]
    }
  }
}
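
If claude_desktop_config.json already lists other servers, the new entry has to be merged in rather than pasted over the file. The sketch below shows one way to do that. The function name `add_lemonade_server` is hypothetical, and the entry it writes is simply the one shown in the configuration on this page.

```python
import json
from pathlib import Path


def add_lemonade_server(config_path: Path) -> dict:
    """Add the "lemonade" entry to an MCP client config, keeping other servers."""
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():  # tolerate an empty file
            config = json.loads(text)
    servers = config.setdefault("mcpServers", {})
    # Entry copied from the configuration shown on this page.
    servers["lemonade"] = {"command": "npx", "args": ["lemonade"]}
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Pass the path to your own config file. Claude Desktop typically keeps it under ~/Library/Application Support/Claude/ on macOS and %APPDATA%\Claude\ on Windows. Back up the file first and restart the client afterwards.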

Lemonade Examples

Client configuration

Claude Desktop MCP configuration for Lemonade. The server bridges MCP tool calls to the local Lemonade inference engine.

{
  "mcpServers": {
    "lemonade": {
      "command": "npx",
      "args": ["lemonade"]
    }
  }
}

Prompts to try

Prompts to use with Lemonade's local AI capabilities.

- "List the models I currently have downloaded in Lemonade"
- "Generate an image of a sunset over mountains using SDXL-Turbo"
- "Transcribe this audio file using Whisper-Large-v3-Turbo"
- "Chat with Qwen3.5 about my Python code and suggest optimizations"
- "What hardware backends does Lemonade detect on this machine?"

Troubleshooting Lemonade

lemond daemon does not start or port 13305 is already in use

Check if another process is occupying port 13305 with `lsof -i :13305` (macOS/Linux) or `netstat -ano | findstr 13305` (Windows). Kill the conflicting process or configure Lemonade to use a different port via its configuration file.
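
A cross-platform way to run the same check is a short TCP probe. The sketch below only reports whether something is accepting connections on the port; it does not identify the owning process, so the `lsof` or `netstat` commands are still needed for that. The function name `port_in_use` is hypothetical.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0
```

If `port_in_use(13305)` is true before the daemon has been started, another process is holding the port.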

GPU is not detected and models run slowly on CPU

Run `lemonade backends` to see which hardware backends are active. Install the GPU driver for your hardware (ROCm for AMD, CUDA for NVIDIA, or the Ryzen AI NPU driver package). Reinstall Lemonade after driver installation.

Model pull fails or is very slow

Models are downloaded from the Lemonade model registry. Ensure you have a stable internet connection and sufficient disk space (models range from 2 GB to 70 GB+). Check available space with `df -h` and retry the pull command.
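
On systems without `df`, free space can be checked from Python before starting a large download. This is a minimal sketch. The function name `has_free_space` is hypothetical, and the path to check is whichever drive Lemonade stores models on, which this page does not specify.

```python
import shutil


def has_free_space(path: str, required_gb: float) -> bool:
    """Return True if the filesystem holding `path` has at least required_gb free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024**3
```

For example, `has_free_space(".", 70)` checks whether the current drive could hold a model at the upper end of the size range mentioned above.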

Frequently Asked Questions about Lemonade

What is Lemonade?

Lemonade is a Model Context Protocol (MCP) server that helps users discover and run local AI apps by serving optimized LLMs from their own GPUs and NPUs. It connects AI assistants to external tools and data sources through a standardized interface.

How do I install Lemonade?

Follow the installation instructions on the Lemonade GitHub repository. Clone the repo, install dependencies, and add the server config to your AI client.

Which AI clients work with Lemonade?

Lemonade works with all major MCP-compatible AI clients including Claude Desktop, Claude Code, Cursor, VS Code (GitHub Copilot), Windsurf, and Cline.

Is Lemonade free to use?

Yes, Lemonade is open source and available under the Apache-2.0 license. You can use it freely in both personal and commercial projects.

Lemonade Alternatives — Similar Cloud Services Servers

Looking for alternatives to Lemonade? Here are other popular cloud services servers you can use with Claude, Cursor, and VS Code.

Open WebUI

★ 138.2k

User-friendly AI Interface (Supports Ollama, OpenAI API, ...)

Anything LLM

★ 60.4k

The all-in-one AI productivity accelerator. On device and privacy first with no annoying setup or configuration.

LocalAI

★ 46.4k

LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required.

Nacos

★ 33.0k

An easy-to-use dynamic service discovery, configuration, and service management platform for building AI cloud native applications.

Xiaozhi ESP32

★ 26.7k

Backend service for xiaozhi-esp32 that helps you quickly build an ESP32 device control server.

Gateway

★ 11.8k

A blazing fast AI Gateway with integrated guardrails. Route to 1,600+ LLMs, 50+ AI Guardrails with 1 fast & friendly API.

Browse More Cloud Services MCP Servers

Explore all cloud services servers available in the MCPgee directory. Each server includes setup guides for Claude, Cursor, and VS Code.

Set Up Lemonade in Your Editor

Choose your AI client for step-by-step setup instructions.

- Claude Desktop: macOS & Windows app
- Claude Code: CLI & terminal
- Cursor: AI-first code editor
- VS Code: GitHub Copilot MCP
- Windsurf: Codeium AI editor
- Cline: VS Code extension

Quick Config Preview

{
  "mcpServers": {
    "lemonade": {
      "command": "npx",
      "args": ["-y", "lemonade"]
    }
  }
}

Add this to your claude_desktop_config.json or .cursor/mcp.json
