SMG LLM Gateway

Name: Smg MCP Server
Author: lightseekorg

v1.0.0 • APIs • stable

Engine-agnostic LLM gateway in Rust. Full OpenAI & Anthropic API compatibility across SGLang, vLLM, TRT-LLM, OpenAI, Gemini & more. Industry-first gRPC pipeline, KV cache-aware routing, chat history, tokenization caching, Responses API, embeddings, W

Tags: anthropic, anthropic-api, chat, claude, gemini

Stars: 274


What is SMG LLM Gateway?

SMG LLM Gateway is listed on MCPgee as a Model Context Protocol (MCP) server for AI clients such as Claude, Cursor, and VS Code. It is an engine-agnostic LLM gateway written in Rust with full OpenAI and Anthropic API compatibility across SGLang, vLLM, TRT-LLM, OpenAI, Gemini, and more. Its features include a gRPC pipeline, KV cache-aware routing, chat history, and more.

This server falls under the APIs category on MCPgee, the world's largest MCP server directory with 33,000+ servers.

Features

Engine-agnostic LLM gateway in Rust. Full OpenAI & Anthropic API compatibility.

Use Cases

Route requests across multiple LLM providers

Cache tokenization and chat history

Support embeddings and inference

Maintainer: lightseekorg

License: Apache-2.0

Language: Rust

Version: v1.0.0

Updated: May 21, 2026

Status: healthy

Maintenance: active

Works with: Claude, OpenAI, Windows, macOS, Linux


Installation

Manual Installation

npx smg

Configuration

Config File: claude_desktop_config.json

Performance

Response Metrics

Response Time: < 200ms

Throughput: Medium

Resource Usage

Memory Usage: Low

CPU Usage: Low

How to Set Up and Use SMG LLM Gateway

SMG (Shepherd Model Gateway) is an engine-agnostic LLM gateway written in Rust that provides full OpenAI and Anthropic API compatibility across multiple backends including SGLang, vLLM, TRT-LLM, OpenAI, and Gemini. It features industry-first gRPC pipelines, eight routing policies (including KV cache-aware routing), pluggable chat history storage, tokenization caching, MCP tool execution support, and WASM plugins for custom extensions. Teams use it to unify LLM access across providers, reduce latency through smart routing, and add observability to AI inference workloads.

Prerequisites

Docker installed (for the easiest deployment path), or Rust toolchain for building from source
Access to at least one LLM backend (OpenAI API key, a running vLLM instance, SGLang server, or Gemini credentials)
Python 3.8+ if using the pip install method
An MCP-compatible client to interact with SMG's MCP tool execution endpoint
PostgreSQL, Oracle, or Redis (optional) if using persistent chat history storage

Pull and run the SMG Docker image

The fastest way to run SMG is via Docker. Pull the latest image and start it with your desired worker URL and routing policy.

docker pull lightseekorg/smg:latest
docker run -p 30000:30000 lightseekorg/smg:latest \
  --worker-urls http://your-llm-backend:8000 \
  --policy round_robin

(Alternative) Install via pip

If you prefer a Python install, install SMG via pip and start it from the command line.

pip install smg
smg --worker-urls http://your-llm-backend:8000 --policy cache_aware

(Alternative) Build from source with Cargo

For the best performance, build and install the Rust binary directly using Cargo.

cargo install smg

Configure routing and backends

SMG supports 8 routing policies. Specify multiple worker URLs for load balancing. Use --enable-mesh and --mesh-peer-urls for multi-node deployments.

smg \
  --worker-urls http://backend1:8000 http://backend2:8000 \
  --policy power_of_two \
  --enable-mesh \
  --mesh-advertise-host gateway1.internal
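
To make the power_of_two policy name concrete, here is a minimal Python sketch of the general "power of two choices" load-balancing technique: sample two workers at random and send the request to the one with fewer in-flight requests. This illustrates the generic idea only. It is not SMG's implementation, and the function name, the load counts, and the tie-breaking rule are assumptions made for the example.

```python
import random
from typing import Dict, Optional


def pick_power_of_two(loads: Dict[str, int], rng: Optional[random.Random] = None) -> str:
    """Sample two distinct workers at random and return the less loaded one."""
    rng = rng or random.Random()
    workers = list(loads)
    if not workers:
        raise ValueError("no workers configured")
    if len(workers) == 1:
        return workers[0]
    first, second = rng.sample(workers, 2)
    # Ties go to the first sampled worker (an arbitrary choice for this sketch).
    return first if loads[first] <= loads[second] else second


if __name__ == "__main__":
    # Hypothetical in-flight request counts per worker URL.
    loads = {"http://backend1:8000": 7, "http://backend2:8000": 2}
    print(pick_power_of_two(loads))
```

Comparing only two random candidates avoids scanning every worker on each request while still steering traffic away from the busiest backends.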

Verify the gateway is running

Send a test request to the OpenAI-compatible chat completions endpoint to confirm SMG is routing correctly.

curl http://localhost:30000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3", "messages": [{"role": "user", "content": "Hello!"}]}'
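
The same check can be scripted. The sketch below builds the request from the curl command above using only the Python standard library. The helper names build_chat_request and send_chat_request are made up for this example, and the base URL, model name, and timeout are assumptions you should adapt to your deployment.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible chat completions endpoint."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and return the decoded JSON body (requires a running gateway)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())
```

With the gateway running, send_chat_request(build_chat_request("http://localhost:30000", "llama3", "Hello!")) should return the parsed JSON response from the backend.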

Add SMG as an MCP server

Configure your MCP client to launch SMG as an MCP server. The entry below runs the smg command and passes the backend URL and routing policy as arguments.

{
  "mcpServers": {
    "smg": {
      "command": "smg",
      "args": ["--worker-urls", "http://localhost:8000", "--policy", "round_robin"],
      "env": {}
    }
  }
}
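
If your client already has a configuration file with other servers, you may prefer to merge the entry rather than paste it by hand. The following is a minimal sketch, assuming the file follows the mcpServers layout shown above. The function name add_smg_server and the sample "other" server entry are hypothetical.

```python
import json
from typing import Any, Dict


def add_smg_server(config: Dict[str, Any], worker_url: str, policy: str = "round_robin") -> Dict[str, Any]:
    """Return a copy of config with an "smg" entry added under mcpServers."""
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))
    servers["smg"] = {
        "command": "smg",
        "args": ["--worker-urls", worker_url, "--policy", policy],
        "env": {},
    }
    updated["mcpServers"] = servers
    return updated


if __name__ == "__main__":
    # Hypothetical existing configuration with one unrelated server.
    existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
    print(json.dumps(add_smg_server(existing, "http://localhost:8000"), indent=2))
```

The function leaves existing server entries untouched and does not modify the dictionary passed in, so you can inspect the result before writing it back to disk.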

SMG LLM Gateway Examples

Client configuration

MCP client configuration for using SMG as a gateway to a local LLM backend.

{
  "mcpServers": {
    "smg": {
      "command": "smg",
      "args": [
        "--worker-urls",
        "http://localhost:8000",
        "--policy",
        "cache_aware"
      ],
      "env": {}
    }
  }
}

Prompts to try

Once SMG is running, use these prompts to exercise its routing and gateway capabilities.

- "Route this chat completion request to the least-loaded backend using power_of_two policy"
- "Generate embeddings for this text using the SMG embeddings endpoint"
- "Switch to consistent_hashing routing and explain how it distributes requests"
- "Show the current routing policy and active worker backend health status"

Troubleshooting SMG LLM Gateway

SMG starts but returns 502 or connection refused when calling /v1/chat/completions

Check that every URL passed to --worker-urls points to a reachable LLM backend. Run `curl http://your-backend:8000/health` to verify that each backend is up before starting SMG.
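
With several backends, the health check can be scripted. This is a minimal sketch, assuming each backend answers GET /health with HTTP 200 when healthy, as the curl command above implies. The function names and the injectable fetch parameter are illustrative choices, not part of SMG.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable


def http_status(url: str, timeout: float = 3.0) -> int:
    """Return the HTTP status code for a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses still carry a status code.
        return err.code


def check_backends(
    worker_urls: Iterable[str],
    fetch: Callable[[str], int] = http_status,
) -> Dict[str, bool]:
    """Map each worker URL to True if its /health endpoint returned 200."""
    results = {}
    for base in worker_urls:
        try:
            results[base] = fetch(base.rstrip("/") + "/health") == 200
        except OSError:
            # URLError, timeouts, and refused connections are all OSError subclasses.
            results[base] = False
    return results
```

Calling check_backends(["http://backend1:8000", "http://backend2:8000"]) returns a dictionary you can inspect before passing the same URLs to --worker-urls.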

cache_aware routing policy not improving latency

KV cache-aware routing requires backends that report cache statistics. Ensure you are using a compatible inference engine (SGLang or vLLM with cache reporting enabled). Fall back to round_robin for backends that do not support this.
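
To see why cache-aware routing depends on cache information, consider this deliberately simplified Python sketch. It sends each prompt to the worker whose previously seen prompts share the longest prefix with it, on the assumption that a shared prefix means reusable KV cache entries. This is a generic illustration, not SMG's algorithm. The function names and the cached dictionary are hypothetical stand-ins for whatever statistics a real backend reports.

```python
import os
from typing import Dict, List


def shared_prefix_len(a: str, b: str) -> int:
    """Length of the common leading substring of a and b."""
    return len(os.path.commonprefix([a, b]))


def pick_cache_aware(prompt: str, cached: Dict[str, List[str]]) -> str:
    """Return the worker whose cached prompts share the longest prefix with prompt."""
    if not cached:
        raise ValueError("no workers configured")

    def best_match(worker: str) -> int:
        return max((shared_prefix_len(prompt, p) for p in cached[worker]), default=0)

    # max() returns the first worker in insertion order when scores tie.
    return max(cached, key=best_match)
```

When no worker has a matching prefix, every score is zero and the choice is arbitrary. In that situation a cache-aware policy has nothing to work with, which is consistent with the advice to fall back to round_robin for such backends.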

WASM plugins fail to load

WASM plugins must be compiled for the wasm32-wasi target (named wasm32-wasip1 in recent Rust toolchains). Verify that the plugin file path is correct and readable by the SMG process. Run SMG with RUST_LOG=debug to get detailed plugin loading errors in the logs.

Frequently Asked Questions about SMG LLM Gateway

What is SMG LLM Gateway?

SMG LLM Gateway is listed as a Model Context Protocol (MCP) server. It is an engine-agnostic LLM gateway written in Rust with full OpenAI and Anthropic API compatibility across SGLang, vLLM, TRT-LLM, OpenAI, Gemini, and more. Its features include what the project describes as an industry-first gRPC pipeline, KV cache-aware routing, chat history, tokenization caching, the Responses API, and embeddings. As an MCP server, it connects AI assistants to external tools and data sources through a standardized interface.

How do I install SMG LLM Gateway?

Follow the installation instructions on the SMG LLM Gateway GitHub repository. As described in the setup guide above, you can run the Docker image, install with pip, or build with Cargo, then add the server config to your AI client.

Which AI clients work with SMG LLM Gateway?

SMG LLM Gateway works with all major MCP-compatible AI clients including Claude Desktop, Claude Code, Cursor, VS Code (GitHub Copilot), Windsurf, and Cline.

Is SMG LLM Gateway free to use?

Yes, SMG LLM Gateway is open source and available under the Apache-2.0 license. You can use it freely in both personal and commercial projects.


SMG LLM Gateway Alternatives — Similar APIs Servers

Looking for alternatives to SMG LLM Gateway? Here are other popular APIs servers you can use with Claude, Cursor, and VS Code.

Kong

★ 43.4k

🦍 The API and AI Gateway

API Mega List

★ 5.4k

This GitHub repo is a powerhouse collection of APIs you can start using immediately to build everything from simple automations to full-scale applications. One of the most valuable API lists on GitHub—period. 💪

Fetch

★ 5.4k

Fetch web content and convert to markdown for AI consumption

Fusio

★ 2.1k

Self-Hosted API Management for Builders

Korean Law

★ 1.8k

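National Law Information MCP v4.0 | 41 Korean legal APIs from the Ministry of Government Legislation → 17 MCP tools. Search of statutes, case law, and local ordinances + citation verification to prevent LLM hallucination + article impact graph (impact_map) + automatic point-in-time comparison diff (time_travel) + 5-step citizen action guide (action_plan)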

RuleGo

★ 1.5k

⛓️RuleGo is a lightweight, high-performance, embedded, next-generation component orchestration rule engine framework for Go.


Set Up SMG LLM Gateway in Your Editor

Choose your AI client for step-by-step setup instructions.

- Claude Desktop: macOS & Windows app
- Claude Code: CLI & terminal
- Cursor: AI-first code editor
- VS Code: GitHub Copilot MCP
- Windsurf: Codeium AI editor
- Cline: VS Code extension

Quick Config Preview

{
  "mcpServers": {
    "smg": {
      "command": "npx",
      "args": ["-y", "smg"]
    }
  }
}

Add this to your claude_desktop_config.json or .cursor/mcp.json
