Luma

v1.3.8 | Data Science & ML | stable

A multi-model visual understanding MCP server supporting GLM-4.6V, DeepSeek-OCR (free), and Qwen3-VL-Flash. It provides visual processing capabilities for AI coding models that do not support image understanding.

Tags: image-understanding, mcp, mcp-server, model-context-protocol, vision
Stars: 59
Downloads: 0
Weekly: 0
0/5

What is Luma?

Luma is a Model Context Protocol (MCP) server that gives AI assistants such as Claude, Cursor, and VS Code multi-model visual understanding through GLM-4.6V, DeepSeek-OCR (free), and Qwen3-VL-Flash. It provides visual processing capabilities for AI coding models that do not support image understanding.


This server falls under the Data Science & ML category on MCPgee, the world's largest MCP server directory with 33,000+ servers.

Features

  • Multi-model visual understanding through GLM-4.6V, DeepSeek-OCR (free), and Qwen3-VL-Flash

Use Cases

Perform visual understanding with multiple model options.
Extract text from images using DeepSeek-OCR or other vision models.
Enable image analysis for AI coding models lacking native vision support.

Maintainer: JochenYang
License: MIT
Language: TypeScript
Version: v1.3.8
Updated: May 13, 2026
Status: healthy
Maintenance: active

Works with

Claude, OpenAI, Windows, macOS, Linux

Installation

NPM

npx -y luma-mcp


Configuration

Configuration Details

Config File: claude_desktop_config.json

Performance

Response Metrics

Response Time: < 200ms
Throughput: Medium

Resource Usage

Memory Usage: Low
CPU Usage: Low

How to Set Up and Use Luma

Luma MCP is a multi-model visual understanding server that adds image analysis capabilities to AI coding assistants that lack native vision support. It routes image understanding requests to your choice of visual model — including GLM-4.6V via Zhipu AI, DeepSeek-OCR for free OCR tasks via SiliconFlow, Qwen3-VL-Flash via DashScope, and others — through a single unified `image_understand` tool that accepts local file paths, HTTPS URLs, or base64 data URIs.
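
Of the three accepted input forms, the base64 data URI is the least obvious to produce by hand. The sketch below shows one way to build one from a local image file in Python. The helper name `to_data_uri` is hypothetical and not part of Luma; the output follows the standard `data:<mime>;base64,<payload>` layout, and it is an assumption that Luma accepts any image MIME type produced this way.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_uri(image_path: str) -> str:
    """Encode a local image file as a base64 data URI (hypothetical helper)."""
    path = Path(image_path)
    # Guess the MIME type from the file extension; fall back to a generic type.
    mime_type, _ = mimetypes.guess_type(path.name)
    if mime_type is None:
        mime_type = "application/octet-stream"
    # Read the raw bytes and base64-encode them as ASCII text.
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

For a file named screenshot.png, the result starts with `data:image/png;base64,`. For large images, a file path or HTTPS URL is usually more practical, since the encoded string is roughly a third larger than the file.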

Prerequisites

  • Node.js 18 or higher installed
  • An API key for at least one supported vision provider: Zhipu AI (ZHIPU_API_KEY), SiliconFlow (SILICONFLOW_API_KEY for free DeepSeek-OCR), Alibaba DashScope (DASHSCOPE_API_KEY for Qwen), Volcengine, or Hunyuan
  • An MCP-compatible client such as Claude Desktop, Cursor, or Claude Code
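
The first two prerequisites can be checked with a short script before going further. This is a minimal sketch: the function names are hypothetical, and only the three key variables named above are checked, because the variable names for Volcengine and Hunyuan are not given here.

```python
import os
import re
from typing import List, Mapping, Optional

# Key variables named in the prerequisites. Volcengine and Hunyuan are left
# out because their variable names are not listed in this guide.
KNOWN_KEY_VARS = ["ZHIPU_API_KEY", "SILICONFLOW_API_KEY", "DASHSCOPE_API_KEY"]


def node_major_version(version_output: str) -> int:
    """Parse the major version from `node --version` output such as 'v18.17.0'."""
    match = re.match(r"v?(\d+)\.", version_output.strip())
    if match is None:
        raise ValueError(f"unrecognized Node.js version string: {version_output!r}")
    return int(match.group(1))


def configured_key_vars(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the known key variables that are set to a non-empty value."""
    env = os.environ if env is None else env
    return [name for name in KNOWN_KEY_VARS if env.get(name)]
```

To use it, pass the text printed by `node --version` to `node_major_version` and confirm the result is at least 18, then call `configured_key_vars()` and confirm the list is not empty.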
1. Choose your vision model provider

Decide which visual model provider to use. SiliconFlow with DeepSeek-OCR is free and good for text extraction. Zhipu GLM-4.6V and Qwen3-VL-Flash are strong general-purpose vision models.

2. Obtain an API key from your chosen provider

Sign up and get an API key: Zhipu AI at https://open.bigmodel.cn, SiliconFlow at https://cloud.siliconflow.cn (free tier available), Alibaba DashScope at https://dashscope.aliyuncs.com.

3. Test the server with npx

Verify the package runs correctly by launching it with your chosen provider's environment variables.

MODEL_PROVIDER=siliconflow SILICONFLOW_API_KEY=your_key npx -y luma-mcp
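
The inline `VAR=value command` form above is POSIX shell syntax and does not work in the default Windows shells. As a cross-platform alternative, the same launch can be sketched in Python. The function names are hypothetical, and it is an assumption that the server communicates over stdin/stdout, which is how MCP clients normally drive a server started from a `command` entry.

```python
import os
import shutil
import subprocess


def build_launch(provider, key_var, api_key, base_env=None):
    """Return (command, environment) for starting luma-mcp through npx."""
    env = dict(os.environ if base_env is None else base_env)
    env["MODEL_PROVIDER"] = provider
    env[key_var] = api_key
    # shutil.which finds npx.cmd on Windows; fall back to the bare name.
    npx = shutil.which("npx") or "npx"
    return [npx, "-y", "luma-mcp"], env


def launch(provider, key_var, api_key):
    """Start the server as a child process and return the Popen handle."""
    command, env = build_launch(provider, key_var, api_key)
    # Assumed stdio transport: the process waits for MCP messages on stdin,
    # so it keeps running until it is terminated or its pipes are closed.
    return subprocess.Popen(
        command, env=env, stdin=subprocess.PIPE, stdout=subprocess.PIPE
    )
```

Calling `launch("siliconflow", "SILICONFLOW_API_KEY", "your_key")` mirrors the shell command above. A process that stays alive without printing an error is a reasonable sign the package starts correctly; call `terminate()` on the returned handle to stop it.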
4. Configure Claude Desktop or your MCP client

Add the server to your client's config file with the appropriate environment variables for your chosen provider.

5. Restart your MCP client

Close and reopen your client to load the new MCP server. The image_understand tool will now be available.

Luma Examples

Client configuration

Example using SiliconFlow with DeepSeek-OCR (free). Swap MODEL_PROVIDER and the API key env var for other providers.

{
  "mcpServers": {
    "luma-mcp": {
      "command": "npx",
      "args": ["-y", "luma-mcp"],
      "env": {
        "MODEL_PROVIDER": "siliconflow",
        "SILICONFLOW_API_KEY": "your_siliconflow_api_key"
      }
    }
  }
}
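
If the client config file already lists other MCP servers, pasting the block above over it would drop them. The following sketch merges the Luma entry into an existing file instead. The helper `add_luma_server` is hypothetical, and the config file location is left to the caller because it varies by client and operating system.

```python
import json
from pathlib import Path


def add_luma_server(config_path: str, env: dict, name: str = "luma-mcp") -> dict:
    """Add or update the Luma entry in an MCP client config, keeping other servers."""
    path = Path(config_path)
    # Start from the existing config when present so other servers are preserved.
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers[name] = {
        "command": "npx",
        "args": ["-y", "luma-mcp"],
        "env": dict(env),
    }
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

A call such as `add_luma_server(path, {"MODEL_PROVIDER": "siliconflow", "SILICONFLOW_API_KEY": "your_siliconflow_api_key"})` produces the same entry as the example above. Back up the config file first, since the script rewrites it in place.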

Prompts to try

Use these prompts in Claude to analyze images via Luma MCP.

- "Use image_understand on ./screenshot.png to describe the UI layout and main components"
- "Extract all text from this image: ./invoice.jpg"
- "Analyze the error in this screenshot and suggest a fix: ./error_screen.png"
- "What accessibility issues do you see in this UI screenshot? https://example.com/design.png"
- "Parse the table data from this image: ./data_table.png"

Troubleshooting Luma

image_understand returns an error about unsupported model or invalid API key

Verify that MODEL_PROVIDER matches exactly one of the supported values: zhipu, siliconflow, qwen, volcengine, hunyuan, or custom. Then confirm the corresponding API key variable is set correctly (e.g., SILICONFLOW_API_KEY for siliconflow).
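
These two checks can be automated against the `env` block of your client config. The sketch below is illustrative only: the function name is hypothetical, the pairing of `qwen` with DASHSCOPE_API_KEY is inferred from the prerequisites, and an unset MODEL_PROVIDER is reported as a problem even though the server may define a default.

```python
SUPPORTED_PROVIDERS = ("zhipu", "siliconflow", "qwen", "volcengine", "hunyuan", "custom")

# Key variables named in this guide. The qwen pairing is inferred; the other
# providers are omitted because their variable names are not given here.
KEY_VAR_BY_PROVIDER = {
    "zhipu": "ZHIPU_API_KEY",
    "siliconflow": "SILICONFLOW_API_KEY",
    "qwen": "DASHSCOPE_API_KEY",
}


def check_provider_env(env):
    """Return a list of human-readable problems found in an env mapping."""
    problems = []
    provider = env.get("MODEL_PROVIDER", "")
    if provider not in SUPPORTED_PROVIDERS:
        hint = ""
        # Case or whitespace differences are a common cause of a mismatch.
        if provider.strip().lower() in SUPPORTED_PROVIDERS:
            hint = f" (did you mean {provider.strip().lower()!r}?)"
        problems.append(f"MODEL_PROVIDER={provider!r} is not a supported value{hint}")
        return problems
    key_var = KEY_VAR_BY_PROVIDER.get(provider)
    if key_var is not None and not env.get(key_var, "").strip():
        problems.append(f"{key_var} is missing or empty for provider {provider!r}")
    return problems
```

An empty list means both checks passed. A valid key can still be rejected by the provider, for example if it has expired, so treat this as a first-pass check only.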

Local file path images fail to load

Use absolute file paths rather than relative ones. The MCP server process may have a different working directory than your project. For example, use /Users/yourname/project/screenshot.png instead of ./screenshot.png.
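
When many images are referenced relative to a project, a small helper can do the conversion. This is a minimal sketch using Python's standard `pathlib`; the function name `to_absolute` and the `project_dir` parameter are hypothetical.

```python
from pathlib import Path


def to_absolute(image_path: str, project_dir: str) -> str:
    """Resolve an image path against a project directory and return it as absolute."""
    path = Path(image_path).expanduser()
    # Relative paths are interpreted relative to the project, not the
    # current working directory of whichever process runs this code.
    if not path.is_absolute():
        path = Path(project_dir) / path
    return str(path.resolve())
```

For example, `to_absolute("./screenshot.png", "/Users/yourname/project")` yields an absolute path ending in `project/screenshot.png` that can be pasted into the prompt.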

Response is truncated for images with a lot of text

Increase the MAX_TOKENS environment variable from its default of 8192. For dense documents, set MAX_TOKENS=16384 or higher, depending on what the chosen model supports.
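
In a client config, the setting goes in the same `env` block as the provider and API key. The sketch below adds it to an existing env mapping; the helper name is hypothetical, and the value is written as a string because environment variables are text.

```python
DEFAULT_MAX_TOKENS = 8192  # default stated in this guide


def with_max_tokens(env: dict, max_tokens: int) -> dict:
    """Return a copy of an env mapping with MAX_TOKENS set."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be a positive integer")
    updated = dict(env)
    # Environment variable values are strings, so convert the number.
    updated["MAX_TOKENS"] = str(max_tokens)
    return updated
```

For instance, `with_max_tokens({"MODEL_PROVIDER": "siliconflow"}, 16384)` returns an env block with `"MAX_TOKENS": "16384"` added. Whether the model honors the higher limit depends on the provider.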

Frequently Asked Questions about Luma

What is Luma?

Luma is a Model Context Protocol (MCP) server that provides multi-model visual understanding through GLM-4.6V, DeepSeek-OCR (free), and Qwen3-VL-Flash, giving visual processing capabilities to AI coding models that do not support image understanding. It connects AI assistants to external tools and data sources through a standardized interface.

How do I install Luma?

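Run the server with npx, which downloads the package from npm on demand: npx -y luma-mcp. Then add the server configuration to your AI client's JSON config file (e.g., claude_desktop_config.json or .cursor/mcp.json), including MODEL_PROVIDER and the API key for your chosen provider.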

Which AI clients work with Luma?

Luma works with all major MCP-compatible AI clients including Claude Desktop, Claude Code, Cursor, VS Code (GitHub Copilot), Windsurf, and Cline.

Is Luma free to use?

Yes, Luma is open source and available under the MIT license. You can use it freely in both personal and commercial projects.


Quick Config Preview

{
  "mcpServers": {
    "luma": {
      "command": "npx",
      "args": ["-y", "luma-mcp"]
    }
  }
}

Add this to your claude_desktop_config.json or .cursor/mcp.json
