Gemini OCR

v1.0.0 | Search & Data Extraction | stable

A FastMCP-based OCR server powered by Google Gemini. Handles both image file paths and base64‑encoded images to return extracted text. Easy to integrate into MCP workflows — just set your Gemini API key and model, run the server, and call 'ocr_image_

Tags: gemini-ocr, mcp, ai-integration
Stars: 5, Downloads: 0, Weekly: 0
0/5

What is Gemini OCR?

Gemini OCR is a FastMCP-based Model Context Protocol (MCP) server that lets AI assistants like Claude, Cursor, and VS Code run OCR powered by Google Gemini. It handles both image file paths and base64-encoded images and returns the extracted text. It is easy to integrate into MCP workflows: just set your Gemini API key...

This server falls under the Search & Data Extraction category on MCPgee, the world's largest MCP server directory with 33,000+ servers.

Features

  • A FastMCP-based OCR server powered by Google Gemini.
  • Handles both image file paths and base64-encoded images to return extracted text.

Use Cases

Extract text from images using Google Gemini OCR.
Process image files and base64-encoded data for text extraction.
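
As a rough illustration of the base64 input path, the sketch below shows one way a client could turn an image file into a base64 string before handing it to an OCR tool. The helper name `encode_image_to_base64` is hypothetical and not part of Gemini OCR; the listing truncates the server's tool name, so the sketch stops at preparing the payload and does not call the server.

```python
import base64
from pathlib import Path


def encode_image_to_base64(image_path: str) -> str:
    """Read an image file and return its bytes as a base64 ASCII string.

    Hypothetical helper for illustration; not part of the Gemini OCR server.
    """
    raw_bytes = Path(image_path).read_bytes()
    return base64.b64encode(raw_bytes).decode("ascii")


def decode_base64_image(encoded: str) -> bytes:
    """Reverse the encoding, e.g. to check that a payload round-trips."""
    return base64.b64decode(encoded)
```

Whether the server expects a bare base64 string or a data URI prefix is not stated in this listing, so check the project's repository before relying on either form.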

Maintainer: WindoC
License: MIT
Language: Python
Version: v1.0.0
Updated: Apr 28, 2026
Status: healthy
Maintenance: active

Works with

Claude, OpenAI, Windows, macOS, Linux

Installation

Manual Installation

npx gemini-ocr

Configuration

Configuration Details

Config File: claude_desktop_config.json

Performance

Response Metrics

Response Time: < 200 ms
Throughput: Medium

Resource Usage

Memory Usage: Low
CPU Usage: Low

Frequently Asked Questions about Gemini OCR

What is Gemini OCR?

Gemini OCR is a FastMCP-based Model Context Protocol (MCP) server for OCR, powered by Google Gemini. It handles both image file paths and base64-encoded images and returns the extracted text. It is easy to integrate into MCP workflows: set your Gemini API key and model, run the server, and call 'ocr_image_... Like other MCP servers, it connects AI assistants to external tools and data sources through a standardized interface.

How do I install Gemini OCR?

Follow the installation instructions on the Gemini OCR GitHub repository. Clone the repo, install dependencies, and add the server config to your AI client.

Which AI clients work with Gemini OCR?

Gemini OCR works with all major MCP-compatible AI clients including Claude Desktop, Claude Code, Cursor, VS Code (GitHub Copilot), Windsurf, and Cline.

Is Gemini OCR free to use?

Yes, Gemini OCR is open source and available under the MIT license. You can use it freely in both personal and commercial projects.

Quick Config Preview

{
  "mcpServers": {
    "gemini-ocr": {
      "command": "npx",
      "args": ["-y", "gemini-ocr"]
    }
  }
}

Add this to your claude_desktop_config.json or .cursor/mcp.json
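
If the config file already lists other servers, the entry has to be merged in rather than pasted over the file. The sketch below is one possible way to do that with the standard library, using the entry exactly as shown in the preview above. The function name `add_gemini_ocr_server` is hypothetical, and the path to your config file is something you supply, since its location differs by client and operating system.

```python
import json
from pathlib import Path

# Server entry copied from the Quick Config Preview in this listing.
GEMINI_OCR_ENTRY = {"command": "npx", "args": ["-y", "gemini-ocr"]}


def add_gemini_ocr_server(config_path: str) -> dict:
    """Merge the gemini-ocr entry into an MCP client config file.

    Hypothetical helper: existing servers are kept, and the file is
    created if it does not exist yet.
    """
    path = Path(config_path)
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    config = json.loads(text) if text.strip() else {}

    # Create the "mcpServers" section if missing, then add or overwrite our entry.
    servers = config.setdefault("mcpServers", {})
    servers["gemini-ocr"] = dict(GEMINI_OCR_ENTRY)

    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Back up the existing file before running something like this, and restart the client afterwards so it picks up the new server.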