MCP Data

v1.0.0 | Databases | stable

DuckDB-based MCP server for datasets

Tags: mcp-data, mcp, ai-integration

What is MCP Data?

MCP Data is a DuckDB-based Model Context Protocol (MCP) server that lets AI assistants and MCP clients such as Claude, Cursor, and VS Code query datasets.

This server is listed under the Databases and Data Science & ML categories on MCPgee.

Features

  • DuckDB-based MCP server for datasets

Use Cases

Query datasets using DuckDB through MCP.
Perform data analysis and manipulation without raw SQL.

Maintainer: boettiger-lab
License: MIT
Language: Python
Version: v1.0.0
Updated: May 16, 2026
Status: healthy
Maintenance: active

Works with

Claude, OpenAI, Windows, macOS, Linux

Installation

Manual Installation

npx mcp-data

Configuration

Configuration Details

Config File: claude_desktop_config.json

Performance

Response Metrics

Response Time: < 200ms
Throughput: Medium

Resource Usage

Memory Usage: Low
CPU Usage: Low

How to Set Up and Use MCP Data

MCP Data Server is a DuckDB-backed MCP server that gives AI assistants direct, SQL-free access to large geospatial and scientific datasets stored as cloud-native Parquet files on S3. It integrates with a STAC (SpatioTemporal Asset Catalog) to allow discovery of available datasets, fetching of schema information, and execution of analytical DuckDB queries over petabyte-scale data without downloading files locally. Data scientists and researchers working with remote sensing, ecology, or earth-observation data can use it to run ad-hoc analyses via natural language inside Claude or Claude Code.
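
To make the query path more concrete, here is a minimal Python sketch of the kind of SQL such a server could pass to DuckDB for Parquet files on S3. The helper name build_count_query, the bucket path, and the column name are hypothetical and are not taken from the server's code; only DuckDB's read_parquet table function is assumed to exist as used here.

```python
def build_count_query(parquet_url: str, group_col: str, limit: int = 10) -> str:
    """Return DuckDB SQL that counts rows per group in remote Parquet files.

    parquet_url and group_col are illustrative placeholders; real dataset
    paths and column names would come from the catalog and schema tools.
    """
    # The column name is interpolated into SQL, so accept plain identifiers only.
    if not group_col.isidentifier():
        raise ValueError(f"unsafe column name: {group_col!r}")
    # Double any single quotes so the URL stays one SQL string literal.
    safe_url = parquet_url.replace("'", "''")
    return (
        f"SELECT {group_col}, COUNT(*) AS n\n"
        f"FROM read_parquet('{safe_url}')\n"
        f"GROUP BY {group_col}\n"
        f"ORDER BY n DESC\n"
        f"LIMIT {int(limit)}"
    )
```

Calling build_count_query("s3://example-bucket/occurrences/*.parquet", "countrycode") returns a query string that DuckDB can run against the remote files without downloading them first, provided the path and column exist.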

Prerequisites

  • Access to the hosted service at https://duckdb-mcp.nrp-nautilus.io/mcp or a local Python 3 environment for self-hosting
  • Claude Code CLI or another MCP-compatible client that supports HTTP transport
  • pip and Python 3 (only needed for local self-hosted deployment)

1. Add the hosted server to Claude Code (quickest start)

The server is available as a public hosted endpoint. Add it to your Claude Code project with a single command; no local install is required.

claude mcp add --transport http duckdb-geo https://duckdb-mcp.nrp-nautilus.io/mcp

2. Make it available across all projects (optional)

To use the server in every Claude Code session rather than just the current project, add the --scope user flag.

claude mcp add --scope user --transport http duckdb-geo https://duckdb-mcp.nrp-nautilus.io/mcp

3. Alternative: self-host the server locally

Clone the repository and install its dependencies to run your own instance, for example to serve private S3 datasets.

git clone https://github.com/boettiger-lab/mcp-data-server.git
cd mcp-data-server
pip install -r requirements.txt
python server.py

4. Add VS Code / Claude Desktop HTTP configuration

For IDE or Claude Desktop access, add the server endpoint to your mcp.json or claude_desktop_config.json.

MCP Data Examples

Client configuration

VS Code .vscode/mcp.json configuration that connects to the hosted DuckDB geospatial MCP endpoint over HTTP.

{
  "servers": {
    "duckdb-geo": {
      "url": "https://duckdb-mcp.nrp-nautilus.io/mcp"
    }
  }
}
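
If you would rather script this step than edit the file by hand, the following Python sketch adds the same entry to an mcp.json file while keeping any servers already listed. The function name add_server is hypothetical, and the sketch assumes the file is strict JSON with no comments.

```python
import json
from pathlib import Path

HOSTED_URL = "https://duckdb-mcp.nrp-nautilus.io/mcp"


def add_server(config_path: Path, name: str = "duckdb-geo", url: str = HOSTED_URL) -> dict:
    """Add or update one entry under "servers" and write the file back."""
    config = {}
    if config_path.exists():
        # Assumes strict JSON; a file containing comments would fail to parse here.
        config = json.loads(config_path.read_text(encoding="utf-8"))
    # setdefault keeps any servers that are already configured.
    config.setdefault("servers", {})[name] = {"url": url}
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, add_server(Path(".vscode") / "mcp.json") creates or updates the workspace file shown above.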

Prompts to try

Example prompts that use the browse, schema, and query tools to explore and analyze geospatial datasets.

- "List all available datasets in the catalog"
- "What is the schema for the landsat-c2l2-sr dataset?"
- "Query the GBIF species occurrence dataset to count observations by country for 2023"
- "Find the top 10 locations with the highest NDVI values in the Sentinel-2 dataset for California"
- "Run a DuckDB SQL query to aggregate monthly precipitation totals from the ERA5 climate dataset"

Troubleshooting MCP Data

Connection refused or timeout reaching the hosted endpoint

The hosted server at duckdb-mcp.nrp-nautilus.io requires internet access. Check your network connection and firewall. For offline use, self-host by running python server.py locally and point your client at http://localhost:8000/mcp.
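
To tell a network problem apart from a client misconfiguration, a small standard-library probe can help. The sketch below sends an MCP initialize request over HTTP and reports whether anything answered. The function names are hypothetical, and the protocolVersion value is an assumption; substitute the version your client uses.

```python
import json
import urllib.error
import urllib.request


def build_initialize_request(url: str) -> urllib.request.Request:
    """Build an MCP "initialize" JSON-RPC request for an HTTP endpoint."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",  # assumed value, adjust as needed
            "capabilities": {},
            "clientInfo": {"name": "connectivity-check", "version": "0.0.1"},
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


def probe(url: str, timeout: float = 10.0) -> str:
    """Return a short status string describing whether the endpoint answered."""
    try:
        with urllib.request.urlopen(build_initialize_request(url), timeout=timeout) as resp:
            return f"reachable (HTTP {resp.status})"
    except urllib.error.HTTPError as exc:
        # The server answered, so the network path works even if it rejected the request.
        return f"reachable but returned HTTP {exc.code}"
    except OSError as exc:
        # Covers DNS failures, refused connections, and timeouts.
        return f"unreachable: {exc}"
```

Run probe("https://duckdb-mcp.nrp-nautilus.io/mcp") for the hosted endpoint, or pass http://localhost:8000/mcp for a local instance. An "unreachable" result points to the network or firewall rather than to the client configuration.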

DuckDB query returns an error about missing S3 credentials

Public datasets on the hosted server do not require credentials. For private S3 buckets in a self-hosted deployment, set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY in the server environment before starting python server.py.
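
A quick pre-flight check can confirm that both variables are visible to the process that will start the server. This is a minimal sketch; the helper name missing_credentials is hypothetical and is not part of the server.

```python
import os

REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_credentials(env=None) -> list:
    """Return the names of required AWS variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling missing_credentials() in the shell session where you plan to run python server.py returns an empty list when both variables are set, and otherwise lists the ones still missing.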

Query runs slowly on large datasets

In a local deployment, raise the THREADS environment variable (default 100) so DuckDB can use more CPU cores, for example by running export THREADS=200 before starting server.py.
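
How the server consumes THREADS internally is not documented here. As an illustration only, the sketch below assumes the value is mapped onto DuckDB's threads setting and shows how such a variable could be read, validated, and turned into a SET statement. The function name thread_setting is hypothetical.

```python
import os


def thread_setting(env=None, default: int = 100) -> str:
    """Return a DuckDB SET statement built from the THREADS variable.

    Assumption: THREADS maps to DuckDB's "threads" setting. The real
    server may apply the value differently.
    """
    env = os.environ if env is None else env
    threads = int(env.get("THREADS", str(default)))  # ValueError on non-numeric input
    if threads < 1:
        raise ValueError("THREADS must be a positive integer")
    return f"SET threads = {threads}"
```

With THREADS unset the helper falls back to the default of 100 mentioned above; with THREADS=200 it returns a statement requesting 200 threads.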

Frequently Asked Questions about MCP Data

What is MCP Data?

MCP Data is a DuckDB-based Model Context Protocol (MCP) server for datasets. It connects AI assistants to external tools and data sources through a standardized interface.

How do I install MCP Data?

Follow the installation instructions on the MCP Data GitHub repository. Clone the repo, install dependencies, and add the server config to your AI client.

Which AI clients work with MCP Data?

MCP Data works with all major MCP-compatible AI clients including Claude Desktop, Claude Code, Cursor, VS Code (GitHub Copilot), Windsurf, and Cline.

Is MCP Data free to use?

Yes, MCP Data is open source and available under the MIT license. You can use it freely in both personal and commercial projects.

Quick Config Preview

{ "mcpServers": { "mcp-data": { "command": "npx", "args": ["-y", "mcp-data"] } } }

Add this to your claude_desktop_config.json or .cursor/mcp.json
