DataHub

v1.0.0 · Databases · stable

The official Model Context Protocol (MCP) server for DataHub (https://datahub.com)

Tags: datahub, mcp, ai-integration
Stars: 74
Downloads: 0 (0 weekly)
Rating: 0/5

What is DataHub?

The DataHub MCP server is the official Model Context Protocol (MCP) server for DataHub (https://datahub.com). It allows AI assistants such as Claude, Cursor, and VS Code to access the DataHub data catalog.

This server falls under the Databases category on MCPgee, the world's largest MCP server directory with 33,000+ servers.

Features

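  • Official Model Context Protocol (MCP) server for DataHub
  • Search datasets by keyword and explore schema fields
  • Trace upstream and downstream data lineage
  • Retrieve real production SQL queries
  • Opt-in mutation tools for managing tags, terms, owners, and descriptions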

Use Cases

Access the DataHub data catalog through the official MCP server.

Maintainer: acryldata

License: Apache-2.0
Language: Python
Version: v1.0.0
Updated: May 15, 2026
Status: healthy
Maintenance: active

Works with

Claude, OpenAI, Windows, macOS, Linux

Installation

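uvx

The server is distributed as a Python package and run with uvx, not npm. DATAHUB_GMS_URL and DATAHUB_GMS_TOKEN must be set in the environment (see the setup steps in this guide).

uvx mcp-server-datahub@latest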

Configuration

Config file: claude_desktop_config.json

Performance

Response Metrics

Response Time: < 200 ms
Throughput: Medium

Resource Usage

Memory Usage: Low
CPU Usage: Low

How to Set Up and Use DataHub

The official DataHub MCP Server connects AI assistants to the DataHub data catalog, giving them the ability to search datasets by keyword, explore schema fields, trace upstream and downstream data lineage, retrieve real production SQL queries, and manage metadata such as tags, terms, owners, and descriptions. It supports both self-hosted DataHub instances and DataHub Cloud, and exposes opt-in mutation tools for teams that want AI to help keep the catalog accurate. Data engineers and analysts use it to answer questions about their data estate and automate catalog maintenance directly from their AI assistant.

Prerequisites

  • A DataHub instance (self-hosted, for example at http://localhost:8080, or a DataHub Cloud tenant) with API access
  • A DataHub personal access token generated from your DataHub Settings → Access Tokens page
  • uv and uvx installed (curl -LsSf https://astral.sh/uv/install.sh | sh)
  • An MCP-compatible client such as Claude Desktop or Claude Code

1. Install uv and locate uvx

The DataHub MCP server is distributed as a Python package run via uvx. Install uv first, then note the full path to uvx for the configuration step.

curl -LsSf https://astral.sh/uv/install.sh | sh
which uvx
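
For scripted setups, the same lookup can be done from Python. This is a minimal sketch, assuming uv is already installed. `find_uvx` is a hypothetical helper name, and the fallback directories are only common install locations, not an exhaustive list.

```python
import os
import shutil
from typing import Iterable, Optional


def find_uvx(
    extra_dirs: Iterable[str] = ("~/.local/bin", "~/.cargo/bin", "/usr/local/bin"),
) -> Optional[str]:
    """Return an absolute path to uvx, or None if it cannot be found."""
    # First try the current PATH, which is what `which uvx` does.
    found = shutil.which("uvx")
    if found:
        return os.path.abspath(found)
    # Fall back to a few common install directories (assumed, not exhaustive).
    for directory in extra_dirs:
        candidate = os.path.join(os.path.expanduser(directory), "uvx")
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    print(find_uvx() or "uvx not found; install uv first")
```

The printed path is the value to use as "command" in the configuration step.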

2. Generate a DataHub access token

Log in to your DataHub instance, navigate to Settings → Access Tokens, and create a new personal access token. Copy the token value; you will need it in the next step.

3. Configure Claude Desktop for self-hosted DataHub

Add the server to claude_desktop_config.json using the full uvx path and your DataHub URL and token.

{
  "mcpServers": {
    "datahub": {
      "command": "/full/path/to/uvx",
      "args": ["mcp-server-datahub@latest"],
      "env": {
        "DATAHUB_GMS_URL": "http://localhost:8080",
        "DATAHUB_GMS_TOKEN": "your_datahub_token"
      }
    }
  }
}
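
If claude_desktop_config.json already lists other servers, editing it by hand risks overwriting them. The sketch below shows one way to add the entry above while keeping the rest of the file. `build_datahub_entry` and `merge_into_config` are hypothetical helper names, and the config file location is left to the caller because it varies by platform.

```python
import json
from pathlib import Path


def build_datahub_entry(uvx_path: str, gms_url: str, token: str) -> dict:
    """Build the "datahub" server entry shown in the JSON above."""
    return {
        "command": uvx_path,
        "args": ["mcp-server-datahub@latest"],
        "env": {"DATAHUB_GMS_URL": gms_url, "DATAHUB_GMS_TOKEN": token},
    }


def merge_into_config(config_path: Path, entry: dict) -> dict:
    """Add or replace mcpServers.datahub, keeping any other servers intact."""
    text = config_path.read_text() if config_path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    config.setdefault("mcpServers", {})["datahub"] = entry
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


if __name__ == "__main__":
    # Print the entry only; call merge_into_config with your own config path.
    entry = build_datahub_entry(
        "/full/path/to/uvx", "http://localhost:8080", "your_datahub_token"
    )
    print(json.dumps({"mcpServers": {"datahub": entry}}, indent=2))
```

Back up the existing config file before running a merge like this against it.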

4. Enable optional mutation tools (if needed)

By default, metadata modifications are disabled. Set TOOLS_IS_MUTATION_ENABLED=true in the env block to allow the AI to add or remove tags, terms, and owners, and to update descriptions. The example below also sets SEMANTIC_SEARCH_ENABLED to true.

{
  "mcpServers": {
    "datahub": {
      "command": "/full/path/to/uvx",
      "args": ["mcp-server-datahub@latest"],
      "env": {
        "DATAHUB_GMS_URL": "http://localhost:8080",
        "DATAHUB_GMS_TOKEN": "your_datahub_token",
        "TOOLS_IS_MUTATION_ENABLED": "true",
        "SEMANTIC_SEARCH_ENABLED": "true"
      }
    }
  }
}
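
Note that the flag values in the env block are JSON strings ("true"), not JSON booleans. When generating the config from a script, a small helper can make that conversion explicit. This is a sketch only; `with_flags` is a hypothetical name, and the flag names are the ones used in the example above.

```python
def with_flags(entry: dict, **flags: bool) -> dict:
    """Return a copy of a server entry with boolean flags written to env
    as the strings "true" or "false"."""
    env = dict(entry.get("env", {}))  # copy so the original entry is untouched
    for name, enabled in flags.items():
        env[name] = "true" if enabled else "false"
    return {**entry, "env": env}


if __name__ == "__main__":
    base = {
        "command": "/full/path/to/uvx",
        "args": ["mcp-server-datahub@latest"],
        "env": {"DATAHUB_GMS_URL": "http://localhost:8080"},
    }
    print(with_flags(base, TOOLS_IS_MUTATION_ENABLED=True)["env"])
```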

5. Restart Claude Desktop and verify

Quit and relaunch Claude Desktop. Ask it to search for a dataset to confirm the DataHub tools are connected.

DataHub Examples

Client configuration

Standard self-hosted DataHub configuration for claude_desktop_config.json. Replace the uvx path and credentials with your actual values.

{
  "mcpServers": {
    "datahub": {
      "command": "/usr/local/bin/uvx",
      "args": ["mcp-server-datahub@latest"],
      "env": {
        "DATAHUB_GMS_URL": "http://localhost:8080",
        "DATAHUB_GMS_TOKEN": "your_datahub_token",
        "TOOLS_IS_MUTATION_ENABLED": "false",
        "TOOL_RESPONSE_TOKEN_LIMIT": "80000"
      }
    }
  }
}

Prompts to try

Use these prompts to search your data catalog, trace lineage, and manage metadata through the DataHub MCP tools.

- "Search the DataHub catalog for datasets related to 'customer orders' and show me the schema of the top result"
- "Trace the upstream lineage of the dataset urn:li:dataset:(urn:li:dataPlatform:snowflake,orders.fact_orders,PROD) back to its source tables"
- "Find all datasets in the 'marketing' domain and list their owners and tags"
- "Get the production SQL queries that reference the 'user_events' table and summarize what transformations they apply"

Troubleshooting DataHub

Authentication errors when connecting to DataHub

Verify that DATAHUB_GMS_URL points to your GMS endpoint (not the UI URL) and that DATAHUB_GMS_TOKEN is a valid, unexpired personal access token. For DataHub Cloud, the GMS URL is typically https://<tenant>.acryl.io/gms.
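
A quick offline check can catch the most common URL mistakes before restarting the client. The sketch below only encodes the hints from the paragraph above (an http or https URL, and a /gms suffix on acryl.io hosts). `check_gms_url` is a hypothetical helper, it makes no network request, and it cannot tell whether a token is valid.

```python
from typing import List
from urllib.parse import urlparse


def check_gms_url(url: str) -> List[str]:
    """Return warnings about a DATAHUB_GMS_URL value (empty list if none)."""
    warnings: List[str] = []
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        warnings.append("URL should start with http:// or https:// and include a host")
        return warnings
    path = parsed.path.rstrip("/")
    # Heuristic from the note above: DataHub Cloud URLs typically end in /gms.
    if parsed.hostname.endswith(".acryl.io") and not path.endswith("/gms"):
        warnings.append("DataHub Cloud URL usually ends with /gms")
    return warnings


if __name__ == "__main__":
    for candidate in ("http://localhost:8080", "https://tenant.acryl.io"):
        print(candidate, check_gms_url(candidate))
```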

uvx command not found in Claude Desktop

Claude Desktop may not inherit your shell PATH. Use the full absolute path returned by 'which uvx' as the command value in the config. Common locations are ~/.local/bin/uvx, /usr/local/bin/uvx, or ~/.cargo/bin/uvx.

Tool responses are truncated for large datasets with many schema fields

Increase TOOL_RESPONSE_TOKEN_LIMIT (default 80000) or ENTITY_SCHEMA_TOKEN_BUDGET (default 16000) in the env block to allow larger responses, keeping in mind your AI model's context window limits.
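
When raising these limits, it helps to compare them against the context window of the model in use. The following sketch builds the two env overrides as strings and rejects a response limit larger than a caller-supplied context window. `limit_overrides` is a hypothetical helper, and the context window figure is your own estimate, not something the server reports.

```python
from typing import Dict, Optional


def limit_overrides(
    response_limit: int = 80000,
    schema_budget: int = 16000,
    context_window: Optional[int] = None,
) -> Dict[str, str]:
    """Build env overrides for the two token limits, as JSON-ready strings."""
    if response_limit <= 0 or schema_budget <= 0:
        raise ValueError("limits must be positive integers")
    # Guard against a tool response that could not fit in the model's context.
    if context_window is not None and response_limit > context_window:
        raise ValueError("TOOL_RESPONSE_TOKEN_LIMIT exceeds the model context window")
    return {
        "TOOL_RESPONSE_TOKEN_LIMIT": str(response_limit),
        "ENTITY_SCHEMA_TOKEN_BUDGET": str(schema_budget),
    }


if __name__ == "__main__":
    print(limit_overrides(120000, 32000, context_window=200000))
```

The returned dictionary can be merged into the env block of the server entry.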

Frequently Asked Questions about DataHub

What is DataHub?

The DataHub MCP server is the official Model Context Protocol (MCP) server for DataHub (https://datahub.com). It connects AI assistants to the DataHub data catalog through a standardized interface.

How do I install DataHub?

Install uv, then run the server with uvx using the package name mcp-server-datahub@latest. Add the server configuration, including DATAHUB_GMS_URL and DATAHUB_GMS_TOKEN, to your AI client's JSON config file (e.g., claude_desktop_config.json or .cursor/mcp.json).

Which AI clients work with DataHub?

DataHub works with all major MCP-compatible AI clients including Claude Desktop, Claude Code, Cursor, VS Code (GitHub Copilot), Windsurf, and Cline.

Is DataHub free to use?

Yes, DataHub is open source and available under the Apache-2.0 license. You can use it freely in both personal and commercial projects.

Quick Config Preview

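{ "mcpServers": { "datahub": { "command": "/full/path/to/uvx", "args": ["mcp-server-datahub@latest"], "env": { "DATAHUB_GMS_URL": "http://localhost:8080", "DATAHUB_GMS_TOKEN": "your_datahub_token" } } } }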

Add this to your claude_desktop_config.json or .cursor/mcp.json
