DataHub
The official Model Context Protocol (MCP) server for DataHub (https://datahub.com)
What is DataHub?
The DataHub MCP server is the official Model Context Protocol (MCP) server for DataHub (https://datahub.com). It lets AI assistants such as Claude, Cursor, and VS Code search and work with a DataHub data catalog.
This server falls under the Databases category on MCPgee, the world's largest MCP server directory with 33,000+ servers.
Features
- The official Model Context Protocol (MCP) server for DataHub
Installation
uvx
uvx mcp-server-datahub@latest
Configuration
Add the server to your client's JSON config file, such as claude_desktop_config.json. The setup steps in the next section show the full entry.
How to Set Up and Use DataHub
The official DataHub MCP Server connects AI assistants to the DataHub data catalog, giving them the ability to search datasets by keyword, explore schema fields, trace upstream and downstream data lineage, retrieve real production SQL queries, and manage metadata such as tags, terms, owners, and descriptions. It supports both self-hosted DataHub instances and DataHub Cloud, and exposes opt-in mutation tools for teams that want AI to help keep the catalog accurate. Data engineers and analysts use it to answer questions about their data estate and automate catalog maintenance directly from their AI assistant.
Prerequisites
- DataHub instance (self-hosted at http://localhost:8080 or DataHub Cloud tenant) with API access
- A DataHub personal access token generated from your DataHub Settings → Access Tokens page
- uv and uvx installed (curl -LsSf https://astral.sh/uv/install.sh | sh)
- An MCP-compatible client such as Claude Desktop or Claude Code
Install uv and locate uvx
The DataHub MCP server is distributed as a Python package run via uvx. Install uv first, then note the full path to uvx for the configuration step.
curl -LsSf https://astral.sh/uv/install.sh | sh
which uvx
Generate a DataHub access token
Log into your DataHub instance, navigate to Settings → Access Tokens, and create a new personal access token. Copy the token value — you will need it in the next step.
Configure Claude Desktop for self-hosted DataHub
Add the server to claude_desktop_config.json using the full uvx path and your DataHub URL and token.
{
  "mcpServers": {
    "datahub": {
      "command": "/full/path/to/uvx",
      "args": ["mcp-server-datahub@latest"],
      "env": {
        "DATAHUB_GMS_URL": "http://localhost:8080",
        "DATAHUB_GMS_TOKEN": "your_datahub_token"
      }
    }
  }
}
Enable optional mutation tools (if needed)
By default, metadata modifications are disabled. Set TOOLS_IS_MUTATION_ENABLED=true in the env block to allow the AI to add/remove tags, terms, owners, and update descriptions.
{
  "mcpServers": {
    "datahub": {
      "command": "/full/path/to/uvx",
      "args": ["mcp-server-datahub@latest"],
      "env": {
        "DATAHUB_GMS_URL": "http://localhost:8080",
        "DATAHUB_GMS_TOKEN": "your_datahub_token",
        "TOOLS_IS_MUTATION_ENABLED": "true",
        "SEMANTIC_SEARCH_ENABLED": "true"
      }
    }
  }
}
Restart Claude Desktop and verify
Quit and relaunch Claude Desktop. Ask it to search for a dataset to confirm the DataHub tools are connected.
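The configuration steps above can also be scripted. The following is a minimal Python sketch, assuming the client config file is plain JSON. The helper name add_datahub_server and its parameters are hypothetical and not part of the DataHub MCP server; the keys simply mirror the JSON shown in the earlier steps.

```python
import json
from pathlib import Path


def add_datahub_server(config_path, uvx_path, gms_url, token, enable_mutations=False):
    """Merge a "datahub" entry into an MCP client config file and return the result.

    Hypothetical helper for illustration only.
    """
    path = Path(config_path)
    # Start from the existing config so other configured servers are preserved.
    config = json.loads(path.read_text()) if path.exists() else {}
    env = {"DATAHUB_GMS_URL": gms_url, "DATAHUB_GMS_TOKEN": token}
    if enable_mutations:
        # Values in the env block are strings, not JSON booleans.
        env["TOOLS_IS_MUTATION_ENABLED"] = "true"
    config.setdefault("mcpServers", {})["datahub"] = {
        "command": uvx_path,
        "args": ["mcp-server-datahub@latest"],
        "env": env,
    }
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Merging into the existing file, instead of overwriting it, keeps any other MCP servers already configured for the client. Restart the client after running it so the new entry is picked up.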
DataHub Examples
Client configuration
Standard self-hosted DataHub configuration for claude_desktop_config.json. Replace the uvx path and credentials with your actual values.
{
  "mcpServers": {
    "datahub": {
      "command": "/usr/local/bin/uvx",
      "args": ["mcp-server-datahub@latest"],
      "env": {
        "DATAHUB_GMS_URL": "http://localhost:8080",
        "DATAHUB_GMS_TOKEN": "your_datahub_token",
        "TOOLS_IS_MUTATION_ENABLED": "false",
        "TOOL_RESPONSE_TOKEN_LIMIT": "80000"
      }
    }
  }
}
Prompts to try
Use these prompts to search your data catalog, trace lineage, and manage metadata through the DataHub MCP tools.
- "Search the DataHub catalog for datasets related to 'customer orders' and show me the schema of the top result"
- "Trace the upstream lineage of the dataset urn:li:dataset:(urn:li:dataPlatform:snowflake,orders.fact_orders,PROD) back to its source tables"
- "Find all datasets in the 'marketing' domain and list their owners and tags"
- "Get the production SQL queries that reference the 'user_events' table and summarize what transformations they apply"
Troubleshooting DataHub
Authentication errors when connecting to DataHub
Verify that DATAHUB_GMS_URL points to your GMS endpoint (not the UI URL) and that DATAHUB_GMS_TOKEN is a valid, unexpired personal access token. For DataHub Cloud, the GMS URL is typically https://<tenant>.acryl.io/gms.
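Some of these mistakes can be caught before the client is launched. The sketch below runs offline sanity checks on the env block; the function name check_datahub_settings is hypothetical, and the /gms check is only a heuristic derived from the typical DataHub Cloud URL pattern mentioned above. It cannot tell whether a token is actually valid or unexpired.

```python
from urllib.parse import urlparse


def check_datahub_settings(env):
    """Return a list of human-readable problems found in an env mapping."""
    problems = []
    url = env.get("DATAHUB_GMS_URL", "")
    token = env.get("DATAHUB_GMS_TOKEN", "")
    parsed = urlparse(url)
    host = parsed.hostname or ""
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("DATAHUB_GMS_URL must be an absolute http(s) URL")
    elif host.endswith(".acryl.io") and not parsed.path.rstrip("/").endswith("/gms"):
        # Heuristic only: DataHub Cloud GMS URLs typically end in /gms.
        problems.append("DataHub Cloud URLs typically end in /gms")
    if not token or token == "your_datahub_token":
        problems.append("DATAHUB_GMS_TOKEN is missing or still a placeholder")
    return problems
```

An empty list means only that the values look plausible; an authentication error can still occur if the token has expired or been revoked.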
uvx command not found in Claude Desktop
Claude Desktop may not inherit your shell PATH. Use the absolute path returned by 'which uvx' as the command value in the config. Common locations include ~/.local/bin/uvx, /usr/local/bin/uvx, and ~/.cargo/bin/uvx.
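The lookup can be sketched in Python as follows. The function name find_uvx is hypothetical, and the fallback directories are assumptions about common install locations that may not match your system.

```python
import os
import shutil


def find_uvx(extra_dirs=("~/.local/bin", "/usr/local/bin", "~/.cargo/bin")):
    """Return an absolute path to uvx, or None if it cannot be found."""
    # First try the current PATH, as `which uvx` would.
    found = shutil.which("uvx")
    if found:
        return os.path.abspath(found)
    # Fall back to assumed install directories.
    for directory in extra_dirs:
        candidate = os.path.join(os.path.expanduser(directory), "uvx")
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None
```

Run it from the same shell in which uv was installed and paste the returned path into the command field of the config.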
Tool responses are truncated for large datasets with many schema fields
Increase TOOL_RESPONSE_TOKEN_LIMIT (default 80000) or ENTITY_SCHEMA_TOKEN_BUDGET (default 16000) in the env block to allow larger responses, keeping in mind your AI model's context window limits.
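As a rough illustration, the env entries for these limits could be generated with a guard against exceeding the model's context window. The helper token_limit_env and its context_window parameter are hypothetical; the rule that the response limit should stay below the context window is an assumption based on the advice above, not documented server behavior.

```python
def token_limit_env(context_window, response_limit=80000, schema_budget=16000):
    """Build env entries for the two token limits, as strings for the JSON config."""
    for name, value in (("response_limit", response_limit), ("schema_budget", schema_budget)):
        if not isinstance(value, int) or value <= 0:
            raise ValueError(f"{name} must be a positive integer")
    # Assumed guard: a single tool response should not fill the whole context.
    if response_limit >= context_window:
        raise ValueError("response_limit should stay below the model's context window")
    return {
        "TOOL_RESPONSE_TOKEN_LIMIT": str(response_limit),
        "ENTITY_SCHEMA_TOKEN_BUDGET": str(schema_budget),
    }
```

For example, token_limit_env(200000, response_limit=120000) returns entries that can be merged into the env block.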
Frequently Asked Questions about DataHub
What is DataHub?
DataHub is the official Model Context Protocol (MCP) server for DataHub (https://datahub.com). It connects AI assistants to the DataHub data catalog through a standardized interface.
How do I install DataHub?
Install uv, then run the server with uvx: uvx mcp-server-datahub@latest. Add the server configuration to your AI client's JSON config file (e.g., claude_desktop_config.json or .cursor/mcp.json) so the client launches it for you.
Which AI clients work with DataHub?
DataHub works with all major MCP-compatible AI clients including Claude Desktop, Claude Code, Cursor, VS Code (GitHub Copilot), Windsurf, and Cline.
Is DataHub free to use?
Yes, DataHub is open source and available under the Apache-2.0 license. You can use it freely in both personal and commercial projects.
DataHub Alternatives: Similar Database Servers
Looking for alternatives to DataHub? Here are other popular database servers you can use with Claude, Cursor, and VS Code.
Excelize
★ 20.6k Go language library for reading and writing Microsoft Excel™ (XLAM / XLSM / XLSX / XLTM / XLTX) spreadsheets
MCP Toolbox for Databases
★ 15.3k Open source MCP server specializing in easy, fast, and secure tools for Databases.
DBHub
★ 2.8k A universal database gateway MCP server that enables AI assistants to connect to and query multiple databases (PostgreSQL, MySQL, MariaDB, SQL Server, SQLite) with support for schema exploration, SQL execution, and secure connections via SSH tunnels.
Tabularis
★ 2.1k A lightweight, cross-platform database client for developers. Supports MySQL, PostgreSQL and SQLite. Hackable with plugins. Built for speed, security, and aesthetics.
Postgres AI Guide
★ 1.7k MCP server and Claude plugin for Postgres skills and documentation. Helps AI coding tools generate better PostgreSQL code.
Anyquery
★ 1.7k Query more than 40 apps with one binary using SQL. It can also connect to your PostgreSQL, MySQL, or SQLite compatible database. Local-first and private by design.