dataproduct-mcp
What is dataproduct-mcp?
The dataproduct-mcp is an MCP server designed to manage Data Products and Contracts, facilitating AI-assisted data discovery, querying, and analysis.
Use cases
Use cases include AI assistants exploring data products, analyzing data contract schemas, executing SQL queries, joining data for complex analyses, validating data quality, and building data-aware AI applications.
How to use
To use dataproduct-mcp, set the location of your data files and start the server using the provided commands. You can also run it in development mode for testing and debugging.
Key features
Key features include asset management, contract compliance, smart data querying using natural language, federated queries across multiple data products, flexible identification formats, local and remote support, and a plugin architecture for extensibility.
Where to use
dataproduct-mcp can be used in various fields such as data science, AI development, and any domain requiring data management and analysis.
Data Product MCP Server
An MCP server designed to discover and query Data Products on any data platform in a governed way - enabling AI-assistants to answer any business question.

Overview
The Data Product MCP Server provides an interface for AI clients to discover and query data products via MCP.
It enables AI agents to locate and analyze data products and execute queries against them, answering natural-language business questions across all available business data.
Example
What was the revenue in our webshop yesterday? What about cancellations? What were the reasons for them?
Key Features
- Discovery: Find relevant data products. Data contracts provide the necessary schema and semantic context information.
- Governance Enforcement: Automate access management and checks for terms of use.
- Data Querying: Natural language questions are translated into SQL, executed, and the results augmented.
- Federated Queries (Alpha): Join data across multiple data products from different sources and technologies.
- Local & Remote Support: Work with local data product repositories or remote data product marketplaces, such as Data Mesh Manager.
- Plugin Architecture: Extensible plugin system for adding new asset sources and data sources.
Use Cases
- AI assistants exploring available data products
- Analyzing data contract schemas and requirements
- Executing SQL queries against data products
- Joining data across multiple products for complex analyses
- Validating data quality against contract specifications
- Building data-aware AI applications
Installation
Prerequisites
- Python 3.11 or higher
- uv (recommended) or pip
Quick Install
Using uv (recommended):
uv pip install -e .
Using pip:
pip install -e .
Running the Server
Basic Usage
# Set location of data files
export DATAASSET_SOURCE=/path/to/assets/directory
# Start the server
python -m src.dataproduct_mcp.server
Development Mode
# For local development with MCP CLI
mcp dev -m src.dataproduct_mcp.server
# For debugging (runs test query)
python -m src.dataproduct_mcp.server
# For server mode
python -m src.dataproduct_mcp.server --server
Integration with Claude Code or Claude Desktop
Add this configuration to your Claude installation:
{
  "mcpServers": {
    "dataproduct": {
      "command": "uv",
      "args": [
        "run",
        "--directory",
        "<path_to_folder>/dataproduct-mcp",
        "python",
        "-m",
        "dataproduct_mcp.server"
      ],
      "env": {
        "DATAASSET_SOURCE": "<path_to_folder>/dataproduct-mcp/examples"
      }
    }
  }
}
Tools and Capabilities
Data Product Operations
| Tool | Description |
|---|---|
| dataproducts_list | List all available data products |
| dataproducts_get | Retrieve a specific data product by identifier |
| dataproducts_get_output_schema | Get schema for a data contract linked to a product output port |
| dataproducts_query | Execute SQL queries against one or multiple data product output ports |
Examples
List Available Data Products
dataproducts_list
Get a Specific Data Product
dataproducts_get(identifier="local:product/orders.dataproduct.yaml")
dataproducts_get(identifier="datameshmanager:product/customers")
Get Schema for a Data Contract
dataproducts_get_output_schema(identifier="local:contract/orders.datacontract.yaml")
dataproducts_get_output_schema(identifier="datameshmanager:contract/snowflake_customers_latest_npii_v1")
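The identifiers in these examples follow a `source:type/path` pattern (`local` or `datameshmanager` as the source, then `product` or `contract`, then a path or ID). As a hypothetical illustration of that format (this helper is not part of dataproduct-mcp), parsing such an identifier might look like:

```python
# Hypothetical helper illustrating the identifier format used above;
# not part of the dataproduct-mcp codebase.
def parse_identifier(identifier: str) -> dict:
    """Split 'source:type/path' into its components,
    e.g. 'local:product/orders.dataproduct.yaml'."""
    source, _, rest = identifier.partition(":")
    asset_type, _, path = rest.partition("/")
    if not (source and asset_type and path):
        raise ValueError(f"Malformed identifier: {identifier!r}")
    return {"source": source, "type": asset_type, "path": path}

print(parse_identifier("local:contract/orders.datacontract.yaml"))
# {'source': 'local', 'type': 'contract', 'path': 'orders.datacontract.yaml'}
```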
Query a Single Data Product
dataproducts_query(
    sources=[
        {"product_id": "local:product/orders.dataproduct.yaml"}
    ],
    query="SELECT * FROM orders LIMIT 10;",
    include_metadata=True
)
Execute a Federated Query (Alpha)
dataproducts_query(
    sources=[
        {"product_id": "local:product/orders.dataproduct.yaml", "alias": "orders"},
        {"product_id": "local:product/video_history.dataproduct.yaml", "alias": "videos"}
    ],
    query="SELECT o.customer_id, v.video_id, o.order_date, v.view_date FROM orders o JOIN videos v ON o.customer_id = v.user_id WHERE o.order_date > '2023-01-01'",
    include_metadata=True
)
Note: Querying multiple data products at once is still in alpha status and may have limitations.
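To make the federated example concrete, the snippet below reproduces the same join shape against two in-memory tables standing in for the two output ports. It uses sqlite3 purely as a dependency-free stand-in; the server itself executes such queries with DuckDB, and the table contents here are invented.

```python
import sqlite3

# In-memory stand-ins for the two data product output ports.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (customer_id TEXT, order_date TEXT)")
con.execute("CREATE TABLE videos (user_id TEXT, video_id TEXT, view_date TEXT)")
con.execute("INSERT INTO orders VALUES ('c1', '2023-06-01'), ('c2', '2022-12-01')")
con.execute("INSERT INTO videos VALUES ('c1', 'v9', '2023-06-02')")

# Same query shape as the federated example above.
rows = con.execute(
    "SELECT o.customer_id, v.video_id, o.order_date, v.view_date "
    "FROM orders o JOIN videos v ON o.customer_id = v.user_id "
    "WHERE o.order_date > '2023-01-01'"
).fetchall()
print(rows)  # [('c1', 'v9', '2023-06-01', '2023-06-02')]
```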
Included Examples
The examples directory contains sample files for both data products and their supporting data contracts:
Sample Data Products
- orders.dataproduct.yaml - Customer order data
- shelf_warmers.dataproduct.yaml - Products with no sales in 3+ months
- video_history.dataproduct.yaml - Video viewing history data
Sample Data Contracts
- orders.datacontract.yaml - Defines order data structure and rules
- shelf_warmers.datacontract.yaml - Contract for product inventory analysis
- video_history.datacontract.yaml - Specifications for video consumption data
Architecture
The Data Product MCP Server uses a plugin-based architecture that cleanly separates:
Asset Sources (Metadata)
These plugins handle loading and listing data assets (products and contracts):
- Local files: Loads YAML files from a local directory
- Data Mesh Manager: Fetches assets from a Data Mesh Manager API
Data Sources (Query Execution)
These plugins handle the actual data querying:
- Local files: Queries CSV, JSON, or Parquet files using DuckDB
- S3: Queries data in S3 buckets using DuckDB’s S3 integration
- Databricks: Queries data from Databricks SQL warehouses
The plugin architecture makes it easy to add support for additional asset sources (e.g., Git repositories, other APIs) and data sources (e.g., databases, data warehouses).
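As a rough sketch of what such a plugin seam could look like, the classes below separate asset metadata from its source. All names here are invented for illustration and do not match the server's internal API:

```python
from abc import ABC, abstractmethod

# Invented names for illustration only; the real plugin interfaces
# in dataproduct-mcp may differ.
class AssetSource(ABC):
    """Loads and lists data asset metadata (products and contracts)."""

    @abstractmethod
    def list_assets(self) -> list[str]:
        """Return the identifiers of all assets this source knows about."""

class InMemoryAssetSource(AssetSource):
    """A trivial source backed by a dict, standing in for local files
    or a remote marketplace API."""

    def __init__(self, assets: dict[str, str]):
        self._assets = assets

    def list_assets(self) -> list[str]:
        return sorted(self._assets)

source = InMemoryAssetSource({
    "orders.dataproduct.yaml": "...",
    "video_history.dataproduct.yaml": "...",
})
print(source.list_assets())
# ['orders.dataproduct.yaml', 'video_history.dataproduct.yaml']
```

A new backend (a Git repository, another API) would then only need to implement the same abstract interface.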
Configuration Options
Environment Variables
| Variable | Description | Default |
|---|---|---|
| DATAASSET_SOURCE | Directory containing data assets | Current directory |
| DATAMESH_MANAGER_API_KEY | API key for Data Mesh Manager | None |
| DATAMESH_MANAGER_HOST | Host URL for Data Mesh Manager | https://api.datamesh-manager.com |
AWS S3 Configuration (for S3 data sources)
- AWS_REGION / AWS_DEFAULT_REGION - AWS region (default: us-east-1)
- S3_BUCKETS - Allowed S3 buckets (comma-separated)
- Authentication via profile (AWS_PROFILE) or credentials (AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY)
Databricks Configuration (for Databricks data sources)
- DATABRICKS_WORKSPACE_URL - Databricks workspace URL (required)
- DATABRICKS_TOKEN - Personal access token for Databricks
- DATABRICKS_CLIENT_ID / DATABRICKS_CLIENT_SECRET - OAuth client credentials (alternative to token)
- DATABRICKS_CATALOG - Default catalog to use (optional)
- DATABRICKS_SCHEMA - Default schema to use (optional)
- DATABRICKS_TIMEOUT - Query execution timeout in seconds (default: 120)
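A minimal sketch of how a script might resolve these variables with the documented defaults (illustrative only; the helper name is invented and the server's actual configuration code may differ):

```python
import os

# Illustrative only: resolve Databricks settings using the defaults
# documented in the list above. Missing DATABRICKS_WORKSPACE_URL
# raises KeyError, mirroring its "required" status.
def databricks_config(env=os.environ) -> dict:
    return {
        "workspace_url": env["DATABRICKS_WORKSPACE_URL"],   # required
        "token": env.get("DATABRICKS_TOKEN"),
        "catalog": env.get("DATABRICKS_CATALOG"),           # optional
        "schema": env.get("DATABRICKS_SCHEMA"),             # optional
        "timeout": int(env.get("DATABRICKS_TIMEOUT", "120")),  # default 120s
    }

cfg = databricks_config({"DATABRICKS_WORKSPACE_URL": "https://example.cloud.databricks.com"})
print(cfg["timeout"])  # 120
```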
Development Setup
Python base interpreter should be 3.11.x.
# create venv
python3.11 -m venv venv
source venv/bin/activate
# Install Requirements
pip install --upgrade pip setuptools wheel
pip install -e '.[dev]'
pre-commit install
pre-commit run --all-files
pytest
Use uv (recommended)
# make sure uv is installed
uv python pin 3.11
uv pip install -e '.[dev]'
uv run ruff check
uv run pytest
Contribution
We are happy to receive your contributions. Propose your change in an issue or directly create a pull request with your improvements.
Related Tools
- Data Contract CLI is an open-source command-line tool for working with data contracts.
- Data Contract Manager is a commercial tool to manage data contracts. It contains a web UI, access management, and data governance for a full enterprise data marketplace.
- Data Contract GPT is a custom GPT that can help you write data contracts.
- Data Contract Editor is an editor for Data Contracts, including a live html preview.
- Data Contract Playground allows you to validate and export your data contract to different formats within your browser.
License