Crawl Mcp

License: MIT

Crawl4AI MCP Server enables AI models to recursively crawl websites and extract content in markdown.

What is Crawl Mcp

Crawl-mcp is a web crawler MCP server built using crawl4ai, designed to enable AI models to crawl websites and extract content in markdown format.

Use cases

Use cases for crawl-mcp include gathering data for research, collecting content for machine learning models, and aggregating information from multiple web sources for analysis.

How to use

To use crawl-mcp, install it with uv, pipx, or pip. After installation, start the server with ‘crawl4ai-mcp’ (pipx install) or ‘python -m Crawl_mcp.main’ (pip or uv install), then register it in an MCP-compatible client using the provided mcp.json file.

Key features

Key features include recursive crawling, configurable depth, page limit settings, and customizable timeout configurations for long crawling operations.

Where to use

Crawl-mcp can be used in various fields such as data mining, web scraping, content aggregation, and AI training where extracting structured data from websites is necessary.

Content

Crawl4AI MCP Server

A web crawler MCP server using crawl4ai that allows AI models to crawl websites and extract content in markdown format.

Features

Recursive Crawling: Recursively crawl websites starting from a URL
Configurable Depth: Control how deep the crawler should go
Page Limit: Set maximum number of pages to crawl
Timeout Configuration: Set custom timeout for long crawling operations
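
To make the depth and page limits concrete, here is a minimal sketch of a breadth-first crawl over an in-memory link graph. It is not the server's actual implementation: `crawl_order`, the `SITE` graph, and the rule that the start page counts as depth 0 are assumptions made for illustration, and no network requests are made.

```python
from collections import deque


def crawl_order(start, links, max_depth=2, max_pages=500):
    """Return pages in the order a breadth-first crawl would visit them.

    `links` maps a page to the pages it links to (a stand-in for real
    link extraction). The start page is depth 0 and is always included.
    """
    visited = [start]
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not follow links beyond the depth limit
        for nxt in links.get(url, []):
            if len(visited) >= max_pages:
                return visited  # page limit reached, stop crawling
            if nxt not in seen:
                seen.add(nxt)
                visited.append(nxt)
                queue.append((nxt, depth + 1))
    return visited


# Hypothetical site structure used only for this example.
SITE = {
    "/": ["/docs", "/blog"],
    "/docs": ["/docs/install", "/"],
    "/blog": ["/blog/post-1"],
    "/docs/install": ["/docs/install/uv"],
}

print(crawl_order("/", SITE, max_depth=1))
print(crawl_order("/", SITE, max_depth=2, max_pages=4))
```

Under these assumptions, raising the depth widens the crawl one link hop at a time, while the page limit caps the total regardless of depth.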

Installation

Using uv (Recommended)

The fastest way to install and use the Crawl4AI MCP server is with uv:

# Install directly from GitHub
uv pip install git+https://github.com/Kavin-kumar10/crawl4ai-mcp.git

# Or create a virtual environment first
uv venv -p python3.10 .venv
source .venv/bin/activate  # On Unix/macOS
# or
.venv\Scripts\activate  # On Windows
uv pip install git+https://github.com/Kavin-kumar10/crawl4ai-mcp.git

Using pipx

You can also install it with pipx for an isolated environment:

pipx install git+https://github.com/Kavin-kumar10/crawl4ai-mcp.git

This will install the package in an isolated environment and make the crawl4ai-mcp command available globally.

Using pip

You can also install it with pip:

pip install git+https://github.com/Kavin-kumar10/crawl4ai-mcp.git

From Source

To install from source:

git clone https://github.com/Kavin-kumar10/crawl4ai-mcp.git
cd crawl4ai-mcp

# Using uv (recommended)
uv pip install -e .

# Or using pip
pip install -e .

Usage with MCP

Running the Server

Once installed, you can run the MCP server directly:

# If installed with pipx, the crawl4ai-mcp command is on your PATH
crawl4ai-mcp

# If installed with pip or uv
python -m Crawl_mcp.main

The server uses stdio transport (not URL-based) for communication with MCP clients.

Using the mcp.json Configuration

This repository includes an mcp.json file that you can use to configure the MCP server in MCP-compatible clients:

# Copy it to your home directory or project directory
cp mcp.json ~/.mcp.json

The mcp.json file contains:

Server configuration with stdio transport
Tool definitions with parameters and return types
Descriptions for tools and parameters
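
As a rough illustration of a stdio server entry, the sketch below builds one in Python and prints it as JSON. The `mcpServers`, `command`, and `args` keys follow the layout used by Claude Desktop-style clients and are an assumption here; the mcp.json shipped in this repository may use different keys and also contains tool definitions that are not reproduced. `stdio_server_entry` is a hypothetical helper.

```python
import json


def stdio_server_entry(command, args):
    """Describe a server that the client starts as a local subprocess."""
    return {"command": command, "args": list(args)}


# Assumed client-side layout; check your client's documentation for the exact keys.
config = {
    "mcpServers": {
        "crawl4ai-mcp": stdio_server_entry("python", ["-m", "Crawl_mcp.main"]),
    }
}

print(json.dumps(config, indent=2))
```

The entry contains no URL or port because a stdio server is started by the client rather than connected to over the network.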

Available Tools

The MCP server provides the following tool:

crawl_recursive

Recursively crawl a website starting from a URL.

Parameters:

url (string): The URL to start crawling from
max_depth (integer, optional): Maximum depth for recursive crawling (default: 2)
max_pages (integer, optional): Maximum number of pages to crawl (default: 500)

Example:

{
  "url": "https://example.com",
  "max_depth": 3,
  "max_pages": 100
}
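
The sketch below shows one way a caller might assemble these arguments in Python, applying the documented defaults (max_depth 2, max_pages 500). `build_crawl_args` is a hypothetical helper, and its validation rules (http/https URLs only, non-negative depth, at least one page) are assumptions; the server may accept or reject different values.

```python
from urllib.parse import urlparse


def build_crawl_args(url, max_depth=2, max_pages=500):
    """Build the argument object for crawl_recursive with basic sanity checks."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"expected an absolute http(s) URL, got {url!r}")
    if max_depth < 0:
        raise ValueError("max_depth must be 0 or greater")
    if max_pages < 1:
        raise ValueError("max_pages must be 1 or greater")
    return {"url": url, "max_depth": max_depth, "max_pages": max_pages}


print(build_crawl_args("https://example.com"))
print(build_crawl_args("https://example.com", max_depth=3, max_pages=100))
```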

Example Usage with Claude

Here’s an example of how to use the MCP server with Claude after configuring it in your MCP environment:

I need to use the Crawl4AI MCP server to crawl a website.

<use_mcp_tool>
<server_name>crawl4ai-mcp</server_name>
<tool_name>crawl_recursive</tool_name>
<arguments>
{
  "url": "https://example.com",
  "max_depth": 2,
  "max_pages": 100
}
</arguments>
</use_mcp_tool>

Note: The MCP server uses stdio transport for communication, not HTTP/URL-based transport. The MCP client launches it as a local process and exchanges messages with it over standard input and output, so there is no URL or port to configure.

Development

Requirements

Python 3.10 or higher
MCP CLI 1.7.1 or higher (pip install "mcp[cli]>=1.7.1")
crawl4ai (pip install crawl4ai)

Development Setup

Set up your development environment:

# Clone the repository
git clone https://github.com/Kavin-kumar10/crawl4ai-mcp.git
cd crawl4ai-mcp

# Create a virtual environment with uv
uv venv -p python3.10 .venv
source .venv/bin/activate  # On Unix/macOS
# or
.venv\Scripts\activate  # On Windows

# Install in development mode
uv pip install -e .

# Install development dependencies (if the project defines a "dev" extra)
uv pip install -e ".[dev]"

Testing

To test the MCP server locally:

# Run the server in development mode with the MCP Inspector
mcp dev Crawl_mcp/server.py

# Or run the module directly (stdio mode)
python -m Crawl_mcp.main

When running in stdio mode, the server expects JSON-RPC messages on stdin and writes responses to stdout. This is how MCP clients communicate with the server.
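
For reference, this is roughly what a tool call looks like on the wire. The sketch builds a JSON-RPC 2.0 `tools/call` request and serializes it as a single newline-terminated line, which is how the MCP stdio transport delimits messages. `tools_call_request` and `frame` are hypothetical helpers, and nothing is actually sent to a server here.

```python
import json


def tools_call_request(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request that invokes an MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def frame(message):
    """Serialize one message as a single line terminated by a newline."""
    return json.dumps(message, separators=(",", ":")) + "\n"


request = tools_call_request(
    1,
    "crawl_recursive",
    {"url": "https://example.com", "max_depth": 2, "max_pages": 100},
)
print(frame(request), end="")
```

A real session begins with an `initialize` handshake before any `tools/call` request, which MCP client libraries handle for you.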

Dependency Management

Use uv to manage dependencies:

# Check for dependency conflicts
uv pip check

# Generate a lock file (if you have a requirements.in file)
uv pip compile requirements.in -o requirements.txt

License

MIT License
