Web Runner Mcp Llm
What is Web Runner Mcp Llm
Web-Runner-mcp-llm is a Python project that exposes Playwright’s browser automation capabilities to AI agents and applications through the Model Context Protocol (MCP).
Use cases
Use cases include automating data collection from websites, performing automated testing of web applications, and enabling AI agents to interact with web content dynamically.
How to use
To use web-runner-mcp-llm, set up the environment, start the server, create JSON data for operations, and execute commands either via command line or a GUI client. Detailed steps are provided in the README.
Key features
Key features include a range of browser actions (click, input, text and attribute retrieval, iframe switching), PDF text extraction, and per-step error recording with screenshots.
Where to use
Web-Runner-mcp-llm can be used in fields such as AI development, web scraping, automated testing, and any application requiring web interaction.
Web-Runner-mcp: Advanced Web Browser Operation Protocol for AI
Web-Runner-mcp is a Python project designed to make Playwright’s powerful browser automation capabilities easily accessible to AI agents and other applications through the standardized Model Context Protocol (MCP).

Table of Contents
- Overview
- Why Web-Runner-mcp?
- Key Features
- Usage
- JSON Format (Reference)
- Comparison with Other Tools
- Future Plans
- Contributing
- License
Overview
Information gathering and interaction with the web are essential for today’s AI agents, but existing tools have limitations. While simple content retrieval or fetching search result lists is possible, tasks like interacting with login-required sites, handling pages rendered with complex JavaScript, navigating iframe structures, and processing PDF content remain challenging. Furthermore, reliably controlling low-level APIs like Playwright directly from Large Language Models (LLMs) presents a significant hurdle.
Web-Runner-mcp proposes a new approach to tackle these challenges.
Instead of instructing the LLM to perform individual browser operations, Web-Runner-mcp allows you to define a sequence of desired operations in a JSON format and pass it to an MCP server for execution. The current version executes these operations reliably based on the JSON file instructions, without direct LLM involvement in the browser control loop itself.
This might be a “small revolution” in how AI interacts with the web, opening doors to the deeper, more complex parts of the web that were previously inaccessible to AI.
Why Web-Runner-mcp?
- Advanced Web Operations:
  - Login: Access and interact with websites requiring authentication.
  - PDF: Download linked PDFs and extract their text content.
  - Iframe: Explore and interact with elements within nested iframes (dynamic discovery).
  - Multiple Tabs/Pages: Follow new pages opened by clicks.
  - Dynamic Content: Wait for and interact with elements generated by JavaScript.
- Versatile Data Extraction:
  - Flexible text/HTML retrieval using innerText, textContent, innerHTML.
  - Get specific attribute values using getAttribute.
  - Efficient data collection from multiple elements using getAllAttributes, getAllTextContents (with dynamic iframe discovery).
- Declarative Operation Definition:
  - Describe the desired steps in JSON.
  - Ensures reproducibility and simplifies debugging.
- MCP Compliance:
  - Standardized protocol enables integration with various MCP clients (Dify custom tools, Python AI agent frameworks, custom clients, etc.).
  - Separates client and server concerns.
- Reliable Execution:
  - Stable browser operations powered by Playwright.
  - Appropriate waiting mechanisms and error handling.
Key Features
- MCP Server (web_runner_mcp_server.py): Implemented in Python (based on FastMCP), exposes Web-Runner functionality as the execute_web_runner tool.
- Web-Runner Core (playwright_handler.py, utils.py, config.py): Uses Playwright (async) to execute browser operations based on input JSON. Handles core logic, settings, utility functions, dynamic iframe discovery, and PDF processing.
- Web-Runner Standalone Execution (main.py): An entry point for running Web-Runner directly from the command line without the MCP server (for debugging and unit testing).
- MCP Client Core (web_runner_mcp_client_core.py): Provides the core function (execute_web_runner_via_mcp) for invoking the MCP server programmatically (e.g., from AI agents).
- GUI Client (web_runner_mcp_client_GUI.py): A convenient graphical interface for selecting JSON files, running tasks manually, and launching the JSON generator.
Supported Actions
- click: Clicks an element.
- input: Enters text into an element.
- hover: Hovers over an element.
- get_inner_text, get_text_content, get_inner_html: Gets text/HTML (single element).
- get_attribute: Gets an attribute value (single element).
- get_all_attributes, get_all_text_contents: Gets attribute values/text content as a list (multiple elements, searches within iframes).
- wait_visible: Waits for an element to become visible.
- select_option: Selects an option from a dropdown list.
- screenshot: Saves a screenshot of the page or an element (server-side).
- scroll_page_to_bottom, scroll_to_element: Performs scroll operations.
- wait_page_load: Waits for the page to finish loading.
- sleep: Pauses execution for a specified duration.
- switch_to_iframe, switch_to_parent_frame: Moves focus between iframes (explicitly specified).
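Because the server only understands the actions listed above, it can help to check a task before sending it. The following is a minimal sketch, not part of the project: find_unsupported_actions is a hypothetical helper, and the "target_url", "actions", "action", and "selector" keys are assumed to match the Python example in the usage section.

```python
# Action names copied from the "Supported Actions" list.
SUPPORTED_ACTIONS = {
    "click", "input", "hover",
    "get_inner_text", "get_text_content", "get_inner_html",
    "get_attribute", "get_all_attributes", "get_all_text_contents",
    "wait_visible", "select_option", "screenshot",
    "scroll_page_to_bottom", "scroll_to_element",
    "wait_page_load", "sleep",
    "switch_to_iframe", "switch_to_parent_frame",
}


def find_unsupported_actions(input_data: dict) -> list:
    """Return the action names in input_data that are not in the supported list."""
    return [
        step.get("action", "<missing>")
        for step in input_data.get("actions", [])
        if step.get("action") not in SUPPORTED_ACTIONS
    ]


# Hypothetical task: the second step uses a name that is not a listed action.
example = {
    "target_url": "https://example.com",
    "actions": [
        {"action": "click", "selector": "#login"},
        {"action": "double_click", "selector": "#login"},
    ],
}

print(find_unsupported_actions(example))
```

A check like this catches typos in action names locally, before a browser is launched.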
PDF Text Extraction
Automatically downloads PDFs linked via get_attribute(href=...) or get_all_attributes(href=...) and includes the extracted text in the results.
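As a rough sketch of a task that might trigger this behavior: the helper below builds a single get_all_attributes step that collects href values. It assumes that the get_all_attributes(href=...) notation corresponds to "attribute_name": "href" in the input JSON; build_pdf_link_task, the URL, and the selector are hypothetical.

```python
def build_pdf_link_task(target_url: str, link_selector: str) -> dict:
    """Build a task that collects href values from every element matching link_selector."""
    return {
        "target_url": target_url,
        "actions": [
            {
                "action": "get_all_attributes",
                "selector": link_selector,
                # Assumption: "href" here is what triggers PDF download and extraction.
                "attribute_name": "href",
            }
        ],
    }


# Hypothetical page and selector; replace with the real site and a copied CSS selector.
task = build_pdf_link_task("https://example.com/reports", "a.report-link")
print(task)
```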
Error Handling
Records error information for each step, including the screenshot path (on the server’s filesystem) if an error occurs.
Usage
1. Setup
(1) Clone the repository:
git clone https://github.com/sinzy0925/web-runner-mcp.git
cd web-runner-mcp
(2) Prepare Python environment (Python 3.12+ recommended):
# Create a virtual environment (e.g., venv312)
python -m venv venv312
# Activate the virtual environment
# Windows PowerShell
.\venv312\Scripts\Activate
# Linux/macOS
source venv312/bin/activate
(3) Install dependencies:
Install using the requirements.txt file.
pip install -r requirements.txt
(4) Install Playwright browsers:
playwright install
2. Starting the Server (SSE Mode Example)
Note: This mode has not been fully verified and might require adjustments.
To allow access over the network (e.g., for Dify integration), start the server in SSE mode.
# Run web_runner_mcp_server.py directly
python web_runner_mcp_server.py --transport sse --host 0.0.0.0 --port 8000
- Use --host 0.0.0.0 to allow access from other machines. Use 127.0.0.1 (default) for local access only.
- --port 8000 specifies the port the server listens on.
- Server logs are output to web_runner_mcp_server.log (default setting).
3. Creating JSON Data for Web-Runner
You can use the included json_generator.html to interactively create the JSON file in your browser.
Step 1: Prepare the JSON Generator
- Open the json_generator.html file located in the project folder with your web browser (double-click).
Step 2: Get CSS Selectors for Target Elements
- Open the target website you want to automate in a separate browser tab or window.
- Open the developer tools on that page (usually F12 key or right-click > “Inspect”/“Inspect Element”).
- Click the element selection icon (↖) in the developer tools.
- Click the element you want to interact with (button, input field, etc.) on the webpage.
- In the developer tools, right-click the highlighted HTML element and select [Copy] > [Copy selector].
Step 3: Create Operation Steps in json_generator.html
- Go back to the json_generator.html tab.
- Enter the website’s URL in “1. Target URL:”.
- In “2. Operation Steps”, fill in the following:
  - Target Element CSS Selector: Paste the selector you copied.
  - Operation: Choose the desired action.
  - Additional Parameters: Enter values if needed (e.g., value, attribute_name).
- Click “Add Step” and repeat step 3 for all required actions.
- Click “Generate JSON Data” to see the generated JSON.
- Click “Download input.json” to save the JSON file.
Step 4: Place the JSON File
- Move the downloaded JSON file into the json/ folder within the project directory. You can rename the file as needed (e.g., my_task.json).
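If a task is assembled in Python rather than in the generator, it can be written to the same folder. This is a minimal sketch under that assumption; save_task is a hypothetical helper, and the demo writes to a temporary directory so it leaves no files behind (in the project you would pass folder="json").

```python
import json
import tempfile
from pathlib import Path


def save_task(input_data: dict, filename: str, folder: str = "json") -> Path:
    """Write input_data as UTF-8 JSON into folder and return the file path."""
    target_dir = Path(folder)
    target_dir.mkdir(parents=True, exist_ok=True)
    path = target_dir / filename
    path.write_text(
        json.dumps(input_data, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return path


task = {
    "target_url": "https://example.com",
    "actions": [{"action": "get_text_content", "selector": "h1"}],
}

# Demo only: write into a temporary directory and read the file back.
with tempfile.TemporaryDirectory() as tmp:
    saved = save_task(task, "my_task.json", folder=tmp)
    reloaded = json.loads(saved.read_text(encoding="utf-8"))
    print(saved.name, reloaded == task)
```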
4. Command-Line Execution (for Testing)
You can test the Web-Runner directly from the command line using the core client function (web_runner_mcp_client_core.py) without the GUI. This is useful for verifying programmatic calls, like those from an AI agent.
- Ensure your desired JSON file is in the json/ folder (e.g., tdnet.json).
- Run the following command in your activated terminal:
python web_runner_mcp_client_core.py --jsonfile json/tdnet.json --no-headless --slowmo 500
- --jsonfile: Specifies the path to the JSON file to execute (default: json/tdnet.json).
- --no-headless: Use this flag to display the browser during execution (default is visible). Use --headless to run in the background.
- --slowmo: (Optional) Adds a delay (in milliseconds) between operations (e.g., --slowmo 500).
- --output: (Optional) Specifies the path for the output file (default: output_web_runner.txt).
The execution results (successful data retrieval or error information) will be printed to the console in JSON format and also written to the specified output file.
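When the same command needs to be launched from a script, the argument list can be assembled programmatically. A minimal sketch, using only the flags documented above; build_client_command is a hypothetical helper and not part of the project.

```python
import sys


def build_client_command(jsonfile, headless=False, slowmo=None, output=None):
    """Assemble the argument list for web_runner_mcp_client_core.py."""
    cmd = [sys.executable, "web_runner_mcp_client_core.py", "--jsonfile", jsonfile]
    cmd.append("--headless" if headless else "--no-headless")
    if slowmo is not None:
        cmd += ["--slowmo", str(slowmo)]
    if output is not None:
        cmd += ["--output", output]
    return cmd


cmd = build_client_command("json/tdnet.json", headless=False, slowmo=500)
print(" ".join(cmd[1:]))

# To actually run it from the project directory (with the virtual environment active):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems with paths that contain spaces.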
5. Running from the GUI Client
For manual testing and debugging, the GUI client (web_runner_mcp_client_GUI.py) is convenient.
- Run the following command in your activated terminal:
python web_runner_mcp_client_GUI.py
- In the application window, select the desired JSON file from the dropdown list.
- Click the “実行 ▶” (Run) button.
- The execution results will be displayed in the text area below.
- You can also click the “JSONジェネレーター” (JSON Generator) button to open json_generator.html.
6. Usage from AI Applications
To use Web-Runner-mcp from other Python scripts or AI agent frameworks, import and use the execute_web_runner_via_mcp function from web_runner_mcp_client_core.py.
import asyncio
import json
import sys

# Ensure web_runner_mcp_client_core.py is in the import path
try:
    from web_runner_mcp_client_core import execute_web_runner_via_mcp
except ImportError:
    print("Error: web_runner_mcp_client_core.py not found.")
    # Error handling or path configuration needed
    sys.exit(1)  # Example


async def run_task():
    input_data = {
        "target_url": "https://example.com",
        "actions": [
            {"action": "get_text_content", "selector": "h1"},
            {"action": "get_attribute", "selector": "img", "attribute_name": "src"}
        ],
        # Optionally specify timeouts etc.
        # "default_timeout_ms": 15000
    }

    # Execute in headless mode with 50ms slow motion
    success, result_or_error = await execute_web_runner_via_mcp(
        input_data, headless=True, slow_mo=50
    )

    if success and isinstance(result_or_error, str):
        print("Task successful! Result (JSON):")
        try:
            result_dict = json.loads(result_or_error)
            print(json.dumps(result_dict, indent=2, ensure_ascii=False))
            # --- Process the results, potentially pass to an LLM ---
            # llm_prompt = f"Analyze the following website operation results:\n```json\n{result_or_error}\n```"
            # llm_response = await call_llm(llm_prompt)
        except json.JSONDecodeError:
            print("Error: Response from server is not valid JSON:")
            print(result_or_error)
    else:
        print("Task failed:")
        print(result_or_error)  # Display error information (dictionary)
        # --- Process the error information, potentially pass to an LLM ---
        # error_prompt = f"Website operation failed. Error details:\n{result_or_error}\nInfer the cause."
        # llm_response = await call_llm(error_prompt)


if __name__ == "__main__":
    asyncio.run(run_task())
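The success/failure branching above can be factored into a small function so that calling code always receives one dictionary. This is a sketch only: parse_outcome is a hypothetical helper, and the sample inputs are stand-ins rather than real server output. It assumes, as in the example above, that a successful call returns a JSON string and a failed call returns error information.

```python
import json
from typing import Any


def parse_outcome(success: bool, result_or_error: Any) -> dict:
    """Normalize the (success, result_or_error) pair into one dictionary."""
    if success and isinstance(result_or_error, str):
        try:
            return {"ok": True, "data": json.loads(result_or_error)}
        except json.JSONDecodeError:
            # The call succeeded but the payload could not be parsed.
            return {
                "ok": False,
                "error": "invalid JSON from server",
                "raw": result_or_error,
            }
    return {"ok": False, "error": result_or_error}


# Stand-in values for illustration; real payloads come from execute_web_runner_via_mcp.
print(parse_outcome(True, '{"a": 1}'))
print(parse_outcome(False, {"message": "stand-in error"}))
```

A single return shape makes it simpler to hand either outcome to an LLM prompt or to a logging step.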
JSON Format (Reference)
Refer to the JSON files provided in the json/ folder for examples.
Here is the basic structure of the input JSON:
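A minimal sketch of that structure, built as a Python dictionary and printed as JSON. It assumes the same keys used in the Python example under "Usage from AI Applications" ("target_url", "actions", "action", "selector", "attribute_name", and the optional "default_timeout_ms"); the files in the json/ folder remain the authoritative reference for any other fields.

```python
import json

input_data = {
    # Page the browser opens before running the actions.
    "target_url": "https://example.com",
    # Steps executed in order; each names an action and a CSS selector.
    "actions": [
        {"action": "get_text_content", "selector": "h1"},
        {"action": "get_attribute", "selector": "img", "attribute_name": "src"},
    ],
    # Optional setting, as noted in the Python usage example.
    "default_timeout_ms": 15000,
}

print(json.dumps(input_data, indent=2))
```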
Comparison with Other Tools
- General Web Scraping Libraries (BeautifulSoup, Scrapy): Excellent for parsing static HTML, but struggle with or cannot handle JavaScript execution, logins, complex user interactions, iframes, and PDFs. Web-Runner-mcp, being Playwright-based, handles these advanced operations.
- Playwright-MCP: Exposes Playwright’s low-level API directly as MCP tools. Highly flexible, but requires complex prompt engineering and state management for reliable control from LLMs. Web-Runner-mcp offers a more declarative and reliable interface by defining operation sequences in JSON.
- Simple Web Fetching Tools (e.g., URL content fetchers): Easy for getting content from a single URL, but incapable of multi-step operations or interactions. Web-Runner-mcp executes multi-step workflows.
Future Plans
- LLM-Powered JSON Generation: Integrate functionality to automatically generate Web-Runner JSON from natural language instructions.
- Expanded Action Support: Add support for more Playwright features (e.g., file uploads, cookie manipulation).
- Official Dify Custom Tool Support: Stabilize the HTTP/SSE interface aiming for potential registration in the Dify marketplace.
- Enhanced Error Handling and Recovery: Implement more detailed error analysis and potentially automatic retry/recovery mechanisms.
Contributing
Bug reports, feature suggestions, and pull requests are welcome! Please see CONTRIBUTING.md for details (to be created if not present).
License
This project is licensed under the MIT License - see the LICENSE file for details.