Inbrowsermcp

@LofiSuon a year ago

9 MIT

Free · Community

AI Systems

InBrowserMcp is a browser automation toolkit with a server and a browser extension for AI-controlled actions.

What is InBrowserMcp

InBrowserMcp is a browser automation toolkit that includes a server and a browser extension for automating tasks within web browsers.

Use cases

Use cases include automated testing of web applications, scraping data from websites, and automating user interactions for better efficiency.

How to use

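To use InBrowserMcp, install the server and frontend dependencies with npm or pnpm and load the extension into Chrome. Then start the backend, which according to the project README listens on http://localhost:8080 by default (with its WebSocket server on ws://localhost:8081), and the frontend development server, typically accessible at http://localhost:5173.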

Key features

Key features include browser automation capabilities such as clicking, inputting text, and taking screenshots, AI control for browser behavior, and an extensible tool system for adding new functionalities.

Where to use

InBrowserMcp can be used in various fields such as web testing, automated data entry, and any scenario requiring repetitive browser interactions.

Clients Supporting MCP

The following are the main clients that support the Model Context Protocol. Click the link to visit the official website for more information.

Claude Desktop: Official desktop application from Anthropic, natively supports MCP protocol. claude.ai

Cherry Studio: Cross-platform desktop client supporting multiple LLM providers, built-in MCP server support. cherry-ai.com

LobeChat: Modern open-source ChatGPT/LLMs UI, supports MCP protocol integration. lobehub.com

DeepChat: Cross-platform desktop AI assistant, compatible with MCP protocol, focusing on privacy and efficiency. deepchat.thinkinai.xyz

5ire: Cross-platform open-source desktop intelligent assistant MCP client, supports local knowledge base and MCP server. 5ire.app

Content

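InBrowserMcp (in development)

InBrowserMcp is an experimental project that brings Model Context Protocol (MCP) capabilities to in-browser operations through a Chrome extension.

It lets AI models or external applications send instructions to the browser (navigate, click, type, read content, and so on) through the standard MCP interface. The Chrome extension then carries out those operations in the browser.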
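
As a rough illustration of what such an instruction can look like on the wire: MCP messages are JSON-RPC 2.0, and tool invocations use the `tools/call` method. The sketch below builds one request body. The `navigate` tool name comes from this README's workflow description; the `url` argument name and the `buildToolCall` helper are assumptions made for illustration, not the project's confirmed schema.

```typescript
// Minimal sketch of an MCP tool-call request (JSON-RPC 2.0).
// `buildToolCall` and the `url` argument name are hypothetical.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function buildToolCall(
  id: number,
  toolName: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// Example: ask the browser to navigate (argument name assumed).
const navigateRequest = buildToolCall(1, "navigate", { url: "https://google.com" });
```

Serializing `navigateRequest` with `JSON.stringify` gives a body that a client could POST to the server's MCP endpoint, assuming the server registers a tool with that name and argument.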

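Project structure

The project has three main parts:

mcp-server: backend service (Node.js/Express)
- Implements the MCP server and handles JSON-RPC requests from clients such as AI models.
- Communicates with the Chrome extension (extension) over WebSocket, forwarding MCP instructions to it and receiving the execution results.
- Exposes the /mcp endpoint for MCP communication (POST/GET/DELETE).
- Exposes the /api/ai-command endpoint, which receives natural-language instructions from the frontend UI and converts them (in a simulated way) into MCP instructions.
- Exposes the /api/cancel-command endpoint for cancelling an in-progress browser operation.
extension: Chrome browser extension
- Acts as the browser-side agent and connects to mcp-server over WebSocket.
- Receives instructions from mcp-server.
- Executes them in the browser using the Chrome Extension API (e.g., chrome.tabs, chrome.windows, chrome.debugger).
- Returns results or errors to mcp-server over WebSocket.
frontend: frontend user interface (React/Vite)
- Provides a simple web interface where users can enter natural-language instructions.
- Sends instructions to the /api/ai-command endpoint of mcp-server via HTTP POST.
- Receives status updates and operation results from the /mcp endpoint via Server-Sent Events (SSE) and updates the UI.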
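
Forwarding instructions over a WebSocket and receiving results later implies some way of matching each `action_response` to the instruction that caused it. A minimal sketch of one common approach, a pending-request map keyed by id, is shown here. Only `sendBrowserAction` and `action_response` are names taken from this README; the message fields and the `ActionBroker` class are assumptions, and the real project may correlate messages differently.

```typescript
// Sketch of request/response correlation between mcp-server and the extension.
// Assumed message fields: id, action, params, ok, result, error.
type OutgoingAction = { id: number; action: string; params: Record<string, unknown> };
type IncomingFrame = { type: string; id: number; ok: boolean; result?: unknown; error?: string };
type PendingEntry = { resolve: (value: unknown) => void; reject: (reason: Error) => void };

class ActionBroker {
  private nextId = 1;
  private readonly pending = new Map<number, PendingEntry>();
  // `transmit` stands in for the real WebSocket's send method.
  private readonly transmit: (frame: string) => void;

  constructor(transmit: (frame: string) => void) {
    this.transmit = transmit;
  }

  // Sends an action and returns a promise settled by the matching response.
  sendBrowserAction(action: string, params: Record<string, unknown>): Promise<unknown> {
    const id = this.nextId++;
    const message: OutgoingAction = { id, action, params };
    return new Promise((resolve, reject) => {
      this.pending.set(id, { resolve, reject });
      this.transmit(JSON.stringify(message));
    });
  }

  // Intended to be called from the WebSocket "message" handler.
  handleFrame(frame: string): void {
    const response = JSON.parse(frame) as IncomingFrame;
    if (response.type !== "action_response") return;
    const entry = this.pending.get(response.id);
    if (!entry) return; // unknown or already settled id
    this.pending.delete(response.id);
    if (response.ok) entry.resolve(response.result);
    else entry.reject(new Error(response.error ?? "browser action failed"));
  }

  pendingCount(): number {
    return this.pending.size;
  }
}
```

With the `ws` package, `transmit` could wrap `socket.send`, and `handleFrame` could be called with each incoming message converted to a string. A cancel endpoint could reject and delete the pending entry in the same map.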

Tech stack

Backend (mcp-server): Node.js, Express, TypeScript, @modelcontextprotocol/sdk, ws (WebSocket)
Extension (extension): JavaScript, Chrome Extension API, WebSocket
Frontend (frontend): React, Vite, TypeScript, Tailwind CSS, Server-Sent Events (EventSource)

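Workflow

The user enters an instruction in the frontend UI (for example, "打开 google.com", meaning "open google.com").
frontend POSTs the instruction to /api/ai-command on mcp-server.
mcp-server (simulating the AI) converts the natural-language instruction into an MCP tool call request (for example, the navigate tool).
mcp-server sends the instruction and its parameters to extension over WebSocket (sendBrowserAction).
extension (in background.js) receives the instruction and calls the corresponding Chrome API to perform the operation (for example, chrome.tabs.update(...)).
extension sends the result (success or failure information) back to mcp-server over WebSocket (action_response).
mcp-server receives the action_response and parses the result.
mcp-server sends a message or error event over the SSE connection established with frontend (/mcp GET).
frontend receives the SSE event and updates the UI status (for example, from "running" to "success" or "failed").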
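
The last two steps rely on the Server-Sent Events wire format, in which each event is a block of `field: value` lines ended by a blank line. The sketch below serializes one such frame. The `message` and `error` event names come from this README; the `formatSseFrame` helper and the payload shape are assumptions for illustration.

```typescript
// Sketch: serialize one Server-Sent Events frame.
// `formatSseFrame` is a hypothetical helper; the payload shape is assumed.
type SseEventName = "message" | "error";

function formatSseFrame(eventName: SseEventName, payload: Record<string, unknown>): string {
  // JSON.stringify without an indent argument emits no raw newlines,
  // so the payload fits on a single `data:` line.
  return `event: ${eventName}\ndata: ${JSON.stringify(payload)}\n\n`;
}
```

In an Express handler that has already sent a `Content-Type: text/event-stream` header, the returned string could be passed to `res.write`.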

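Installation and running

Prerequisites:

Node.js (v18 or later recommended)
npm or pnpm
Google Chrome

Steps:

Clone the repository:

git clone <repository-url>
cd InBrowserMcp

Install the backend dependencies:

cd mcp-server
npm install
# or, with pnpm:
# pnpm install
cd ..

Install the frontend dependencies:

cd frontend
pnpm install # pnpm is recommended; with npm, run npm install instead
cd ..

Load the Chrome extension:
- Open Chrome and enter chrome://extensions/ in the address bar.
- Enable "Developer mode" in the top-right corner.
- Click "Load unpacked".
- Select the InBrowserMcp/extension folder in the project.
- Make sure the extension is enabled.
Start the backend service: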
```
cd mcp-server
npm run build
npm start
```
The backend service runs on http://localhost:8080 by default, and the WebSocket server listens on ws://localhost:8081.
Start the frontend development server:
```
cd frontend
pnpm run dev
```
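The frontend application runs on http://localhost:5173 by default.
Usage:
- Open http://localhost:5173 in the browser.
- Type an instruction into the input box, for example "打开 google.com" (open google.com) or "导航到 bilibili.com" (navigate to bilibili.com).
- Watch the status changes and the browser's behavior.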
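
The status changes observed in the UI can be thought of as a small state machine driven by the server's events. The following is a sketch only: the `running`, `success`, and `failed` states mirror the ones this README describes, while the `idle` state and the `nextStatus` helper are assumptions rather than the frontend's actual code.

```typescript
// Sketch of the UI status transition driven by SSE events.
// `nextStatus` and the "idle" state are hypothetical.
type CommandStatus = "idle" | "running" | "success" | "failed";
type ServerEvent = "message" | "error";

function nextStatus(current: CommandStatus, received: ServerEvent): CommandStatus {
  // Only a running command can finish; stray events leave the state untouched.
  if (current !== "running") return current;
  return received === "message" ? "success" : "failed";
}
```

Keeping the transition in a pure function like this makes it usable from a React state setter and easy to test without a browser.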

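Notes

The Chrome extension needs the appropriate permissions to perform certain operations (for example, access to tabs, windows, and the debugger). These permissions are configured in extension/manifest.json.
The WebSocket connection between the backend and the extension is the core communication channel.
The frontend receives real-time updates from the backend over SSE.
The current /api/ai-command is only a simple simulation. A real deployment needs to replace it with an actual AI model call or more sophisticated instruction-parsing logic.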
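
To make the idea of a simple simulation concrete, here is a sketch of what a rule-based stand-in for that endpoint's logic might look like. It is not the project's code: `parseCommand`, the accepted verbs, and the `url` argument name are all assumptions. It accepts the Chinese example phrasings from this README plus English equivalents.

```typescript
// Sketch of a rule-based stand-in for the simulated command parsing.
// `parseCommand`, the verb list, and the `url` argument name are hypothetical.
interface ToolCall {
  name: string;
  arguments: Record<string, string>;
}

function parseCommand(input: string): ToolCall | null {
  // Match "<verb> <target>" where the verb is one of a few known phrasings.
  const match = /^(?:open|navigate to|打开|导航到)\s*(\S+)$/i.exec(input.trim());
  const target = match?.[1];
  if (!target) return null; // unrecognized instruction
  // Default to https:// when the user omits the scheme.
  const url = /^https?:\/\//i.test(target) ? target : `https://${target}`;
  return { name: "navigate", arguments: { url } };
}
```

A pattern table like this breaks down quickly for free-form input, which is why the note above points toward a real model call or a richer parser.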

Contributing

Issues, bug reports, and pull requests are welcome.

License

This project is released under the MIT License.

Dev Tools Supporting MCP

The following are the main code editors that support the Model Context Protocol. Click the link to visit the official website for more information.

Zed: High-performance collaborative code editor, supports MCP protocol, providing a smooth programming experience. zed.dev

Cursor: AI code editor built on VS Code, supports MCP protocol for context-aware programming. cursor.com

Windsurf: AI code editor from Codeium, integrates MCP protocol to provide intelligent code assistance. windsurf.com

Continue: Open-source AI programming assistant plugin, supports VS Code and JetBrains, compatible with MCP protocol. continue.dev

Trae: AI-driven code editor, supports MCP protocol, focusing on enhancing developer programming experience. trae.ai
