Mcp Robot

5 MIT

FreeCommunity

AI Systems

# MCP Server Operating Robotic Arm

What is Mcp Robot

mcp_robot is an intelligent robotic arm control system based on the MCP server, designed to enable external agents, such as large language models, to control a six-joint robotic arm through standardized interfaces.

Use cases

Use cases include educational demonstrations of robotic control, rapid prototyping of robotic tasks, and research into human-robot interaction using AI-driven commands.

How to use

Users can interact with the mcp_robot system by sending natural language commands via a Python client or through a web interface. The system translates these commands into specific actions for the robotic arm, which can be simulated or executed on real hardware.

Key features

Key features include a high-fidelity 3D simulator, multi-mode control (keyboard, real robotic arm, MCP service), and integration with Agentic AI for natural language processing and command execution.

Where to use

mcp_robot can be used in various fields such as robotics research, education, automation, and any domain requiring human-robot interaction through natural language.

Clients Supporting MCP

The following are the main client software that supports the Model Context Protocol. Click the link to visit the official website for more information.

Claude Desktop: Official desktop application from Anthropic, natively supports MCP protocol. claude.ai

Cherry Studio: Cross-platform desktop client supporting multiple LLM providers, built-in MCP server support. cherry-ai.com

LobeChat: Modern open-source ChatGPT/LLMs UI, supports MCP protocol integration. lobehub.com

DeepChat: Cross-platform desktop AI assistant, compatible with MCP protocol, focusing on privacy and efficiency. deepchat.thinkinai.xyz

5ire: Cross-platform open-source desktop intelligent assistant MCP client, supports local knowledge base and MCP server. 5ire.app

View More MCP Clients

Overview

What is Mcp Robot

Use cases

Use cases include educational demonstrations of robotic control, rapid prototyping of robotic tasks, and research into human-robot interaction using AI-driven commands.

How to use

Key features

Key features include a high-fidelity 3D simulator, multi-mode control (keyboard, real robotic arm, MCP service), and integration with Agentic AI for natural language processing and command execution.

Where to use

mcp_robot can be used in various fields such as robotics research, education, automation, and any domain requiring human-robot interaction through natural language.

Clients Supporting MCP

The following are the main client software that supports the Model Context Protocol. Click the link to visit the official website for more information.

Claude Desktop: Official desktop application from Anthropic, natively supports MCP protocol. claude.ai

Cherry Studio: Cross-platform desktop client supporting multiple LLM providers, built-in MCP server support. cherry-ai.com

LobeChat: Modern open-source ChatGPT/LLMs UI, supports MCP protocol integration. lobehub.com

DeepChat: Cross-platform desktop AI assistant, compatible with MCP protocol, focusing on privacy and efficiency. deepchat.thinkinai.xyz

5ire: Cross-platform open-source desktop intelligent assistant MCP client, supports local knowledge base and MCP server. 5ire.app

View More MCP Clients

Content

基于MCP的智能机械臂控制系统

一、项目概述

项目名称： 基于 Agentic MCP的智能机械臂控制系统
项目目标：
1. 构建一个六关节机械臂包含虚拟仿真和事物。
2. 搭建一个 MCP server，允许外部智能体（各种大语言模型 Agent）通过标准化接口控制机械臂。
3. 大语言模型能够理解用户自然语言指令，自主编排，并将其转化为对机械臂的具体操作序列。
项目背景：
随着机器人技术和人工智能的飞速发展，人机交互的自然性和智能化程度成为关键瓶颈。传统的机器人控制方式（如编程、示教器）专业性强、效率较低。大语言模型在理解复杂指令和规划任务方面展现出巨大潜力。本项目通过6关节机械臂的案例，探索mcp在嵌入式硬件上的使用。
解决的实际问题：
1. 降低机器人操作门槛： 使非专业用户也能通过自然语言与机械臂交互，完成复杂任务。
2. 提高机器人任务编程效率： 利用LLM的规划能力，快速生成和调整机器人作业流程。
3. 提供安全、低成本的研发与调试环境： 通过浏览器可用的模拟器，在低成本不接触真实硬件的情况下验证控制算法和AI逻辑。
4. 探索LLM在具身智能领域的应用： 为LLM赋予“手”，使其能与物理世界进行更直接的交互。

二、作品描述与功能亮点

我们的作品是一个集成了三维模拟、真实硬件控制、MCP服务和AI智能体交互的综合性机械臂控制平台。

核心功能：
1. 高保真三维模拟器：
  - 基于 Three.js 和 URDFLoader，能够加载和渲染标准的URDF机械臂模型。
  - 提供逼真的光照、阴影和可定制的地面纹理（MuJoCo风格）。
  - 支持用户通过滑块控制器自由控制并观察机械臂。
2. 多模式控制：
  - 键盘控制： 允许用户通过键盘按键实时控制模拟机械臂的各个关节，并显示按键状态。
  - 真实机械臂控制： 集成真实舵机，通过 Web Serial API 连接并控制真实六轴机械臂，实现与模拟器的同步运动。
  - MCP服务控制： 通过WebSocket实现的MCP桥接服务，允许外部程序（自己做的mcp Python客户端， VSCode AI Toolkit）发送指令控制机械臂关节角度。
3. Agentic AI 交互 (通过Python客户端实现)：
  - 大语言模型（如DeepSeek，chatgpt）使用“mcp tools”，将其理解的用户自然语言指令（“让机械臂跳个海藻舞”、“点点头吧”）转化为对预定义机器人控制工具的调用。
  - Python客户端接收LLM的工具调用请求，将其转换为MCP兼容的JSON命令，发送给MCP桥接服务器，进而控制模拟器或真实机械臂。
  - 机器人操作的结果（成功、失败、警告）会反馈给LLM，使其能够进行后续的决策或向用户报告。
亮点与特点：
1. 端到端智能控制链路： 实现了从自然语言输入 -> LLM理解与规划 -> MCP服务 -> 模拟器/真实机械臂执行 -> 结果反馈回LLM 的完整闭环。
2. 虚实结合与低成本验证： 模拟器为AI算法的开发和测试提供了安全、高效的环境，并能无缝迁移到对真实硬件的控制。
3. 用户友好的交互界面： 提供清晰的控制面板、状态显示和即时反馈，提升用户体验。
4. 模块化与可扩展性： 系统各组件（模拟器、控制器、MCP服务、AI客户端）相对独立，便于未来功能扩展和技术升级。

三、MCP服务与客户端的构建

服务端
mcp = FastMCP()
mcp客户端

四、Agentic AI 平台框架与智能体构建

定义工具 (mcp Tools for LLM)：
1. 我们为LLM预定义了一系列与机器人控制相关的“工具”：
  - set_robot_servo_angle: 通过ID和角度控制单个舵机。
  - set_robot_joint_angle: 通过URDF关节名和角度控制单个关节。
  - set_robot_all_servo_angles: 同时控制多个舵机。
2. 每个工具描述都包含了名称、功能说明、以及详细的参数定义（类型、描述、是否必需）。这使得LLM能够理解每个工具的用途和如何正确调用它。
LLM交互流程 (在Python客户端中实现)：
1. 用户指令输入： 用户向Python客户端输入自然语言指令（例如“让机械臂的第一个关节向上转30度并重复两次这个摇摆动作”）。
2. 调用LLM API： Python脚本将用户的指令以及预定义的机器人控制工具列表发送给DeepSeek API。我们设置 tool_choice=“auto”，允许LLM自行判断何时以及如何使用这些工具。
3. LLM生成工具调用： 如果LLM认为需要操作机器人来完成用户指令，它会在API响应中返回一个或多个 tool_calls 对象。每个 tool_call 包含要调用的函数名（我们定义的工具名）和参数（由LLM根据用户指令生成的JSON字符串）。
4. Python执行工具调用：
  - Python脚本解析 tool_call，获取函数名和参数。
  - 调用 execute_robot_tool_call 函数，该函数将LLM的抽象工具调用转换为具体的MCP命令JSON。
5. 获取机器人操作结果： Python脚本等待从MCP桥接服务器返回的操作回执（表示成功、失败或警告）。
6. 将结果反馈给LLM： Python脚本将机器人操作的结果（格式化为JSON字符串）作为 role: “tool” 的消息，连同之前的聊天历史，再次发送给DeepSeek API。
7. LLM生成最终回复： LLM在收到工具执行结果后，会生成一个对用户指令的最终回复，例如确认操作完成、报告错误或请求下一步指令。
Agentic特性：
- 感知-思考-行动循环： LLM接收用户输入（感知），通过工具调用进行规划（思考），Python脚本执行工具调用并操作机器人（行动），机器人操作结果反馈给LLM（再次感知），形成闭环。
- 多步推理与复杂任务分解： 对于如“重复三次摇摆动作”这样的指令，LLM能够理解并连续生成多个工具调用来实现。
- **多mcp服务器的支持： **可以添加例如天气的mcp服务器，这样就可以问机械臂，今天深圳下雨吗，下雨的话你就点点头，LLM可以自动调用多个mcp服务器，基于流程判断。
- 与环境交互： 虽然目前主要是单向控制，但反馈机制为未来实现更复杂的双向交互（如基于视觉的调整）打下了基础。

通过这种方式，我们成功地将LLM的自然语言理解和规划能力与机械臂的物理执行能力结合起来，构建了一个初步的智能体雏形。

五、技术创新点

基于WebSocket的轻量级MCP桥接服务： MCP服务器最终是要控制真实的硬件的，服务器是可以部署到线上，但是跟硬件的沟通的控制实现还是在端侧，我们没有采用复杂的RPC框架或重量级的消息队列，而是设计了一个简洁高效的WebSocket桥接服务，实现了Python AI客户端与浏览器端Three.js模拟器之间的低延迟、双向通信。这使得命令传递和状态反馈快速直接。
动态Blob消息处理： 在实现浏览器端WebSocket消息接收时，我们发现即使服务器发送的是文本帧，浏览器有时也会将 event.data 识别为 Blob 对象。我们通过异步读取 Blob.text() 内容，确保了消息的正确解析，增强了通信的鲁棒性。
虚实同步与错误恢复机制（针对真实舵机）：
- 在连接真实舵机时，系统不仅将指令同步到硬件，还会尝试从舵机读取初始位置，并将其作为后续控制的基准。
- 实现了舵机通信状态的实时UI反馈（idle, pending, success, warning, error）。
- 当舵机操作失败或发生错误时，系统会记录最后一个安全位置，并尝试将舵机恢复到该位置，增强了物理操作的安全性。同时，错误信息会通过MCP服务反馈给AI客户端。

六、UI/UX优化

响应式控制面板： 采用固定定位和最大高度限制，确保在不同屏幕尺寸下均能良好显示，并通过 overflow: auto 实现内容溢出时的滚动。隐藏了原生滚动条，使界面更简洁。
可折叠区域 (Collapsible Sections)： 将控制面板的不同功能模块（键盘控制、真实机器人、MCP服务）组织在可折叠的区域内，用户可以按需展开或收起，保持界面整洁有序。图标（▼/►）直观指示展开状态。
实时状态反馈：
- 键盘按键按下时有视觉高亮 (key-pressed 类) 和控制区高亮 (control-active 类)。
- 舵机连接状态、每个舵机的通信状态（idle, pending, success, warning, error）以及具体的错误信息都实时显示在UI上，并用不同颜色区分。
即时警告提醒： 对于虚拟关节超限或真实舵机操作失败/错误，屏幕顶部会弹出非阻塞式的、颜色鲜明的警告框 (jointLimitAlert, servoLimitAlert)，并在几秒后自动消失。
清晰的帮助提示： 在关键操作按钮（如“连接真实机械臂”）旁添加了帮助图标，鼠标悬浮可显示操作说明和注意事项。
平滑的机械臂动画： 如前所述，关节运动采用插值和缓动函数，提供了流畅的视觉体验。
一致的按钮风格： 对连接按钮等交互元素采用了统一的视觉风格，并根据连接状态改变颜色。

七、团队贡献

[张泽华] - 项目负责人/架构设计/后端开发：
[唐杨] - 前端开发/Three.js模拟器：
[肖琦] - AI集成/Python客户端开发：
[肖凯骏，李印东] - 真实机械臂集成/测试与文档：

前端代码部分参考开源项目： https://github.com/timqian/bambot

八、未来TODO

本项目为构建更智能、更易用的机器人交互系统打下了坚实的基础。未来，我们计划从以下几个方面进行深化和拓展：

增强AI感知与交互能力：
- 集成视觉反馈： 引入摄像头和计算机视觉算法（或利用LLM的多模态能力），使机械臂能够“看到”环境，实现基于视觉的物体识别、定位和抓取任务，并能根据视觉反馈调整动作。
提升模拟器保真度与功能：
- 物理引擎集成： 引入Bullet、Ammo.js或Rapier等物理引擎，实现更真实的碰撞检测、重力、摩擦力等物理效果，支持更复杂的抓取和操作模拟。
优化MCP服务与多Agent协作：
- 更丰富的MCP指令集： 扩展MCP协议，支持查询机器人状态（如各关节当前角度、末端执行器位置）、设置速度/加速度、控制夹爪等更细致的操作。

Dev Tools Supporting MCP

The following are the main code editors that support the Model Context Protocol. Click the link to visit the official website for more information.

Zed: High-performance collaborative code editor, supports MCP protocol, providing a smooth programming experience. zed.dev

Cursor: AI code editor built on VS Code, supports MCP protocol for context-aware programming. cursor.com

Windsurf: AI code editor from Codeium, integrates MCP protocol to provide intelligent code assistance. windsurf.com

Continue: Open-source AI programming assistant plugin, supports VS Code and JetBrains, compatible with MCP protocol. continue.dev

Trae: AI-driven code editor, supports MCP protocol, focusing on enhancing developer programming experience. trae.ai

View More MCP Dev Tools

Tools

No tools

Comments

Recommend MCP Servers

Tavily MCP Server The Tavily MCP server provides: search, extract, map, crawl tools Real-time web search capabilities through the tavily-search tool Intelligent data extraction from web pages via the tavily-extract tool Powerful web mapping tool that creates a structured map of website Web crawler that systematically explores websites.

MCP Server Chart This is a TypeScript-based MCP server that provides chart generation capabilities. It allows you to create various types of charts through MCP tools. You can also use it in Dify.

GitHub MCP Server MCP Server for the GitHub API, enabling file operations, repository management, search functionality, and more.

Brave Search MCP Server Web and local search using Brave's Search API

Firecrawl MCP Server Advanced web scraping with JavaScript rendering, PDF support, and smart rate limiting

Context7 MCP LLMs rely on outdated or generic information about the libraries you use. You get:

Slack MCP server Channel management and messaging capabilities

Sequential Thinking MCP Server Dynamic and reflective problem-solving through thought sequences

Fetch MCP Server A Model Context Protocol server that provides web content fetching capabilities.

Playwright MCP A Model Context Protocol (MCP) server that provides browser automation capabilities using [Playwright](https://playwright.dev). This server enables LLMs to interact with web pages through structured accessibility snapshots, bypassing the need for screenshots or visually-tuned models.

View All MCP Servers