PyMCPAutoGUI

# Graphical User Interface Operation MCP Server

What is PyMCPAutoGUI

PyMCPAutoGUI is a GUI manipulation MCP server that allows AI agents to interact with the graphical user interface of a computer, enabling them to see the screen, control the mouse and keyboard, and manage windows like a human user.

Use cases

Use cases for PyMCPAutoGUI include automating data entry in applications, testing user interfaces, creating bots for desktop applications, and enhancing AI agents with GUI manipulation capabilities.

How to use

To use PyMCPAutoGUI, install it with pip (pip install pymcpautogui), start the server with python -m pymcpautogui.server, and connect it to your MCP-compatible client, such as the Cursor editor. The server exposes GUI automation functions as tools that the client can call.

Key features

Key features include mouse and keyboard control, screen perception (taking screenshots and locating images on the screen), window management (position, size, minimize, maximize), and user interaction through alert, confirmation, and prompt boxes.

Where to use

PyMCPAutoGUI can be used in various fields including software testing, automation of repetitive tasks, and development of AI assistants that require GUI interaction.

Content

PyMCPAutoGUI 🖱️⌨️🖼️ - GUI Automation via MCP

Supercharge your AI Agent’s capabilities! ✨ PyMCPAutoGUI provides a bridge between your AI agents (like those in Cursor or other MCP-compatible environments) and your computer’s graphical user interface (GUI). It allows your agent to see the screen 👁️, control the mouse 🖱️ and keyboard ⌨️, and interact with windows 🪟, just like a human user!

Stop tedious manual GUI tasks and let your AI do the heavy lifting 💪. Perfect for automating repetitive actions, testing GUIs, or building powerful AI assistants 🤖.

🤔 Why Choose PyMCPAutoGUI?

🤖 Empower Your Agents: Give your AI agents the power to interact directly with desktop applications.
✅ Simple Integration: Works seamlessly with MCP-compatible clients like the Cursor editor. It’s plug and play!
🚀 Easy to Use: Get started with a simple server command. Seriously, it’s that easy.
🖱️⌨️ Comprehensive Control: Offers a wide range of GUI automation functions from the battle-tested PyAutoGUI and PyGetWindow.
🖼️ Screen Perception: Includes tools for taking screenshots and locating images on the screen – let your agent see!
🪟 Window Management: Control window position, size, state (minimize, maximize), and more. Tidy up that desktop!
💬 User Interaction: Display alert, confirmation, and prompt boxes to communicate with the user.

🛠️ Supported Environments

Operating Systems: Windows, macOS, Linux (Requires appropriate dependencies for pyautogui on each OS)
Python: 3.11+ 🐍
MCP Clients: Cursor Editor, any client supporting the Model Context Protocol (MCP)

🚀 Getting Started - It’s Super Easy!

1. Installation (Recommended: Use a Virtual Environment!)

Using a virtual environment keeps your project dependencies tidy.

# Create and activate a virtual environment (example using venv)
python -m venv .venv
# Windows PowerShell
.venv\Scripts\Activate.ps1
# macOS / Linux bash
source .venv/bin/activate

# Install using pip (from PyPI or local source)
# Make sure your virtual environment is active!
pip install pymcpautogui # Or pip install . if installing from local source

(Note: pyautogui might have system dependencies like scrot on Linux for screenshots. Please check the pyautogui documentation for OS-specific installation requirements.)
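Before starting the server, it can help to confirm that the active interpreter meets the Python 3.11+ requirement and that the packages are importable. The following is a minimal sketch using only the standard library. The import names pymcpautogui, pyautogui, and pygetwindow are taken from the commands and notes in this document; the helper check_environment is a hypothetical name introduced here for illustration.

```python
import importlib.util
import sys

# Import names assumed from the commands and notes above; adjust if they differ.
REQUIRED_MODULES = ("pymcpautogui", "pyautogui", "pygetwindow")
MIN_PYTHON = (3, 11)


def check_environment(modules=REQUIRED_MODULES, min_python=MIN_PYTHON):
    """Return a dict describing whether the interpreter and modules look usable."""
    report = {"python_ok": sys.version_info[:2] >= tuple(min_python)}
    for name in modules:
        # find_spec returns None when a top-level module cannot be found.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for item, ok in check_environment().items():
        print(f"{item}: {'ok' if ok else 'MISSING'}")
```

Run it inside the activated virtual environment. This only checks that the modules can be found, not that OS-level dependencies such as scrot are installed.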

2. Running the MCP Server

Once installed, simply run the server from your terminal:

# Make sure your virtual environment is activated!
python -m pymcpautogui.server

The server will start and listen for connections (defaulting to port 6789). Look for this output:

INFO:     Started server process [XXXXX]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:6789 (Press CTRL+C to quit)

Keep this terminal running while you need the GUI automation magic! ✨
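If you are unsure whether the server is still running, a quick TCP connection test against the address shown in the log output can tell you. This is a small standard-library sketch, not part of PyMCPAutoGUI itself. The function name server_is_listening is hypothetical, and the defaults assume the 127.0.0.1:6789 address from the output above.

```python
import socket


def server_is_listening(host="127.0.0.1", port=6789, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout, or bad address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    state = "reachable" if server_is_listening() else "not reachable"
    print(f"PyMCPAutoGUI server on 127.0.0.1:6789 is {state}")
```

A successful connection only shows that something is listening on the port. It does not verify that the listener is the PyMCPAutoGUI server.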

✨ Seamless Integration with Cursor Editor

Connect PyMCPAutoGUI to Cursor and invoke it with the @ symbol to run GUI automation directly within your coding workflow.

Open MCP Configuration: In Cursor, use the Command Palette (Ctrl+Shift+P or Cmd+Shift+P) and find “MCP: Open mcp.json configuration file”.
Add PyMCPAutoGUI Config: Add or merge this configuration into your mcp.json. Adjust paths if needed (especially if Cursor isn’t running from the project root).
(Tip: If mcp.json already exists, just add the "PyMCPAutoGUI": { ... } part inside the mcpServers object.)
Save mcp.json. Cursor will detect the server.
Automate! Use @PyMCPAutoGUI in Cursor chats:

Example:
@PyMCPAutoGUI move_to(x=100, y=200)
@PyMCPAutoGUI write(text='Automating with AI! 🎉', interval=0.1)
@PyMCPAutoGUI screenshot(filename='current_screen.png')
@PyMCPAutoGUI activate_window(title='Notepad')
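The configuration step above says to add the "PyMCPAutoGUI" entry inside the mcpServers object when mcp.json already exists. That merge can also be scripted so existing servers are left untouched. The sketch below assumes only that mcp.json is a JSON object with an mcpServers key, as the tip describes. The helper merge_mcp_server is a hypothetical name, and the example entry is a placeholder for illustration, not the project's documented configuration.

```python
import json
import tempfile
from pathlib import Path


def merge_mcp_server(config_path, name, entry):
    """Add or replace one server entry under "mcpServers", keeping the others."""
    path = Path(config_path)
    # Start from the existing file if there is one, otherwise from an empty config.
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    config.setdefault("mcpServers", {})[name] = entry
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


if __name__ == "__main__":
    # Hypothetical entry for illustration only; use the entry the project provides.
    example_entry = {"command": "python", "args": ["-m", "pymcpautogui.server"]}
    # Write to a temporary file so the demo never touches a real mcp.json.
    with tempfile.TemporaryDirectory() as tmp:
        merged = merge_mcp_server(Path(tmp) / "mcp.json", "PyMCPAutoGUI", example_entry)
        print(json.dumps(merged, indent=2))
```

Back up your real mcp.json before pointing a script like this at it, since the file is rewritten in full.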

🧰 Available Tools

PyMCPAutoGUI exposes most functions from pyautogui and pygetwindow. Examples include:

Mouse 🖱️: move_to, click, move_rel, drag_to, drag_rel, scroll, mouse_down, mouse_up, get_position
Keyboard ⌨️: write, press, key_down, key_up, hotkey
Screenshots 🖼️: screenshot, locate_on_screen, locate_center_on_screen
Windows 🪟: get_all_titles, get_windows_with_title, get_active_window, activate_window, minimize_window, maximize_window, restore_window, move_window, resize_window, close_window
Dialogs 💬: alert, confirm, prompt, password
Config ⚙️: set_pause, set_failsafe

For the full list and details, check the pymcpautogui/server.py file or use @PyMCPAutoGUI list_tools in your MCP client.
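Many of the tool names above look like snake_case versions of PyAutoGUI's camelCase function names (for example, move_to and PyAutoGUI's moveTo). The sketch below prints that likely correspondence for the mouse, keyboard, and screenshot tools. It is an assumption based on the naming pattern, not a statement of how the server is implemented, and pymcpautogui/server.py remains the authoritative reference. The helper snake_to_camel is a hypothetical name, and tools such as get_position, which do not follow the pattern, are left out.

```python
def snake_to_camel(name):
    """Convert snake_case to lowerCamelCase, e.g. move_to -> moveTo."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


# Tool names copied from the list above (mouse, keyboard, screenshots).
TOOLS = [
    "move_to", "click", "move_rel", "drag_to", "drag_rel", "scroll",
    "mouse_down", "mouse_up", "write", "press", "key_down", "key_up",
    "hotkey", "screenshot", "locate_on_screen", "locate_center_on_screen",
]

if __name__ == "__main__":
    for tool in TOOLS:
        # Assumed correspondence; verify against pymcpautogui/server.py.
        print(f"{tool:>24} -> pyautogui.{snake_to_camel(tool)}")
```

Knowing the likely underlying function makes it easier to look up parameters and behavior in the PyAutoGUI documentation.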

📄 License

This project is licensed under the MIT License - see the LICENSE file for details. Happy Automating! 😄
