iOS Interact MCP

@AFDudley, 20 days ago

1 AGPL-3.0

Control iOS simulators with OCR and MCP for app actions and screenshots.

What is iOS Interact MCP

ios-interact-mcp is a Model Context Protocol (MCP) server for controlling iOS simulators. It lets an MCP client click UI elements, launch and terminate apps, and capture screenshots programmatically.

Use cases

Use cases include automated UI testing, simulating user interactions, capturing screenshots for documentation, and driving the simulator from MCP-aware development tools.

How to use

To use ios-interact-mcp, install it via pip or from source, then either register it in the Claude Desktop configuration or run it standalone. To use it with Claude Code, start the server with SSE transport and connect Claude Code to it.

Key features

Key features include click actions on UI elements using OCR, app control for launching and terminating applications, screenshot capturing, text finding and interaction, deep linking to URLs, hardware button simulation, and window management.

Where to use

ios-interact-mcp is primarily used in mobile application development and testing environments, especially for automating interactions with iOS applications in simulators.

iOS Interact MCP Server

Control iOS simulators through the Model Context Protocol (MCP).

NOTE: This is AI slop; there are dragons all over the place. I think the majority of the tests are fake or otherwise nonsensical. I have manually tested everything, and the only bug I’ve found is in the complex gestures Claude made.

Features

Click Actions: Click on UI elements by text (located with OCR) or by coordinates
App Control: Launch and terminate iOS applications
Screenshots: Capture simulator screenshots with OCR support
Text Finding: Find and interact with text elements using OCR
Deep Linking: Open URLs in the simulator
Hardware Buttons: Simulate hardware button presses
Window Management: List and control simulator windows

Requirements

macOS with Xcode installed
Python 3.10 or higher
iOS Simulator
MCP-compatible client (e.g., Claude Desktop)

Installation

Via pip

pip install ios-interact-mcp

From source

git clone https://github.com/AFDudley/ios-interact-mcp.git
cd ios-interact-mcp
pip install -e .

Configuration

Claude Desktop

Add to your Claude Desktop configuration (~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "ios-interact": {
      "command": "ios-interact-mcp"
    }
  }
}
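
If the configuration file already lists other servers, the entry needs to be merged in rather than pasted over the file. The following is a minimal sketch of that merge using only the standard library. The helper names add_ios_interact and update_config_file are hypothetical and not part of ios-interact-mcp; the path and the entry itself are taken from the snippet above.

```python
import json
from pathlib import Path

# Path as given in this README; adjust if your installation differs.
CONFIG_PATH = (
    Path.home() / "Library/Application Support/Claude/claude_desktop_config.json"
)


def add_ios_interact(config: dict) -> dict:
    """Return a copy of config with the ios-interact entry merged in.

    Existing servers and unrelated top-level keys are preserved.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["ios-interact"] = {"command": "ios-interact-mcp"}
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path = CONFIG_PATH) -> None:
    """Read the config if it exists, merge the entry, and write it back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(add_ios_interact(existing), indent=2) + "\n")


if __name__ == "__main__":
    # Preview the merged result without touching the real file.
    print(json.dumps(add_ios_interact({"mcpServers": {}}), indent=2))
```

Calling update_config_file() writes the merged result to disk; Claude Desktop typically needs a restart before it picks up the change.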

Standalone Usage

# Run with stdio transport (default)
ios-interact-mcp

# Run with SSE transport for debugging
ios-interact-mcp --transport sse

Using with Claude Code

To use the MCP server with Claude Code, you need to start it with SSE transport and then connect Claude Code to it:

Start the SSE server:

# Start the server on port 8000 (default)
ios-interact-mcp --transport sse

# Or specify a custom port
ios-interact-mcp --transport sse --port 37849

Connect Claude Code to the server:

# Add the MCP server to Claude Code
claude mcp add -t sse ios-interact http://localhost:8000/sse

# Or if using a custom port
claude mcp add -t sse ios-interact http://localhost:37849/sse

Verify the connection:

# List configured MCP servers
claude mcp list

# Get details about the ios-interact server
claude mcp get ios-interact

Remove the server (when done):

claude mcp remove ios-interact -s local
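
The port passed to the server and the port in the URL given to Claude Code have to match. As a rough sketch, the two command lines can be generated from a single port value so they cannot drift apart. The function names below are hypothetical; the flags and the /sse path are the ones shown in the commands above.

```python
import shlex
from typing import List, Optional


def sse_url(port: int = 8000, host: str = "localhost") -> str:
    """URL of the SSE endpoint, following the pattern used in this README."""
    return f"http://{host}:{port}/sse"


def server_command(port: Optional[int] = None) -> List[str]:
    """Command that starts the server with SSE transport."""
    cmd = ["ios-interact-mcp", "--transport", "sse"]
    if port is not None:
        cmd += ["--port", str(port)]
    return cmd


def claude_add_command(port: int = 8000, name: str = "ios-interact") -> List[str]:
    """Command that registers the running server with Claude Code."""
    return ["claude", "mcp", "add", "-t", "sse", name, sse_url(port)]


if __name__ == "__main__":
    port = 37849
    print(shlex.join(server_command(port)))
    print(shlex.join(claude_add_command(port)))
```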

Available Tools

click_text

Click on text found in the simulator using OCR.

click_text(text: string, occurrence?: number, simulator_name?: string)

click_at_coordinates

Click at specific screen coordinates.

click_at_coordinates(x: number, y: number, coordinate_space?: "screen")

launch_app

Launch an iOS application.

launch_app(bundle_id: string)

terminate_app

Terminate a running iOS application.

terminate_app(bundle_id: string)

screenshot

Take a screenshot of the simulator.

screenshot(filename?: string, return_path?: boolean)

find_text_in_simulator

Find text elements in the simulator using OCR.

find_text_in_simulator(search_text?: string, simulator_name?: string)

list_apps

List all installed applications.

list_apps()

open_url

Open a URL in the simulator (for deep linking).

open_url(url: string)

press_button

Press a hardware button.

press_button(button_name: "home" | "lock" | "volume_up" | "volume_down")

list_simulator_windows

List all simulator windows with their positions and sizes.

list_simulator_windows()
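
On the wire, an MCP client invokes a tool by sending a JSON-RPC 2.0 request with the method tools/call. The sketch below builds such a request and checks the arguments against the signatures listed above. The REQUIRED_PARAMS table and the tool_call_request helper are illustrative and transcribed from this README, not taken from the server's source, so the server's own schema remains the authority.

```python
import itertools
import json

# Required parameters per tool, transcribed from the signatures in this README.
REQUIRED_PARAMS = {
    "click_text": {"text"},
    "click_at_coordinates": {"x", "y"},
    "launch_app": {"bundle_id"},
    "terminate_app": {"bundle_id"},
    "screenshot": set(),
    "find_text_in_simulator": set(),
    "list_apps": set(),
    "open_url": {"url"},
    "press_button": {"button_name"},
    "list_simulator_windows": set(),
}
BUTTONS = {"home", "lock", "volume_up", "volume_down"}

_request_ids = itertools.count(1)


def tool_call_request(name: str, **arguments) -> dict:
    """Build a JSON-RPC 2.0 'tools/call' request for one of the tools."""
    if name not in REQUIRED_PARAMS:
        raise ValueError(f"unknown tool: {name}")
    missing = REQUIRED_PARAMS[name] - set(arguments)
    if missing:
        raise ValueError(f"{name} is missing arguments: {sorted(missing)}")
    if name == "press_button" and arguments["button_name"] not in BUTTONS:
        raise ValueError(f"unsupported button: {arguments['button_name']}")
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


if __name__ == "__main__":
    print(json.dumps(tool_call_request("click_text", text="Settings"), indent=2))
```

In practice an MCP client library builds these messages for you; the sketch is only meant to show what a tool call looks like underneath.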

Usage Examples

Basic Automation

# Click on Settings app
await click_text("Settings")

# Navigate to General
await click_text("General")

# Take a screenshot
await screenshot("general_settings.png")

App Testing

# Launch your app
await launch_app("com.yourcompany.yourapp")

# Click on UI elements
await click_text("Login")

# Open a deep link
await open_url("yourapp://profile")

# Capture state
await screenshot("profile_screen.png")
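
The snippets above are shorthand for tool calls made through an MCP client, not a Python API exposed by this package. One way to script the same steps is sketched below. It takes a hypothetical call_tool parameter, any async callable accepting a tool name and an arguments dictionary, so the flow can be dry-run without a simulator. With the official MCP Python SDK, an initialized ClientSession's call_tool method would be the natural thing to pass in, but that wiring is an assumption and is not shown here.

```python
import asyncio
from typing import Any, Awaitable, Callable, List

# Any async callable taking (tool_name, arguments).
ToolCaller = Callable[[str, dict], Awaitable[Any]]


async def capture_general_settings(call_tool: ToolCaller) -> List[Any]:
    """Run the 'Basic Automation' steps in order and collect the results."""
    steps = [
        ("click_text", {"text": "Settings"}),
        ("click_text", {"text": "General"}),
        ("screenshot", {"filename": "general_settings.png"}),
    ]
    results = []
    for name, arguments in steps:
        # Each call is awaited before the next so the UI has been acted on
        # before the following step is issued.
        results.append(await call_tool(name, arguments))
    return results


async def dry_run(name: str, arguments: dict) -> str:
    """Stand-in caller that only logs what would be sent."""
    print(f"would call {name} with {arguments}")
    return name


if __name__ == "__main__":
    asyncio.run(capture_general_settings(dry_run))
```

Keeping the caller injectable also makes it straightforward to add a delay or a find_text_in_simulator check between steps if the simulator needs time to redraw.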

Permissions

For OCR functionality to work properly, you need to grant accessibility permissions:

Go to System Preferences > Security & Privacy > Accessibility (on macOS 13 and later: System Settings > Privacy & Security > Accessibility)
Add Terminal (or your IDE) to the allowed applications
Restart the application if needed

Development

Setup Development Environment

# Clone the repository
git clone https://github.com/AFDudley/ios-interact-mcp.git
cd ios-interact-mcp

# Install in development mode with dev dependencies
pip install -e ".[dev]"

# Install pre-commit hooks
pre-commit install

Running Tests

# Run all tests
python -m pytest tests/

# Run specific test
python -m pytest tests/test_ocr_controller.py

Code Quality

The project uses:

Black for code formatting
Flake8 for linting
Pyright for type checking

These are automatically run on commit via pre-commit hooks.

Troubleshooting

OCR Not Working

Ensure you have granted accessibility permissions to Terminal/your IDE
Check that the simulator window is visible and not minimized
Verify ocrmac is installed: pip install ocrmac

Click Actions Failing

Verify the simulator is in focus
Ensure the target text is visible on screen
Try using find_text_in_simulator first to verify OCR is working

“No booted devices” Error

Make sure iOS Simulator is running:

open -a Simulator
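
To confirm that a device is actually booted, rather than just that the Simulator app is open, you can inspect the output of xcrun simctl. The sketch below parses the JSON that "xcrun simctl list devices --json" normally emits, a "devices" object mapping runtime identifiers to lists of devices with a "state" field. That layout is an assumption based on typical simctl output, and the function names are hypothetical; nothing here reflects how the server itself performs the check.

```python
import json
import subprocess


def booted_devices(simctl_json: str) -> list:
    """Return the device entries whose state is 'Booted'."""
    data = json.loads(simctl_json)
    return [
        device
        for devices in data.get("devices", {}).values()
        for device in devices
        if device.get("state") == "Booted"
    ]


def list_booted() -> list:
    """Query simctl (macOS with Xcode only) and return booted devices."""
    completed = subprocess.run(
        ["xcrun", "simctl", "list", "devices", "--json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return booted_devices(completed.stdout)


if __name__ == "__main__":
    booted = list_booted()
    if not booted:
        print("No booted devices; try: open -a Simulator")
    for device in booted:
        print(device.get("name"), device.get("udid"))
```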

Permission Errors

Grant the necessary permissions in System Preferences > Security & Privacy > Accessibility (on macOS 13 and later: System Settings > Privacy & Security > Accessibility)

Contributing

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Built on the Model Context Protocol
Uses ocrmac for OCR functionality
Powered by Apple’s Vision framework and xcrun tools
