Ocrtool Mcp

@ihugang, a year ago
MCP OCR module using macOS Vision framework

Overview

What is Ocrtool Mcp

ocrtool-mcp is an open-source, macOS-native OCR module built with Swift and the Vision framework, designed to comply with the Model Context Protocol (MCP). It lets LLM tools invoke OCR via JSON-RPC.

Use cases

Use cases include extracting text from scanned documents, reading text from images in mobile applications, and integrating OCR capabilities into custom software solutions.

How to use

To use ocrtool-mcp, clone the repository, build it using Swift, and run it as an MCP module. Send a JSON-RPC request via stdin with parameters such as image path and language.

Key features

Key features include accurate OCR powered by the macOS Vision framework, recognition of both Chinese and English text, an MCP-compatible JSON-RPC interface, line-wise OCR results with bounding boxes, lightweight and fast performance, and a free, open-source codebase.

Where to use

ocrtool-mcp can be used in various fields such as document digitization, automated data entry, accessibility tools, and any application requiring text recognition from images.

Content

ocrtool-mcp

🇨🇳 中文文档

ocrtool-mcp is an open-source, macOS-native OCR module built with Swift and the Vision framework, designed to comply with the Model Context Protocol (MCP). It can be invoked by LLM tools like Cursor, Continue, OpenDevin, or custom agents using JSON-RPC over stdin.


✨ Features

  • ✅ Accurate OCR powered by macOS Vision Framework
  • ✅ Recognizes both Chinese and English text
  • ✅ MCP-compatible JSON-RPC interface
  • ✅ Returns line-wise OCR results with bounding boxes (in pixels)
  • ✅ Lightweight, fast, and fully offline
  • ✅ Free, open-source software

🚀 Quick Start

git clone https://github.com/ihugang/ocrtool-mcp.git
cd ocrtool-mcp
swift build -c release

Run as MCP Module:

.build/release/ocrtool-mcp

Send a JSON-RPC request via stdin:

{
  "jsonrpc": "2.0",
  "id": "1",
  "method": "ocr_text",
  "params": {
    "image_path": "test.jpg",
    "lang": "zh+en",
    "enhanced": true
  }
}
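
For a custom agent, the request above could be built and sent from a script. The following is a minimal Python sketch, not part of the project. The helper names `build_ocr_request` and `call_ocrtool` are hypothetical, and the sketch assumes the binary reads a single-line JSON request from stdin and prints one JSON response on stdout, which the README does not spell out.

```python
import json
import subprocess


def build_ocr_request(image_path, lang="zh+en", enhanced=True, request_id="1"):
    """Build the JSON-RPC 2.0 request shown above as a Python dict."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "ocr_text",
        "params": {"image_path": image_path, "lang": lang, "enhanced": enhanced},
    }


def call_ocrtool(request, binary=".build/release/ocrtool-mcp", timeout=30):
    """Send one request to the binary over stdin and return the parsed reply.

    Assumption: the module accepts a single-line JSON request and writes
    exactly one JSON document to stdout; adjust if the real framing differs.
    """
    completed = subprocess.run(
        [binary],
        input=json.dumps(request) + "\n",
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(completed.stdout)
```

After `swift build -c release`, calling `call_ocrtool(build_ocr_request("test.jpg"))` from the repository root would return the reply as a dict, provided the framing assumption holds.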

Expected output:

{
  "jsonrpc": "2.0",
  "id": "1",
  "result": {
    "lines": [
      {
        "text": "你好",
        "bbox": {
          "x": 120,
          "y": 200,
          "width": 300,
          "height": 20
        }
      },
      {
        "text": "Hello",
        "bbox": {
          "x": 122,
          "y": 240,
          "width": 290,
          "height": 20
        }
      }
    ]
  }
}
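
A caller will usually want the recognized lines as typed values rather than nested dicts. The sketch below is an illustration, not the project's code. It parses a reply shaped like the sample above, and the names `OcrLine` and `parse_ocr_lines` are hypothetical. It keeps the lines in the order the tool returned them and makes no assumption about the coordinate origin.

```python
from dataclasses import dataclass


@dataclass
class OcrLine:
    text: str
    x: float
    y: float
    width: float
    height: float


def parse_ocr_lines(response):
    """Turn a reply dict shaped like the sample above into OcrLine objects."""
    lines = []
    for item in response["result"]["lines"]:
        box = item["bbox"]
        lines.append(
            OcrLine(item["text"], box["x"], box["y"], box["width"], box["height"])
        )
    return lines


# The sample reply from above, reproduced as a Python dict.
SAMPLE_REPLY = {
    "jsonrpc": "2.0",
    "id": "1",
    "result": {
        "lines": [
            {"text": "你好", "bbox": {"x": 120, "y": 200, "width": 300, "height": 20}},
            {"text": "Hello", "bbox": {"x": 122, "y": 240, "width": 290, "height": 20}},
        ]
    },
}

ocr_lines = parse_ocr_lines(SAMPLE_REPLY)
# Join the recognized text in the order returned by the tool.
full_text = "\n".join(line.text for line in ocr_lines)
```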

📁 Project Structure

.
├── Package.swift
├── Sources/OCRToolMCP/main.swift
├── .mcp/
│   ├── config.json
│   └── schema/ocr_text.json
├── README.md
├── LICENSE
└── .gitignore

📘 MCP Integration

You can use this module with:

  • Continue
  • Cursor
  • Any custom LLM agent that supports MCP stdin/stdout JSON-RPC
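
A custom agent also has to decide what to do when a call fails. Under JSON-RPC 2.0, a reply carries either a `result` member or an `error` object with `code` and `message`. The Python sketch below shows one way to unwrap a reply on that basis. The names `OcrRpcError` and `unwrap_result` are hypothetical, and how ocrtool-mcp itself reports failures is an assumption, since the README does not document it.

```python
class OcrRpcError(Exception):
    """Raised when a JSON-RPC reply carries an error object."""

    def __init__(self, code, message):
        super().__init__(f"JSON-RPC error {code}: {message}")
        self.code = code
        self.message = message


def unwrap_result(reply, expected_id):
    """Return reply["result"], or raise if the reply is malformed or an error."""
    if reply.get("jsonrpc") != "2.0":
        raise ValueError("not a JSON-RPC 2.0 reply")
    if reply.get("id") != expected_id:
        raise ValueError(f"reply id {reply.get('id')!r} does not match {expected_id!r}")
    if "error" in reply:
        # Assumption: the tool follows the JSON-RPC 2.0 error object shape.
        err = reply["error"]
        raise OcrRpcError(err.get("code"), err.get("message"))
    return reply["result"]
```

Matching the `id` before trusting the payload matters once an agent keeps more than one request in flight over the same stdin/stdout pair.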

🛠 Cursor Configuration

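To use this module in Cursor, add the following to your Cursor MCP configuration file (cursor.json, or ~/.cursor/mcp.json in current Cursor versions), setting the command value to the full path of the built ocrtool-mcp binary: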

{
  "mcpServers": {
    "ocrtool-mcp": {
      "command": "Full path ... /ocrtool-mcp"
    }
  }
}
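
If the configuration file already lists other servers, the entry can be merged in rather than pasted by hand. The following Python sketch is one way to do that on an in-memory config dict. `add_mcp_server` is a hypothetical helper, and the path `/opt/tools/ocrtool-mcp` mentioned in the usage note is a placeholder, not a location the project prescribes.

```python
from pathlib import Path


def add_mcp_server(config, name, binary_path):
    """Return a copy of a Cursor-style config with one more mcpServers entry.

    The command must be a full path, as the configuration above requires,
    so relative paths are rejected instead of being silently resolved.
    """
    path = Path(binary_path).expanduser()
    if not path.is_absolute():
        raise ValueError(f"expected a full path, got {binary_path!r}")
    servers = dict(config.get("mcpServers", {}))  # keep existing servers
    servers[name] = {"command": str(path)}
    merged = dict(config)
    merged["mcpServers"] = servers
    return merged
```

For example, `add_mcp_server(existing_config, "ocrtool-mcp", "/opt/tools/ocrtool-mcp")` returns a new dict and leaves `existing_config` untouched. Writing it back with `json.dumps(..., indent=2)` is left to the caller.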

👨‍💻 Author

ihugang (https://github.com/ihugang)

📝 License

MIT License
