
Macos Ocr Mcp

@whiteking64, 15 days ago
License: MIT
A macOS tool for OCR using Vision framework, returning text and confidence scores.

Overview

What is Macos Ocr Mcp

macos-ocr-mcp is a Model Context Protocol (MCP) tool for Optical Character Recognition (OCR) on images using macOS’s built-in Vision framework. It provides an ocr_image tool that takes an image file path and returns the recognized text, confidence scores, and bounding boxes.

Use cases

Use cases for macos-ocr-mcp include scanning printed documents to convert them into editable text, extracting text from images for data analysis, and developing applications that require image-based text recognition.

How to use

To use macos-ocr-mcp, first set up a virtual environment and install the dependencies with uv sync. Then start the MCP server with uv run main.py. Call the ocr_image tool with the path to an image file; it returns the recognized text segments with their confidence scores and bounding boxes.

Key features

Key features include OCR on images using macOS’s native Vision framework, results that carry recognized text with confidence scores and bounding box coordinates, and easy integration with Model Context Protocol (MCP) clients.

Where to use

macos-ocr-mcp can be used in various fields such as document digitization, automated data entry, accessibility tools for visually impaired users, and any application requiring text extraction from images.

Content

macOS OCR MCP Tool

This project provides a Model Context Protocol (MCP) tool to perform Optical Character Recognition (OCR) on images using macOS’s built-in Vision framework. It exposes an ocr_image tool that takes an image file path and returns the recognized text along with confidence scores and bounding boxes.

Project Setup

Dependencies

This project relies on Python 3.13+ and the following main dependencies:

  • ocrmac: For accessing macOS OCR capabilities (see the ocrmac project).
  • Pillow: For image manipulation.
  • mcp[cli]>=1.7.1: For the Model Context Protocol server and client.

Installation

It is recommended to use a virtual environment.

  1. Create and activate a virtual environment:

    python -m venv .venv
    source .venv/bin/activate
    
  2. Install dependencies using uv:

    uv sync
    

Running the MCP Server

To start the MCP server, run main.py:

uv run main.py

This will start the MCP server, making the ocr_image tool available.

Available MCP Tools

ocr_image

  • Description: Conducts OCR on the provided image file using macOS’s built-in capabilities. Returns recognized text segments, their confidence scores, and bounding box coordinates.
  • Input: file_path: str - The absolute or relative path to the image file.
  • Output (Example Success):
  • Output (Example Error):
    {
      "error": "OCR functionality is only available on macOS."
    }
    or
    {
      "error": "File not found: path/to/nonexistent/image.png"
    }

Note: This tool will only function correctly on a macOS system due to its reliance on the Vision framework.
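
As an illustration of how a tool like this could be put together, here is a minimal sketch, not the project's actual main.py. It assumes the FastMCP helper from the mcp package and the OCR(...).recognize() call from ocrmac, which yields (text, confidence, bounding box) tuples. The build_server helper and the "segments", "text", "confidence", and "bounding_box" keys are hypothetical names chosen for the example; the two error messages are the ones shown above.

```python
import os
import platform


def ocr_image(file_path: str) -> dict:
    """Run macOS OCR on an image and return segments or an error dict."""
    # The Vision framework only exists on macOS, so fail early elsewhere.
    if platform.system() != "Darwin":
        return {"error": "OCR functionality is only available on macOS."}
    if not os.path.isfile(file_path):
        return {"error": f"File not found: {file_path}"}

    # Imported lazily so this module still loads on non-macOS systems.
    from ocrmac import ocrmac

    # recognize() yields (text, confidence, bounding_box) tuples.
    annotations = ocrmac.OCR(file_path).recognize()
    # Assumed output shape; the key names are illustrative only.
    return {
        "segments": [
            {"text": text, "confidence": confidence, "bounding_box": list(bbox)}
            for text, confidence, bbox in annotations
        ]
    }


def build_server():
    """Register ocr_image on a FastMCP server (requires the mcp package)."""
    from mcp.server.fastmcp import FastMCP

    server = FastMCP("ocrmac")
    server.tool()(ocr_image)
    return server


# To serve over stdio, a main.py could end with: build_server().run()
```

Keeping the platform and file checks ahead of the ocrmac import means that callers on other systems get a structured error rather than an import failure.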

Testing with MCP Inspector

You can use the MCP Inspector to connect to the running MCP server and test the tool.
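
For scripted checks, the tool can also be called from a small Python client. The sketch below assumes the stdio client classes from the mcp package (ClientSession, StdioServerParameters, stdio_client); server_launch_spec and call_ocr are hypothetical helper names, and the paths in the final comment are placeholders.

```python
import asyncio


def server_launch_spec(server_dir: str) -> dict:
    """Command and arguments that start the server with uv."""
    return {"command": "uv", "args": ["--directory", server_dir, "run", "main.py"]}


async def call_ocr(server_dir: str, image_path: str):
    """Start the server over stdio and call ocr_image once."""
    # Lazy import: requires the mcp package to be installed.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    params = StdioServerParameters(**server_launch_spec(server_dir))
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool("ocr_image", {"file_path": image_path})
            # The content items hold the tool's response payload.
            return result.content


# Example (placeholder paths):
# asyncio.run(call_ocr("/path/to/macos-ocr-mcp", "/path/to/image.png"))
```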

Cursor MCP Configuration

To configure this MCP server in Cursor, you can add the following to your MCP JSON configuration file (e.g., ~/.cursor/mcp.json or project-specific .cursor/mcp.json):

{
  "mcpServers": {
    "ocrmac": {
      "command": "uv",
      "args": [
        "--directory",
        "/path/to/macos-ocr-mcp",
        "run",
        "main.py"
      ]
    }
  }
}

This configuration tells Cursor how to start your MCP server. You can then call the ocrmac.ocr_image tool from within Cursor.
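
If the configuration file already lists other servers, the entry can be merged in rather than pasted by hand. This is a small sketch under that assumption: add_ocr_server and update_config_file are hypothetical helpers, and the server directory is the same placeholder path used in the configuration above.

```python
import json
from pathlib import Path


def add_ocr_server(config: dict, server_dir: str, name: str = "ocrmac") -> dict:
    """Return a copy of config with the OCR server entry added or replaced."""
    # Copy the existing server map so the caller's dict is not mutated.
    servers = dict(config.get("mcpServers", {}))
    servers[name] = {
        "command": "uv",
        "args": ["--directory", server_dir, "run", "main.py"],
    }
    return {**config, "mcpServers": servers}


def update_config_file(path: Path, server_dir: str) -> None:
    """Read the JSON config at path (if any), add the entry, write it back."""
    config = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(add_ocr_server(config, server_dir), indent=2) + "\n")


# Example (placeholder path):
# update_config_file(Path.home() / ".cursor" / "mcp.json", "/path/to/macos-ocr-mcp")
```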
