
MCP Screenshot

@kazuph, 9 months ago
13 MIT
Free, Community
AI Systems
Repository for MCP screenshot functionality

Overview

What is mcp-screenshot

mcp-screenshot is an MCP server that captures screenshots and runs OCR (optical character recognition) on them to extract text.

Use cases

Typical uses are capturing screenshots for verification in automated tests, extracting text from user interfaces for documentation, and reading text from images for data entry.

How to use

To use mcp-screenshot, run it with the command npx -y @kazuph/mcp-screenshot and add the server to your claude_desktop_config.json. You can then ask Claude to take a screenshot and recognize its text, specifying the screen region and the output format.

Key features

mcp-screenshot captures the left half, the right half, or the full screen, recognizes Japanese and English text via OCR, and returns results in one of four formats: JSON, Markdown, vertical, or horizontal.

Where to use

mcp-screenshot suits software development, quality assurance, and other work that requires extracting text from screen images, particularly where Japanese text is common.

Content

MCP Screenshot

An MCP server that captures screenshots and performs OCR text recognition.


Features

  • Screenshot capture (left half, right half, full screen)
  • OCR text recognition (supports Japanese and English)
  • Multiple output formats (JSON, Markdown, vertical, horizontal)
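
The three capture regions can be pictured as rectangles on the screen. The following is a minimal sketch, not the server's actual code: the `Region` and `Rect` types and the `regionRect` function are hypothetical names, and it assumes that "left" and "right" split the screen at the horizontal midpoint.

```typescript
// Hypothetical types: the source names the regions but not how they are computed.
type Region = "left" | "right" | "full";

interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Map a region name to a pixel rectangle on a screen of the given size.
// Assumption: "left" and "right" split the screen at the horizontal midpoint.
function regionRect(region: Region, screenWidth: number, screenHeight: number): Rect {
  const half = Math.floor(screenWidth / 2);
  switch (region) {
    case "left":
      return { x: 0, y: 0, width: half, height: screenHeight };
    case "right":
      // The right half absorbs the extra column when the width is odd.
      return { x: half, y: 0, width: screenWidth - half, height: screenHeight };
    case "full":
      return { x: 0, y: 0, width: screenWidth, height: screenHeight };
  }
}

console.log(regionRect("right", 1920, 1080));
```

Giving the odd column to the right half is an arbitrary choice made here so that the two halves always cover the full width without overlap.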

OCR Engines

This server uses two OCR engines:

  1. yomitoku

    • Primary OCR engine
    • High-accuracy Japanese text recognition
    • Runs as an API server
  2. Tesseract.js

    • Fallback OCR engine
    • Used when yomitoku is unavailable
    • Supports both Japanese and English recognition
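
The primary/fallback arrangement above can be sketched as a small wrapper. This is an illustration under stated assumptions, not the server's implementation: `OcrEngine`, `OcrResult`, and `recognizeWithFallback` are hypothetical names, both engines are passed in as plain async functions, and the real yomitoku and Tesseract.js APIs are not called. It also assumes that "unavailable" shows up as a rejected promise.

```typescript
// An OCR engine is modeled as an async function from image bytes to text.
// Both engines are passed in, so this sketch does not depend on the real
// yomitoku or Tesseract.js APIs.
type OcrEngine = (image: Uint8Array) => Promise<string>;

interface OcrResult {
  engine: "primary" | "fallback";
  text: string;
}

async function recognizeWithFallback(
  image: Uint8Array,
  primary: OcrEngine,
  fallback: OcrEngine,
): Promise<OcrResult> {
  try {
    return { engine: "primary", text: await primary(image) };
  } catch {
    // Primary engine unreachable or failed: use the fallback instead.
    return { engine: "fallback", text: await fallback(image) };
  }
}

// Usage with stand-in engines (hypothetical; no real OCR is performed).
const unavailable: OcrEngine = async () => {
  throw new Error("connection refused");
};
const stub: OcrEngine = async () => "hello";

recognizeWithFallback(new Uint8Array(), unavailable, stub).then((result) => {
  console.log(result.engine, result.text);
});
```

Recording which engine produced the text is a design choice of this sketch; it lets a caller tell when the lower-priority engine was used.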

Installation

npx -y @kazuph/mcp-screenshot

Claude Desktop Configuration

Add the following configuration to your claude_desktop_config.json:
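
A minimal sketch of such an entry, assuming the `mcpServers` layout that Claude Desktop uses for MCP servers. The server key `screenshot` and the helper `buildConfig` are illustrative names, not taken from the source. The TypeScript below builds the entry and prints it as JSON that could be merged into the file.

```typescript
// Assumed layout: Claude Desktop reads MCP servers from a top-level
// "mcpServers" object keyed by an arbitrary server name.
interface McpServerEntry {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

// Build the config fragment; the env block is added only when a URL is given.
function buildConfig(ocrApiUrl?: string) {
  const entry: McpServerEntry = {
    command: "npx",
    args: ["-y", "@kazuph/mcp-screenshot"],
  };
  if (ocrApiUrl !== undefined) {
    entry.env = { OCR_API_URL: ocrApiUrl };
  }
  // "screenshot" is an illustrative key, not taken from the source.
  return { mcpServers: { screenshot: entry } };
}

console.log(JSON.stringify(buildConfig("http://localhost:8000"), null, 2));
```

Calling `buildConfig()` with no argument omits the `env` block, which leaves the server on its default OCR_API_URL.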

Environment Variables

Variable Name    Description              Default Value
OCR_API_URL      yomitoku API base URL    http://localhost:8000
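
To illustrate how a default like this is typically applied, here is a minimal sketch. `resolveOcrApiUrl` is a hypothetical helper, and treating an empty value as unset is an assumption of this sketch; the source does not say how the server handles that case.

```typescript
const DEFAULT_OCR_API_URL = "http://localhost:8000";

// Takes the environment as a parameter so it can be exercised without
// touching the real process environment.
function resolveOcrApiUrl(env: Record<string, string | undefined>): string {
  const raw = env["OCR_API_URL"];
  // Assumption: an empty or whitespace-only value counts as unset.
  if (raw === undefined || raw.trim() === "") {
    return DEFAULT_OCR_API_URL;
  }
  return raw.trim();
}

console.log(resolveOcrApiUrl({}));
console.log(resolveOcrApiUrl({ OCR_API_URL: "http://ocr.example.internal:9000" }));
```

In Node.js the function would be called as `resolveOcrApiUrl(process.env)`.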

Usage Example

You can use it by instructing Claude like this:

Please take a screenshot of the left half of the screen and recognize the text in it.

Tool Specification

capture

Takes a screenshot and performs OCR.

Options:

  • region: Screenshot area ('left'/'right'/'full', default: 'left')
  • format: Output format ('json'/'markdown'/'vertical'/'horizontal', default: 'markdown')
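
The options and defaults above can be sketched as client-side types plus a request builder. This is a hedged illustration: `normalizeCaptureArgs` and `buildCaptureRequest` are hypothetical helpers, and the request shape assumes the JSON-RPC 2.0 `tools/call` method defined by the Model Context Protocol. How the server itself validates arguments is not stated in the source.

```typescript
const REGIONS = ["left", "right", "full"] as const;
const FORMATS = ["json", "markdown", "vertical", "horizontal"] as const;

type CaptureRegion = (typeof REGIONS)[number];
type CaptureFormat = (typeof FORMATS)[number];

interface CaptureArgs {
  region: CaptureRegion;
  format: CaptureFormat;
}

// Apply the documented defaults and reject values outside the documented sets.
function normalizeCaptureArgs(input: { region?: string; format?: string } = {}): CaptureArgs {
  const region = input.region ?? "left";
  const format = input.format ?? "markdown";
  if (!(REGIONS as readonly string[]).includes(region)) {
    throw new Error(`unknown region: ${region}`);
  }
  if (!(FORMATS as readonly string[]).includes(format)) {
    throw new Error(`unknown format: ${format}`);
  }
  return { region: region as CaptureRegion, format: format as CaptureFormat };
}

// Build a JSON-RPC 2.0 `tools/call` request for the capture tool.
function buildCaptureRequest(id: number, input?: { region?: string; format?: string }) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: { name: "capture", arguments: normalizeCaptureArgs(input) },
  };
}

console.log(JSON.stringify(buildCaptureRequest(1, { region: "full" })));
```

In normal use Claude issues this call itself; building the request by hand is mainly useful when testing the server from a custom MCP client.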

License

MIT

Author

kazuph
