
Unstructured MCP

@MKhalusova, 10 months ago
6 MIT
Free, Community
AI Systems
A Model Context Protocol server for unstructured document processing.

Overview

What is Unstructured MCP

unstructured-mcp is a Model Context Protocol server designed for processing unstructured documents. It allows Large Language Models (LLMs) to extract and utilize content from various unstructured document formats.

Use cases

Use cases for unstructured-mcp include automating data extraction from reports, processing academic papers for research, converting scanned documents into editable formats, and enhancing customer support by analyzing unstructured feedback.

How to use

To use unstructured-mcp, clone the repository and set up the uv environment, then create a .env file containing your Unstructured API key (UNSTRUCTURED_API_KEY) and run the MCP server with 'uv run doc_processor.py'. Finally, add the server to the Claude Desktop configuration and restart Claude Desktop to access the MCP.
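As an illustration of the .env step, here is a minimal sketch of how a KEY=VALUE file holding UNSTRUCTURED_API_KEY could be read using only the standard library. The helper names parse_env and load_api_key are hypothetical; how doc_processor.py actually loads the key is not shown here and may differ.

```python
import os
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional single or double quotes.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_api_key(env_path: str = ".env") -> str:
    """Return UNSTRUCTURED_API_KEY from the environment, else from the .env file.

    Hypothetical helper: not part of the unstructured-mcp code base.
    """
    key = os.environ.get("UNSTRUCTURED_API_KEY")
    if not key:
        path = Path(env_path)
        text = path.read_text(encoding="utf-8") if path.exists() else ""
        key = parse_env(text).get("UNSTRUCTURED_API_KEY")
    if not key:
        raise RuntimeError("UNSTRUCTURED_API_KEY is not set; add it to .env")
    return key
```

The .env file itself would then contain a single line of the form UNSTRUCTURED_API_KEY=your-key.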

Key features

Key features of unstructured-mcp include support for a wide range of file types (such as .pdf, .docx, and .csv), the ability to extract content from unstructured documents, and integration with Claude Desktop for document processing.

Where to use

unstructured-mcp can be used in various fields including data analysis, document management, academic research, and any area that requires processing and extracting information from unstructured documents.

Content

A Model Context Protocol server that provides unstructured document processing capabilities.
This server enables LLMs to extract and use content from an unstructured document.

This repo is a work in progress; proceed with caution :)

Supported file types:

{".abw", ".bmp", ".csv", ".cwk", ".dbf", ".dif", ".doc", ".docm", ".docx", ".dot",
 ".dotm", ".eml", ".epub", ".et", ".eth", ".fods", ".gif", ".heic", ".htm", ".html",
 ".hwp", ".jpeg", ".jpg", ".md", ".mcw", ".mw", ".odt", ".org", ".p7s", ".pages",
 ".pbd", ".pdf", ".png", ".pot", ".potm", ".ppt", ".pptm", ".pptx", ".prn", ".rst",
 ".rtf", ".sdp", ".sgl", ".svg", ".sxg", ".tiff", ".txt", ".tsv", ".uof", ".uos1",
 ".uos2", ".web", ".webp", ".wk2", ".xls", ".xlsb", ".xlsm", ".xlsx", ".xlw", ".xml",
 ".zabw"}
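A client could check a file against this list before sending it to the server. The following is a minimal sketch: the set is copied from the list above, while the constant name SUPPORTED_EXTENSIONS and the helper is_supported are illustrative and not taken from the repository.

```python
from pathlib import Path

# Extensions copied from the supported file types listed above.
SUPPORTED_EXTENSIONS = {
    ".abw", ".bmp", ".csv", ".cwk", ".dbf", ".dif", ".doc", ".docm", ".docx", ".dot",
    ".dotm", ".eml", ".epub", ".et", ".eth", ".fods", ".gif", ".heic", ".htm", ".html",
    ".hwp", ".jpeg", ".jpg", ".md", ".mcw", ".mw", ".odt", ".org", ".p7s", ".pages",
    ".pbd", ".pdf", ".png", ".pot", ".potm", ".ppt", ".pptm", ".pptx", ".prn", ".rst",
    ".rtf", ".sdp", ".sgl", ".svg", ".sxg", ".tiff", ".txt", ".tsv", ".uof", ".uos1",
    ".uos2", ".web", ".webp", ".wk2", ".xls", ".xlsb", ".xlsm", ".xlsx", ".xlw", ".xml",
    ".zabw",
}


def is_supported(filename: str) -> bool:
    """Return True if the file's final extension (case-insensitive) is in the list."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

Only the final suffix is compared, so a name like archive.tar.gz is judged by ".gz" and a file without an extension is rejected.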

Prerequisites:
You’ll need uv, an Unstructured API key, and Claude Desktop, all of which the steps below rely on.

Quick TL;DR on how to add this MCP to Claude Desktop:

  1. Clone the repo and set up the UV environment.
  2. Create a .env file in the root directory and add the following env variable: UNSTRUCTURED_API_KEY.
  3. Run the MCP server: uv run doc_processor.py
  4. Go to ~/Library/Application Support/Claude/ and create a claude_desktop_config.json. In that file add:
{
    "mcpServers": {
        "unstructured_doc_processor": {
            "command": "PATH/TO/YOUR/UV",
            "args": [
                "--directory",
                "ABSOLUTE/PATH/TO/YOUR/unstructured-mcp/",
                "run",
                "doc_processor.py"
            ],
            "disabled": false
        }
    }
}
  5. Restart Claude Desktop. You should now be able to use the MCP.
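The configuration from step 4 can also be generated rather than typed by hand. The sketch below builds the same JSON structure and fills in the two placeholder paths. The function names build_config and render_config are hypothetical, and looking up uv with shutil.which assumes uv is on your PATH.

```python
import json
import shutil
from pathlib import Path
from typing import Optional


def build_config(uv_path: str, repo_dir: str) -> dict:
    """Build the mcpServers entry shown in step 4 for the given paths."""
    return {
        "mcpServers": {
            "unstructured_doc_processor": {
                "command": uv_path,
                "args": ["--directory", repo_dir, "run", "doc_processor.py"],
                "disabled": False,
            }
        }
    }


def render_config(repo_dir: str, uv_path: Optional[str] = None) -> str:
    """Return the config as JSON text, locating uv on PATH if no path is given."""
    uv = uv_path or shutil.which("uv")
    if uv is None:
        raise FileNotFoundError("uv not found on PATH; pass uv_path explicitly")
    # Claude Desktop needs an absolute path to the cloned repository.
    absolute_repo = str(Path(repo_dir).resolve())
    return json.dumps(build_config(uv, absolute_repo), indent=4)
```

Calling print(render_config("unstructured-mcp")) from the parent directory of the clone prints JSON that can be pasted into claude_desktop_config.json. The sketch deliberately does not write the file, so an existing configuration with other servers is not overwritten.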
