Mcp Docs Rag

@kazuph, a year ago

5 MIT

FreeCommunity

AI Systems

An MCP server for RAG that lets LLMs query documents stored in a local directory.

What is Mcp Docs Rag

mcp-docs-rag is a TypeScript-based MCP server that implements a Retrieval-Augmented Generation (RAG) system for querying documents stored in a local directory. It enables users to interact with documents using large language models (LLMs) by providing contextual information from local repositories and text files.

Use cases

Use cases for mcp-docs-rag include academic research where users need to query literature, software development environments where documentation needs to be accessed and queried, and any scenario where users require contextual information from local documents to enhance their productivity.

How to use

To use mcp-docs-rag, set up a local directory for storing documents, which defaults to ~/docs. You can then utilize various tools such as ‘list_documents’ to view available documents, ‘rag_query’ to query documents with context, ‘add_git_repository’ to clone Git repositories, and ‘add_text_file’ to download text files.

Key features

Key features of mcp-docs-rag include the ability to list and access documents via ‘docs://’ URIs, support for both Git repositories and text files, and tools for querying documents and managing document repositories. It also provides a guide for document usage and RAG functionality.

Where to use

mcp-docs-rag can be used in various fields that require document retrieval and contextual querying, such as research, education, content management, and software development, where accessing and querying documentation efficiently is essential.

Clients Supporting MCP

The following are the main clients that support the Model Context Protocol. See each official website for more information.

Claude Desktop: Official desktop application from Anthropic, natively supports MCP protocol. claude.ai

Cherry Studio: Cross-platform desktop client supporting multiple LLM providers, built-in MCP server support. cherry-ai.com

LobeChat: Modern open-source ChatGPT/LLMs UI, supports MCP protocol integration. lobehub.com

DeepChat: Cross-platform desktop AI assistant, compatible with MCP protocol, focusing on privacy and efficiency. deepchat.thinkinai.xyz

5ire: Cross-platform open-source desktop intelligent assistant MCP client, supports local knowledge base and MCP server. 5ire.app

mcp-docs-rag MCP Server

RAG (Retrieval-Augmented Generation) for documents in a local directory

This is a TypeScript-based MCP server that implements a RAG system for documents stored in a local directory. It allows users to query documents using LLMs with context from locally stored repositories and text files.

Features

Resources

- List and access documents via docs:// URIs
- Documents can be Git repositories or text files
- Plain text MIME type for content access
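
As a rough illustration of how a client might map between document names and docs:// URIs, here is a minimal sketch. The README only names the docs:// scheme, so the exact layout `docs://<document_id>` and the helper names `toDocsUri` and `fromDocsUri` are assumptions, not the server's confirmed behavior.

```typescript
// Assumption: a document is addressed as docs://<document_id>.
// The README only names the scheme, so this layout is hypothetical.
const DOCS_SCHEME = "docs://";

// Build a docs:// URI from a document id, escaping unsafe characters.
function toDocsUri(documentId: string): string {
  return DOCS_SCHEME + encodeURIComponent(documentId);
}

// Extract the document id from a docs:// URI, rejecting other schemes.
function fromDocsUri(uri: string): string {
  if (!uri.startsWith(DOCS_SCHEME)) {
    throw new Error(`Not a docs:// URI: ${uri}`);
  }
  return decodeURIComponent(uri.slice(DOCS_SCHEME.length));
}

console.log(toDocsUri("framework"));
console.log(fromDocsUri("docs://demo-app"));
```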

Tools

- list_documents - List all available documents in the DOCS_PATH directory
  - Returns a formatted list of all documents
  - Shows total number of available documents
- rag_query - Query documents using RAG
  - Takes document_id and query as parameters
  - Returns AI-generated responses with context from documents
- add_git_repository - Clone a Git repository to the docs directory with optional sparse checkout
  - Takes repository_url as parameter
  - Optional document_name parameter to customize the name of the document (use simple descriptive names without ‘-docs’ suffix)
  - Optional subdirectory parameter for sparse checkout of specific directories
  - Automatically pulls latest changes if repository already exists
- add_text_file - Download a text file to the docs directory
  - Takes file_url as parameter
  - Uses wget to download file
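
MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The sketch below types the arguments of the four tools using the parameter names listed above; treating every value as a string is an assumption, and `buildToolCall` is a hypothetical helper rather than part of this server or the MCP SDK.

```typescript
// Argument shapes follow the parameter names in the tool list.
// Assumption: all values are strings; the README does not state types.
type ToolArgs = {
  list_documents: Record<string, never>;
  rag_query: { document_id: string; query: string };
  add_git_repository: {
    repository_url: string;
    document_name?: string;
    subdirectory?: string;
  };
  add_text_file: { file_url: string };
};

interface ToolCallRequest<N extends keyof ToolArgs> {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: N; arguments: ToolArgs[N] };
}

// Hypothetical helper: wraps a tool name and its arguments in a
// JSON-RPC 2.0 request envelope.
function buildToolCall<N extends keyof ToolArgs>(
  id: number,
  name: N,
  args: ToolArgs[N],
): ToolCallRequest<N> {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

const request = buildToolCall(1, "rag_query", {
  document_id: "framework",
  query: "How do I configure routing?",
});
console.log(JSON.stringify(request, null, 2));
```

Keying the argument types by tool name lets the compiler reject, for example, a `rag_query` call that omits `document_id`.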

Prompts

- guide_documents_usage - Guide on how to use documents and RAG functionality
  - Includes list of available documents
  - Provides usage hints for RAG functionality

Development

Install dependencies:

npm install

Build the server:

npm run build

For development with auto-rebuild:

npm run watch

Setup

This server requires a local directory for storing documents. By default, it uses ~/docs, but you can configure a different location with the DOCS_PATH environment variable.
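
The lookup described above could be sketched as follows. `resolveDocsPath` is a hypothetical helper, and expanding a leading `~` is an assumption about desirable behavior, not something the README confirms the server does; values set in a JSON config are typically passed literally, without shell expansion.

```typescript
// Hypothetical helper: env and homeDir are passed in so the logic is easy
// to test. In a Node.js process you might call
// resolveDocsPath(process.env, os.homedir()).
function resolveDocsPath(
  env: Record<string, string | undefined>,
  homeDir: string,
): string {
  const configured = env["DOCS_PATH"]?.trim();
  if (!configured) {
    return `${homeDir}/docs`; // documented default: ~/docs
  }
  // Assumption: expand a leading "~" ourselves, since no shell does it here.
  if (configured === "~") return homeDir;
  if (configured.startsWith("~/")) return homeDir + configured.slice(1);
  return configured;
}

console.log(resolveDocsPath({}, "/Users/username"));
console.log(resolveDocsPath({ DOCS_PATH: "~/notes" }, "/Users/username"));
```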

Document Structure

The documents directory can contain:

- Git repositories (cloned directories)
- Plain text files (with .txt extension)

Each document is indexed separately using llama-index.ts with Google’s Gemini embeddings.
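
To make the two document kinds concrete, here is a small sketch that classifies the entries of a documents directory. The types and the `classifyEntries` function are hypothetical, and the choices to skip hidden entries and to use the entry name as the document id are assumptions. The indexing step itself is not shown.

```typescript
type DocumentKind = "git-repository" | "text-file";

// Minimal stand-in for a directory entry, e.g. as derived from
// fs.readdirSync(path, { withFileTypes: true }) in Node.js.
interface DirEntry {
  name: string;
  isDirectory: boolean;
}

interface DocumentEntry {
  id: string;
  kind: DocumentKind;
}

// Keep directories (treated as cloned repositories) and .txt files;
// ignore everything else.
function classifyEntries(entries: DirEntry[]): DocumentEntry[] {
  const documents: DocumentEntry[] = [];
  for (const entry of entries) {
    if (entry.name.startsWith(".")) continue; // assumption: skip hidden entries
    if (entry.isDirectory) {
      documents.push({ id: entry.name, kind: "git-repository" });
    } else if (entry.name.toLowerCase().endsWith(".txt")) {
      documents.push({ id: entry.name, kind: "text-file" });
    }
  }
  return documents;
}

console.log(
  classifyEntries([
    { name: "framework", isDirectory: true },
    { name: "document.txt", isDirectory: false },
    { name: "image.png", isDirectory: false },
  ]),
);
```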

API Keys

This server uses Google’s Gemini API for document indexing and querying. You need to set your Gemini API key as an environment variable:

export GEMINI_API_KEY=your-api-key-here

You can obtain a Gemini API key from the Google AI Studio website. Add this key to your shell profile or include it in the environment configuration for Claude Desktop.
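
A startup check along these lines can catch a missing key early. `requireGeminiApiKey` is a hypothetical helper, not part of the server; rejecting the placeholder value is an extra assumption.

```typescript
// Hypothetical helper: returns the key or throws with a clear message.
// In a Node.js process you might call requireGeminiApiKey(process.env).
function requireGeminiApiKey(env: Record<string, string | undefined>): string {
  const key = env["GEMINI_API_KEY"]?.trim();
  // Assumption: treat the README's placeholder as "not configured".
  if (!key || key === "your-api-key-here") {
    throw new Error("GEMINI_API_KEY is not set; export it or add it to the server env config.");
  }
  return key;
}

try {
  requireGeminiApiKey({});
} catch (error) {
  console.error((error as Error).message);
}
```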

Installation

To use with Claude Desktop, add the server config:

- On macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
- On Windows: %APPDATA%/Claude/claude_desktop_config.json
- On Linux: ~/.config/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "docs-rag": {
      "command": "npx",
      "args": [
        "-y",
        "@kazuph/mcp-docs-rag"
      ],
      "env": {
        "DOCS_PATH": "/Users/username/docs",
        "GEMINI_API_KEY": "your-api-key-here"
      }
    }
  }
}

Make sure to replace /Users/username/docs with the actual path to your documents directory.
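
If the config file already lists other servers, the new entry should be merged in rather than overwriting them. The following sketch shows one way to do that on the parsed JSON; the interfaces and `withDocsRag` are hypothetical names, and reading and writing the file is left out.

```typescript
interface McpServerEntry {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

interface ClaudeDesktopConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown; // preserve any other settings untouched
}

// Return a copy of the config with the docs-rag entry added or replaced,
// keeping every other server and top-level setting as it was.
function withDocsRag(
  config: ClaudeDesktopConfig,
  docsPath: string,
  apiKey: string,
): ClaudeDesktopConfig {
  return {
    ...config,
    mcpServers: {
      ...config.mcpServers,
      "docs-rag": {
        command: "npx",
        args: ["-y", "@kazuph/mcp-docs-rag"],
        env: { DOCS_PATH: docsPath, GEMINI_API_KEY: apiKey },
      },
    },
  };
}

console.log(JSON.stringify(withDocsRag({}, "/Users/username/docs", "your-api-key-here"), null, 2));
```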

Debugging

Since MCP servers communicate over stdio, debugging can be challenging. We recommend using the MCP Inspector, which is available as a package script:

npm run inspector

The Inspector will provide a URL to access debugging tools in your browser.

Usage

Once configured, you can use the server with Claude to:

Add documents:

Add a new document from GitHub: https://github.com/username/repository

or with a custom document name:

Add GitHub repository https://github.com/username/repository-name and name it 'framework'

or with sparse checkout of a specific directory:

Add only the 'src/components' directory from https://github.com/username/repository

or combine custom name and sparse checkout:

Add the 'examples/demo' directory from https://github.com/username/large-repo and name it 'demo-app'

or add a text file:

Add this text file: https://example.com/document.txt
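
For the sparse checkout requests above, a server could plan Git commands roughly as follows. This is a sketch under stated assumptions: the README does not show which Git commands the server actually runs, and `planGitCommands` is a hypothetical function. The Git flags used (`--filter=blob:none`, `--sparse`, `sparse-checkout set`) are standard Git options.

```typescript
interface GitPlanOptions {
  subdirectory?: string; // directory to include in a sparse checkout
  alreadyCloned?: boolean; // true if targetDir already holds the repository
}

// Return the git commands (as argument vectors) needed to add or refresh
// a repository. Nothing is executed here; a caller could pass each vector
// to a process runner such as child_process.execFileSync.
function planGitCommands(
  repositoryUrl: string,
  targetDir: string,
  options: GitPlanOptions = {},
): string[][] {
  if (options.alreadyCloned) {
    return [["git", "-C", targetDir, "pull"]];
  }
  if (!options.subdirectory) {
    return [["git", "clone", repositoryUrl, targetDir]];
  }
  return [
    // Partial, sparse clone: fetch blobs lazily and check out only top-level files.
    ["git", "clone", "--filter=blob:none", "--sparse", repositoryUrl, targetDir],
    // Then widen the checkout to the requested directory.
    ["git", "-C", targetDir, "sparse-checkout", "set", options.subdirectory],
  ];
}

for (const command of planGitCommands(
  "https://github.com/username/large-repo",
  "/Users/username/docs/demo-app",
  { subdirectory: "examples/demo" },
)) {
  console.log(command.join(" "));
}
```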

Query documents:

What does the documentation say about X in the Y repository?

List available documents:
```
What documents do you have access to?
```

The server will automatically handle indexing of documents for efficient retrieval.
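
To illustrate the retrieval step in general terms, the toy sketch below ranks text chunks against a query by cosine similarity and builds a prompt from the best matches. It is not the server's implementation: the server uses Gemini embeddings through llama-index.ts, while this example substitutes simple word counts so it runs without an API key. All function names are hypothetical.

```typescript
// Stand-in "embedding": a bag-of-words count vector.
function embed(text: string): Map<string, number> {
  const counts = new Map<string, number>();
  for (const token of text.toLowerCase().match(/[a-z0-9]+/g) ?? []) {
    counts.set(token, (counts.get(token) ?? 0) + 1);
  }
  return counts;
}

// Cosine similarity between two sparse vectors; 0 if either is empty.
function cosine(a: Map<string, number>, b: Map<string, number>): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (const [token, value] of a) {
    dot += value * (b.get(token) ?? 0);
    normA += value * value;
  }
  for (const value of b.values()) {
    normB += value * value;
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Return the topK chunks most similar to the query.
function retrieve(query: string, chunks: string[], topK: number): string[] {
  const queryVector = embed(query);
  return chunks
    .map((chunk) => ({ chunk, score: cosine(queryVector, embed(chunk)) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((result) => result.chunk);
}

// Combine retrieved context and the question into a single LLM prompt.
function buildPrompt(query: string, context: string[]): string {
  return `Context:\n${context.join("\n")}\n\nQuestion: ${query}`;
}

const chunks = [
  "Routing is configured in routes.ts",
  "Install dependencies with npm install",
  "The project is released under the MIT license",
];
const question = "How do I configure routing?";
console.log(buildPrompt(question, retrieve(question, chunks, 1)));
```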

Dev Tools Supporting MCP

The following are the main code editors that support the Model Context Protocol. Click the link to visit the official website for more information.

Zed: High-performance collaborative code editor, supports MCP protocol, providing a smooth programming experience. zed.dev

Cursor: AI code editor built on VS Code, supports MCP protocol for context-aware programming. cursor.com

Windsurf: AI code editor from Codeium, integrates MCP protocol to provide intelligent code assistance. windsurf.com

Continue: Open-source AI programming assistant plugin, supports VS Code and JetBrains, compatible with MCP protocol. continue.dev

Trae: AI-driven code editor, supports MCP protocol, focusing on enhancing developer programming experience. trae.ai
