mcp-rag-server
Overview
What is mcp-rag-server?
mcp-rag-server is a Model Context Protocol server that enables Retrieval Augmented Generation (RAG) by indexing documents and providing relevant context to Large Language Models through the MCP protocol.
Use cases
This tool is particularly useful for applications requiring efficient document retrieval based on user queries, enabling better responses and insights from Large Language Models by utilizing stored context from indexed documents.
How to use
To use mcp-rag-server, install it globally via npm or clone the repository and build it from source. Set environment variables for the base LLM API, embedding model, vector store path, and chunk size. Run the server with `mcp-rag-server` or `npx mcp-rag-server`. Index documents using the provided MCP tools and query them as needed.
Key features
Key features include the ability to index multiple document formats, customizable chunk sizes, a local SQLite vector store, support for various embedding providers, and exposed MCP tools for seamless integration with clients.
Where to use
mcp-rag-server can be utilized in environments where Large Language Models are deployed, such as chatbots, virtual assistants, and knowledge management systems, enabling these systems to provide contextual responses based on indexed document content.
Content
mcp-rag-server
A Model Context Protocol (MCP) server that enables Retrieval Augmented Generation (RAG). It indexes your documents and serves relevant context to Large Language Models via the MCP protocol.
Integration Examples
Generic MCP Client Configuration
```json
{
  "mcpServers": {
    "rag": {
      "command": "npx",
      "args": ["-y", "mcp-rag-server"],
      "env": {
        "BASE_LLM_API": "http://localhost:11434/v1",
        "EMBEDDING_MODEL": "nomic-embed-text",
        "VECTOR_STORE_PATH": "./vector_store",
        "CHUNK_SIZE": "500"
      }
    }
  }
}
```
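For clients driven from code rather than a config file, the same server can be launched over stdio with the MCP TypeScript SDK. A minimal sketch, assuming `@modelcontextprotocol/sdk` is installed and the same environment values as above:

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Spawn mcp-rag-server as a child process, mirroring the JSON config above.
const transport = new StdioClientTransport({
  command: "npx",
  args: ["-y", "mcp-rag-server"],
  env: {
    BASE_LLM_API: "http://localhost:11434/v1",
    EMBEDDING_MODEL: "nomic-embed-text",
    VECTOR_STORE_PATH: "./vector_store",
    CHUNK_SIZE: "500",
  },
});

const client = new Client({ name: "rag-demo", version: "0.1.0" }, { capabilities: {} });
await client.connect(transport);

// List the tools the server advertises (expect embedding_documents, query_documents, ...).
const { tools } = await client.listTools();
console.log(tools.map((t) => t.name));
```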
Example Interaction
```text
# Index documents
>> tool:embedding_documents {"path":"./docs"}

# Check status
>> resource:embedding-status
<< rag://embedding/status
Current Path: ./docs/file1.md
Completed: 10
Failed: 0
Total chunks: 15
Failed Reason:
```
Table of Contents
- Integration Examples
- Features
- Installation
- Quick Start
- Configuration
- Usage
- How RAG Works
- Development
- Contributing
- License
Features
- Index documents in `.txt`, `.md`, `.json`, `.jsonl`, and `.csv` formats
- Customizable chunk size for splitting text
- Local vector store powered by SQLite (via LangChain’s LibSQLVectorStore)
- Supports multiple embedding providers (OpenAI, Ollama, Granite, Nomic)
- Exposes MCP tools and resources over stdio for seamless integration with MCP clients
Installation
From npm
```bash
npm install -g mcp-rag-server
```
From Source
```bash
git clone https://github.com/kwanLeeFrmVi/mcp-rag-server.git
cd mcp-rag-server
npm install
npm run build
npm start
```
Quick Start
```bash
export BASE_LLM_API=http://localhost:11434/v1
export EMBEDDING_MODEL=granite-embedding-278m-multilingual-Q6_K-1743674737397:latest
export VECTOR_STORE_PATH=./vector_store
export CHUNK_SIZE=500

# Run (global install)
mcp-rag-server

# Or via npx
npx mcp-rag-server
```
💡 Tip: We recommend using Ollama for embedding. Install Ollama and pull the `nomic-embed-text` model:
```bash
ollama pull nomic-embed-text
export EMBEDDING_MODEL=nomic-embed-text
```
Configuration
| Variable | Description | Default |
|---|---|---|
| `BASE_LLM_API` | Base URL for embedding API | `http://localhost:11434/v1` |
| `LLM_API_KEY` | API key for your LLM provider | (empty) |
| `EMBEDDING_MODEL` | Embedding model identifier | `nomic-embed-text` |
| `VECTOR_STORE_PATH` | Directory for local vector store | `./vector_store` |
| `CHUNK_SIZE` | Characters per text chunk (number) | `500` |
💡 Recommendation: Use Ollama embedding models like `nomic-embed-text` for best performance.
Usage
MCP Tools
Once running, the server exposes these tools via MCP:
- `embedding_documents(path: string)`: Index documents under the given path
- `query_documents(query: string, k?: number)`: Retrieve top `k` chunks (default 15)
- `remove_document(path: string)`: Remove a specific document
- `remove_all_documents(confirm: boolean)`: Clear the entire index (`confirm=true`)
- `list_documents()`: List all indexed document paths
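As a rough sketch of invoking these tools from the SDK client connected in the integration example above (the path and query text are placeholders):

```typescript
// `client` is the connected Client from the integration sketch above.

// Index a folder of documents.
await client.callTool({
  name: "embedding_documents",
  arguments: { path: "./docs" },
});

// Retrieve the top 5 chunks for a query.
const result = await client.callTool({
  name: "query_documents",
  arguments: { query: "How do I change the chunk size?", k: 5 },
});

// Tool results arrive as MCP content blocks.
console.log(result.content);
```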
MCP Resources
Clients can also read resources via URIs:
- `rag://documents`: List all document URIs
- `rag://document/{path}`: Fetch full content of a document
- `rag://query-document/{numberOfChunks}/{query}`: Query documents as a resource
- `rag://embedding/status`: Check current indexing status (completed, failed, total)
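The same client can read these resources directly. A sketch, again assuming the connected `client` from the earlier example:

```typescript
// Check indexing progress via the status resource.
const status = await client.readResource({ uri: "rag://embedding/status" });
console.log(status.contents[0]);

// Query through the resource interface: top 5 chunks for the URI-encoded query.
const hits = await client.readResource({
  uri: `rag://query-document/5/${encodeURIComponent("chunk size")}`,
});
console.log(hits.contents);
```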
How RAG Works
- Indexing: Reads files, splits text into chunks based on `CHUNK_SIZE`, and queues them for embedding (see the chunking sketch below).
- Embedding: Processes each chunk sequentially against the embedding API, storing vectors in SQLite.
- Querying: Embeds the query and retrieves the nearest text chunks from the vector store, returning them to the client.
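A minimal sketch of the splitting step, assuming a plain fixed-width character chunker (the server's actual splitter may overlap chunks or respect word boundaries; this only illustrates the role of `CHUNK_SIZE`):

```typescript
// Naive fixed-width chunking: every chunkSize characters starts a new chunk.
function chunkText(text: string, chunkSize = 500): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += chunkSize) {
    chunks.push(text.slice(i, i + chunkSize));
  }
  return chunks;
}

// Each chunk is then embedded via the configured API and its vector stored in SQLite.
```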
Development
```bash
npm install
npm run build   # Compile TypeScript
npm start       # Run server
npm run watch   # Watch for changes
```
Contributing
Contributions are welcome! Please open issues or pull requests on GitHub.
License
MIT © 2025 Quan Le