LLM Wrapper MCP Server
What is LLM Wrapper MCP Server
The llm-wrapper-mcp-server is a Model Context Protocol (MCP) server wrapper that allows any MCP-capable Large Language Model (LLM) agent to communicate with or delegate tasks to other LLMs available through the OpenRouter.ai API.
Use cases
Use cases include developing chatbots that leverage multiple LLMs for enhanced responses, automating content generation by delegating tasks to different LLMs, and conducting research that requires querying various language models for diverse insights.
How to use
To use the llm-wrapper-mcp-server, developers can integrate it into their applications by setting up the STDIO-based server to handle LLM requests and responses. Configuration involves specifying the API base URL and model parameters for various LLM providers.
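For example, a minimal setup, using the PyPI package name and the flags and defaults documented in the Content section below, looks like this:
pip install llm-wrapper-mcp-server
python -m llm_wrapper_mcp_server --llm-api-base-url https://openrouter.ai/api/v1 --model perplexity/llama-3.1-sonar-small-128k-online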
Key features
Key features include compliance with the MCP specification for standardized interactions, support for tool calls and result processing, extensibility for integrating new LLM backends, and integration with llm-accounting for logging, rate limiting, and audit functionalities.
Where to use
The llm-wrapper-mcp-server can be used in various fields such as software development, AI research, and any application that requires interaction with multiple LLMs for tasks like natural language processing, data analysis, and automated customer support.
Content
LLM Wrapper MCP Server
“Allow any MCP-capable LLM agent to communicate with or delegate tasks to any other LLM available through the OpenRouter.ai API.”
A Model Context Protocol (MCP) server wrapper designed to facilitate seamless interaction with various Large Language Models (LLMs) through a standardized interface. This project enables developers to integrate LLM capabilities into their applications by providing a robust and flexible STDIO-based server that handles LLM calls, tool execution, and result processing.
Features
- Implements the Model Context Protocol (MCP) specification for standardized LLM interactions.
- Provides an STDIO-based server for handling LLM requests and responses via standard input/output.
- Supports advanced features like tool calls and results through the MCP protocol.
- Configurable to use various LLM providers (e.g., OpenRouter, local models) via API base URL and model parameters.
- Designed for extensibility, allowing easy integration of new LLM backends.
- Integrates with llm-accounting for robust logging, rate limiting, and audit functionalities, enabling monitoring of remote LLM usage, inference costs, and inspection of queries/responses for debugging or legal purposes.
Dependencies
This project relies on the following key dependencies:
Core Dependencies
- pydantic: Data validation and settings management using Python type hints.
- pydantic-settings: Pydantic’s settings management for environment variables and configuration (see the sketch after this list).
- python-dotenv: Reads key-value pairs from a .env file and sets them as environment variables.
- requests: An elegant and simple HTTP library for Python.
- tiktoken: A fast BPE tokeniser for use with OpenAI’s models.
- llm-accounting: For robust logging, rate limiting, and audit functionalities.
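As a rough illustration of how a few of these pieces fit together, the sketch below loads the OPENROUTER_API_KEY and optional overrides from a .env file with pydantic-settings; the WrapperSettings class and its field names are hypothetical and not taken from the package's actual code.
# Hypothetical sketch: loading configuration with pydantic-settings and a .env file.
# The class and field names are illustrative, not llm-wrapper-mcp-server internals.
from pydantic_settings import BaseSettings, SettingsConfigDict

class WrapperSettings(BaseSettings):
    # Reads OPENROUTER_API_KEY and optional overrides from the environment or .env
    model_config = SettingsConfigDict(env_file=".env", extra="ignore")

    openrouter_api_key: str
    llm_api_base_url: str = "https://openrouter.ai/api/v1"
    default_model: str = "perplexity/llama-3.1-sonar-small-128k-online"

settings = WrapperSettings()
print(settings.llm_api_base_url, settings.default_model)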
Development Dependencies
- pytest: A mature full-featured Python testing framework.
- black: An uncompromising Python code formatter.
- isort: A Python utility/library to sort imports alphabetically, and automatically separate them into sections and by type.
- mypy: An optional static type checker for Python.
- pytest-mock: A pytest plugin that provides a mocker fixture for easier mocking.
Installation
The llm-wrapper-mcp-server package is available on PyPI and can be installed via pip:
pip install llm-wrapper-mcp-server
Alternatively, for local development or to install from source:
- Create and activate a virtual environment:
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
- Install the package:
pip install -e .
Configuration
Create a .env file in the project root with the following variable:
OPENROUTER_API_KEY=your_openrouter_api_key_here
The server is configured to use OpenRouter by default. The API key is loaded from the OPENROUTER_API_KEY environment variable. The specific LLM model and API base URL are primarily configured via command-line arguments when running the server (see below).
Default settings if not overridden by CLI arguments:
- API Base URL for LLMClient: https://openrouter.ai/api/v1 (can be overridden by the LLM_API_BASE_URL env var or the --llm-api-base-url CLI arg; see the example below)
- Default Model for LLMClient: perplexity/llama-3.1-sonar-small-128k-online (can be overridden by the --model CLI arg)
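For example, the API base URL can be supplied through the environment instead of the CLI flag when pointing the wrapper at another provider; the URL and model name below are placeholders:
# Placeholder values: substitute your provider's endpoint and an allowed model name.
LLM_API_BASE_URL=http://localhost:8000/v1 python -m llm_wrapper_mcp_server --model your-org/your-model-name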
Usage
Textual Overview:
- Agent Software communicates with the LLM Wrapper MCP Server via the MCP Protocol (stdin/stdout).
- The LLM Wrapper MCP Server interacts with LLM providers (e.g., OpenRouter.ai) for LLM API calls and responses.
- The server also integrates with an LLM Accounting System for logging and auditing.
- Main components:
- MCP Communication Handler
- LLM Client
- Tool Executor
- LLM Accounting Integration
Running the Server
To run the server, execute the following command:
python -m llm_wrapper_mcp_server [OPTIONS]
For example:
python -m llm_wrapper_mcp_server --model your-org/your-model-name --log-level DEBUG
Run python -m llm_wrapper_mcp_server --help to see all available command-line options for configuring the server.
This server operates as a Model Context Protocol (MCP) STDIO server, communicating via standard input and output. It does not open a network port for MCP communication.
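As a rough sketch, many MCP clients register STDIO servers through a JSON configuration entry along these lines; the exact file location and schema depend on the client, and the "llm-wrapper" key is only a placeholder name:
{
  "mcpServers": {
    "llm-wrapper": {
      "command": "python",
      "args": ["-m", "llm_wrapper_mcp_server", "--model", "your-org/your-model-name"],
      "env": {
        "OPENROUTER_API_KEY": "your_openrouter_api_key_here"
      }
    }
  }
}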
MCP Communication
The server communicates using JSON-RPC messages over stdin and stdout. It supports the following MCP methods:
- initialize: Handshake to establish protocol version and server capabilities.
- tools/list: Lists available tools. The main server provides an llm_call tool.
- tools/call: Executes a specified tool.
- resources/list: Lists available resources (currently none).
- resources/templates/list: Lists available resource templates (currently none).
The llm_call tool takes prompt (string, required) and optionally model (string) as arguments, allowing per-call model overrides if the specified model is permitted.
Client Interaction Example (Python)
You can interact with the STDIO MCP server using any language that supports standard input/output communication. Here’s a Python example using the subprocess module:
import subprocess
import json

def send_request(process, request):
    """Sends a JSON-RPC request to the server's stdin."""
    request_str = json.dumps(request) + "\n"
    process.stdin.write(request_str.encode('utf-8'))
    process.stdin.flush()

def read_response(process):
    """Reads a JSON-RPC response from the server's stdout."""
    line = process.stdout.readline().decode('utf-8').strip()
    if line:
        return json.loads(line)
    return None

if __name__ == "__main__":
    # Start the MCP server as a subprocess
    # Ensure you have the virtual environment activated or the package installed globally
    server_process = subprocess.Popen(
        ["python", "-m", "llm_wrapper_mcp_server"],  # Add any CLI args here if needed
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,  # Capture stderr for debugging
        text=False  # Use bytes for stdin/stdout
    )

    print("Waiting for server to initialize...")
    # The server sends an initial capabilities message on startup (id: None)
    initial_response = read_response(server_process)
    print(f"Server Initial Response: {json.dumps(initial_response, indent=2)}")

    # 1. Send an 'initialize' request
    initialize_request = {
        "jsonrpc": "2.0",
        "id": "1",
        "method": "initialize",
        "params": {}
    }
    print("\nSending initialize request...")
    send_request(server_process, initialize_request)
    initialize_response = read_response(server_process)
    print(f"Initialize Response: {json.dumps(initialize_response, indent=2)}")

    # 2. Send a 'tools/call' request to use the 'llm_call' tool
    llm_call_request = {
        "jsonrpc": "2.0",
        "id": "2",
        "method": "tools/call",
        "params": {
            "name": "llm_call",
            "arguments": {
                "prompt": "What is the capital of France?"
                # Optionally add: "model": "another-model/if-allowed"
            }
        }
    }
    print("\nSending llm_call request...")
    send_request(server_process, llm_call_request)
    llm_call_response = read_response(server_process)
    print(f"LLM Call Response: {json.dumps(llm_call_response, indent=2)}")

    # You can also read stderr for any server logs/errors
    # Note: stderr might block if there's no output; consider non-blocking reads or threads for real apps
    # stderr_output = server_process.stderr.read().decode('utf-8')
    # if stderr_output:
    #     print("\nServer Stderr Output:\n", stderr_output)

    # Terminate the server process
    server_process.terminate()
    server_process.wait(timeout=5)  # Wait for process to terminate
    print("\nServer process terminated.")
Development
For a detailed overview of the project’s directory and file structure, and crucial guidelines for software development agents, refer to AGENTS.md. This document is essential for agents contributing to the codebase.
Running Tests
This project uses pytest for testing.
To run all unit tests:
pytest
Integration tests are disabled by default to avoid making external API calls during normal test runs. To include and run integration tests, use the integration marker:
pytest -m integration
Install Development Dependencies
Install development dependencies:
pip install -e ".[dev]"
License
MIT License