Unsloth MCP Server

@OtotaO, a year ago

2 MIT


An MCP server for Unsloth - a library that makes LLM fine-tuning 2x faster with 80% less memory

What is Unsloth MCP Server?

Unsloth-MCP-Server is an MCP server designed for the Unsloth library, which enhances the fine-tuning of large language models (LLMs) by making it twice as fast and reducing memory usage by 80%.

Use cases

Use cases include fine-tuning models like Llama and Mistral on custom datasets, generating text based on prompts, and exporting fine-tuned models for deployment in various formats.

How to use

To use Unsloth-MCP-Server, install the Unsloth library via pip, build the server using npm, and configure it in your MCP settings by specifying the command and environment variables.

Key features

Key features include optimized fine-tuning for various models, 4-bit quantization for efficient training, extended context length support, a simple API for model operations, and the ability to export models in multiple formats.

Where to use

Unsloth-MCP-Server is primarily used in machine learning and natural language processing fields, particularly for tasks involving large language models that require efficient fine-tuning and inference.


What is Unsloth?

Unsloth is a library that dramatically improves the efficiency of fine-tuning large language models:

Speed: 2x faster fine-tuning compared to standard methods
Memory: 80% less VRAM usage, allowing fine-tuning of larger models on consumer GPUs
Context Length: Up to 13x longer context lengths (e.g., 89K tokens for Llama 3.3 on 80GB GPUs)
Accuracy: No loss in model quality or performance

Unsloth achieves these improvements through custom GPU kernels written in OpenAI's Triton language, optimized backpropagation, and dynamic 4-bit quantization.

Features

Optimized fine-tuning for Llama, Mistral, Phi, Gemma, and other models
4-bit quantization for efficient training
Extended context length support
Simple API for model loading, fine-tuning, and inference
Export to various formats (GGUF, Hugging Face, etc.)

Quick Start

Install Unsloth: pip install unsloth

Install and build the server:

cd unsloth-server
npm install
npm run build

Add to MCP settings:
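
A minimal sketch of what such a settings entry could look like, assuming the `mcpServers` layout that many MCP clients use. The build output path and the `HUGGINGFACE_TOKEN` variable name are assumptions for illustration; check your client's documentation and the server's build output for the actual values.

```javascript
// Hypothetical settings entry. The key names follow the common "mcpServers"
// layout; the path and the environment variable name are assumptions.
const settings = {
  mcpServers: {
    "unsloth-server": {
      command: "node",
      // Assumed location of the file produced by `npm run build`.
      args: ["/path/to/unsloth-server/build/index.js"],
      env: {
        // Assumed variable name; only relevant for gated or private models.
        HUGGINGFACE_TOKEN: "your_token_here",
      },
    },
  },
};

// Print the JSON to paste into the client's MCP settings file.
console.log(JSON.stringify(settings, null, 2));
```

The server name used here, `unsloth-server`, matches the `server_name` passed in the tool examples.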

Available Tools

check_installation

Verify that Unsloth is properly installed on your system.

Parameters: None

Example:

const result = await use_mcp_tool({
  server_name: "unsloth-server",
  tool_name: "check_installation",
  arguments: {}
});
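
The shape of `result` is not described here. As a sketch, assuming `use_mcp_tool` resolves to an MCP-style tool result (a `content` array of typed items plus an optional `isError` flag), the text can be pulled out with a small helper. `resultText` and the sample message are hypothetical.

```javascript
// Collect the text parts of an MCP-style tool result.
// Assumes the shape { content: [{ type, text }], isError? }.
function resultText(result) {
  if (result.isError) {
    throw new Error("tool call failed: " + JSON.stringify(result.content));
  }
  return (result.content ?? [])
    .filter((item) => item.type === "text")
    .map((item) => item.text)
    .join("\n");
}

// Made-up result for illustration only; the server's real message may differ.
const sample = { content: [{ type: "text", text: "Unsloth is installed" }] };
console.log(resultText(sample));
```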

list_supported_models

Get a list of all models supported by Unsloth, including Llama, Mistral, Phi, and Gemma variants.

Parameters: None

Example:

const result = await use_mcp_tool({
  server_name: "unsloth-server",
  tool_name: "list_supported_models",
  arguments: {}
});

load_model

Load a pretrained model with Unsloth optimizations for faster inference and fine-tuning.

Parameters:

model_name (required): Name of the model to load (e.g., "unsloth/Llama-3.2-1B")
max_seq_length (optional): Maximum sequence length for the model (default: 2048)
load_in_4bit (optional): Whether to load the model in 4-bit quantization (default: true)
use_gradient_checkpointing (optional): Whether to use gradient checkpointing to save memory (default: true)

Example:

const result = await use_mcp_tool({
  server_name: "unsloth-server",
  tool_name: "load_model",
  arguments: {
    model_name: "unsloth/Llama-3.2-1B",
    max_seq_length: 4096,
    load_in_4bit: true
  }
});
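
The example leaves out `use_gradient_checkpointing`, so it relies on the default. The sketch below makes the effective arguments explicit, assuming the server applies the documented defaults to omitted parameters. `loadModelArgs` is a hypothetical client-side helper, not part of the server.

```javascript
// Defaults copied from the load_model parameter list.
const LOAD_MODEL_DEFAULTS = {
  max_seq_length: 2048,
  load_in_4bit: true,
  use_gradient_checkpointing: true,
};

// Hypothetical helper: merge caller overrides over the documented defaults.
function loadModelArgs(overrides) {
  if (!overrides || typeof overrides.model_name !== "string" || overrides.model_name === "") {
    throw new Error("model_name is required");
  }
  return { ...LOAD_MODEL_DEFAULTS, ...overrides };
}

const args = loadModelArgs({ model_name: "unsloth/Llama-3.2-1B", max_seq_length: 4096 });
console.log(args);
```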

finetune_model

Fine-tune a model with Unsloth optimizations using LoRA/QLoRA techniques.

Parameters:

model_name (required): Name of the model to fine-tune
dataset_name (required): Name of the dataset to use for fine-tuning
output_dir (required): Directory to save the fine-tuned model
max_seq_length (optional): Maximum sequence length for training (default: 2048)
lora_rank (optional): Rank for LoRA fine-tuning (default: 16)
lora_alpha (optional): Alpha for LoRA fine-tuning (default: 16)
batch_size (optional): Batch size for training (default: 2)
gradient_accumulation_steps (optional): Number of gradient accumulation steps (default: 4)
learning_rate (optional): Learning rate for training (default: 2e-4)
max_steps (optional): Maximum number of training steps (default: 100)
dataset_text_field (optional): Field in the dataset containing the text (default: "text")
load_in_4bit (optional): Whether to use 4-bit quantization (default: true)

Example:

const result = await use_mcp_tool({
  server_name: "unsloth-server",
  tool_name: "finetune_model",
  arguments: {
    model_name: "unsloth/Llama-3.2-1B",
    dataset_name: "tatsu-lab/alpaca",
    output_dir: "./fine-tuned-model",
    max_steps: 100,
    batch_size: 2,
    learning_rate: 2e-4
  }
});
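
To get a feel for how much data a run like this touches, the arithmetic can be worked through directly. This is a sketch assuming a single GPU and the usual meaning of gradient accumulation, where one optimizer step consumes `batch_size * gradient_accumulation_steps` examples. `trainingBudget` is a hypothetical helper.

```javascript
// Work out the effective batch size and the number of examples consumed,
// using the documented finetune_model defaults when a value is omitted.
function trainingBudget({ batch_size = 2, gradient_accumulation_steps = 4, max_steps = 100 } = {}) {
  const effectiveBatch = batch_size * gradient_accumulation_steps;
  return { effectiveBatch, examplesSeen: effectiveBatch * max_steps };
}

console.log(trainingBudget()); // { effectiveBatch: 8, examplesSeen: 800 }
```

With the defaults, 100 steps cover about 800 examples, a small fraction of a dataset the size of tatsu-lab/alpaca, so raise `max_steps` for anything beyond a smoke test.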

generate_text

Generate text using a fine-tuned Unsloth model.

Parameters:

model_path (required): Path to the fine-tuned model
prompt (required): Prompt for text generation
max_new_tokens (optional): Maximum number of tokens to generate (default: 256)
temperature (optional): Temperature for text generation (default: 0.7)
top_p (optional): Top-p for text generation (default: 0.9)

Example:

const result = await use_mcp_tool({
  server_name: "unsloth-server",
  tool_name: "generate_text",
  arguments: {
    model_path: "./fine-tuned-model",
    prompt: "Write a short story about a robot learning to paint:",
    max_new_tokens: 512,
    temperature: 0.8
  }
});
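
For intuition about the two sampling parameters, here is a generic sketch of temperature scaling and top-p (nucleus) filtering. It illustrates the standard technique only and is not the server's or Unsloth's implementation; the function names and the example probabilities are made up.

```javascript
// Softmax with temperature: lower temperatures sharpen the distribution.
function softmax(logits, temperature) {
  const scaled = logits.map((x) => x / temperature);
  const max = Math.max(...scaled); // subtract the max for numerical stability
  const exps = scaled.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Top-p filter: keep the most likely tokens until their cumulative
// probability reaches topP, and return their indices.
function topPIndices(probs, topP) {
  const ranked = probs.map((p, i) => [p, i]).sort((a, b) => b[0] - a[0]);
  const kept = [];
  let cumulative = 0;
  for (const [p, i] of ranked) {
    kept.push(i);
    cumulative += p;
    if (cumulative >= topP) break;
  }
  return kept;
}

const probs = [0.5, 0.3, 0.15, 0.05];
console.log(topPIndices(probs, 0.9)); // [ 0, 1, 2 ]
```

With `top_p` at 0.9, the least likely token in this toy distribution is never sampled. Raising `temperature` flattens the probabilities before that cut is applied.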

export_model

Export a fine-tuned Unsloth model to various formats for deployment.

Parameters:

model_path (required): Path to the fine-tuned model
export_format (required): Format to export to (gguf, ollama, vllm, huggingface)
output_path (required): Path to save the exported model
quantization_bits (optional): Number of quantization bits, used for GGUF export (default: 4)

Example:

const result = await use_mcp_tool({
  server_name: "unsloth-server",
  tool_name: "export_model",
  arguments: {
    model_path: "./fine-tuned-model",
    export_format: "gguf",
    output_path: "./exported-model.gguf",
    quantization_bits: 4
  }
});
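
A small client-side check can catch a mistyped format before the call is made. The sketch below validates against the four formats listed above. `exportArgs` is a hypothetical helper, and dropping `quantization_bits` for non-GGUF formats is an assumption based on the parameter description.

```javascript
// Formats taken from the export_format parameter description.
const EXPORT_FORMATS = ["gguf", "ollama", "vllm", "huggingface"];

// Hypothetical helper: validate and normalize export_model arguments.
function exportArgs({ model_path, export_format, output_path, quantization_bits }) {
  if (!model_path || !output_path) {
    throw new Error("model_path and output_path are required");
  }
  if (!EXPORT_FORMATS.includes(export_format)) {
    throw new Error(`unsupported export_format: ${export_format}`);
  }
  const args = { model_path, export_format, output_path };
  // quantization_bits is documented for GGUF export; the default is 4.
  if (export_format === "gguf") {
    args.quantization_bits = quantization_bits ?? 4;
  }
  return args;
}

console.log(exportArgs({
  model_path: "./fine-tuned-model",
  export_format: "gguf",
  output_path: "./exported-model.gguf",
}));
```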

Advanced Usage

Custom Datasets

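To fine-tune on a custom dataset, either host it on Hugging Face and pass its name as dataset_name, or load a local file by setting dataset_name to the loader type ("json" here) and passing the file path in data_files. Note that data_files does not appear in the finetune_model parameter list above: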

const result = await use_mcp_tool({
  server_name: "unsloth-server",
  tool_name: "finetune_model",
  arguments: {
    model_name: "unsloth/Llama-3.2-1B",
    dataset_name: "json",
    data_files: {"train": "path/to/your/data.json"},
    output_dir: "./fine-tuned-model"
  }
});
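
A sketch of preparing such a file, assuming the server hands `data_files` to the Hugging Face `datasets` JSON loader, which accepts one JSON object per line. The records and the `toJsonLines` helper are hypothetical; the only grounded detail is that each record carries the field named by `dataset_text_field`, which defaults to "text".

```javascript
// Made-up training records; "text" matches the default dataset_text_field.
const records = [
  { text: "### Instruction:\nName a primary color.\n\n### Response:\nBlue." },
  { text: "### Instruction:\nWhat is 2 + 2?\n\n### Response:\n4." },
];

// Serialize records as JSON Lines, checking that every row has the text field.
function toJsonLines(rows, textField = "text") {
  rows.forEach((row, i) => {
    if (typeof row[textField] !== "string" || row[textField].trim() === "") {
      throw new Error(`row ${i} has no "${textField}" string`);
    }
  });
  return rows.map((row) => JSON.stringify(row)).join("\n") + "\n";
}

console.log(toJsonLines(records));
```

Save the output to the path given in `data_files`, for example with Node's `fs.writeFileSync`.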

Memory Optimization

For large models on limited hardware:

Reduce batch size and increase gradient accumulation steps
Use 4-bit quantization
Enable gradient checkpointing
Reduce sequence length if possible
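
The first suggestion can be sketched as a fallback step: shrink the per-step batch and raise the accumulation count so that their product, the effective batch size, does not drop. `halveBatch` is a hypothetical helper; its parameter names mirror the finetune_model arguments.

```javascript
// Halve batch_size and raise gradient_accumulation_steps so the effective
// batch size (their product) is preserved, rounding up for odd sizes.
function halveBatch({ batch_size, gradient_accumulation_steps }) {
  if (batch_size <= 1) return null; // nothing left to halve
  const effective = batch_size * gradient_accumulation_steps;
  const smaller = Math.ceil(batch_size / 2);
  return {
    batch_size: smaller,
    gradient_accumulation_steps: Math.ceil(effective / smaller),
  };
}

// Starting from the documented defaults (batch_size 2, accumulation 4).
console.log(halveBatch({ batch_size: 2, gradient_accumulation_steps: 4 }));
// { batch_size: 1, gradient_accumulation_steps: 8 }
```

Once `batch_size` reaches 1, the remaining options are the other items in the list: 4-bit loading, gradient checkpointing, and a shorter `max_seq_length`.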

Troubleshooting

Common Issues

CUDA Out of Memory: Reduce batch size, use 4-bit quantization, or try a smaller model
Import Errors: Ensure you have the correct versions of torch, transformers, and unsloth installed
Model Not Found: Check that you’re using a supported model name or have access to private models

Version Compatibility

Python: 3.10, 3.11, or 3.12 (not 3.13)
CUDA: 11.8 or 12.1+ recommended
PyTorch: 2.0+ recommended
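
The Python constraint is easy to check programmatically before installing anything. A minimal sketch, where `isSupportedPython` is a hypothetical helper that takes a version string such as the one printed by `python --version`:

```javascript
// Return true for Python 3.10, 3.11, or 3.12, the range listed above.
function isSupportedPython(version) {
  const match = /^(\d+)\.(\d+)/.exec(version);
  if (!match) return false;
  const major = Number(match[1]);
  const minor = Number(match[2]);
  return major === 3 && minor >= 10 && minor <= 12;
}

console.log(isSupportedPython("3.11.4")); // true
console.log(isSupportedPython("3.13.0")); // false
```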

Performance Benchmarks

Model	VRAM	Unsloth Speed	VRAM Reduction	Context Length
Llama 3.3 (70B)	80GB	2x faster	>75%	13x longer
Llama 3.1 (8B)	80GB	2x faster	>70%	12x longer
Mistral v0.3 (7B)	80GB	2.2x faster	75% less	-

Requirements

Python 3.10-3.12
NVIDIA GPU with CUDA support (recommended)
Node.js and npm

License

Apache-2.0
