LangGraph Voice Agent

2 MIT

A voice interface for a LangGraph agent: user audio is transcribed to text (speech-to-text), and the agent's text responses are converted back to audio (text-to-speech) and played to the user. The agent is an expense manager with access to an MCP server whose tools perform CRUD (create, read, update, delete) operations on a database that tracks expenses.

What is LangGraph Voice Agent

langgraph-voice-agent is a voice-enabled AI assistant designed to help users manage their expenses through natural conversation. It converts user audio messages into text and provides text responses in audio format using text-to-speech technology.

Use cases

Use cases include managing daily expenses, querying past transactions, updating expense records, and categorizing expenses based on user-defined criteria.

How to use

To use langgraph-voice-agent, clone the repository, set up a virtual environment, install the necessary dependencies, and configure environment variables. Users can interact with the agent by speaking to it, and it will respond with audio.

Key features

Key features include voice interaction, expense management (CRUD operations), automatic categorization of expenses, PostgreSQL database integration (via Supabase), and a tool-using agent built on LangGraph's agent framework.

Where to use

langgraph-voice-agent can be used in personal finance management, small business expense tracking, and any scenario where users prefer voice interaction for managing financial data.

Luna: Voice-Enabled Expense Management Agent

Luna is a voice-enabled AI assistant built with LangGraph that helps users manage their expenses through natural conversation. This project demonstrates how to create a voice interface for any LangGraph agent by combining speech-to-text and text-to-speech with a tool-using agent framework.

🌟 Features

Voice Interaction: Speak to Luna and hear responses through high-quality text-to-speech
Expense Management: Create, query, update, and delete expenses through natural conversation
Category Classification: Automatically categorizes expenses based on descriptions
Database Integration: Stores expense data in a PostgreSQL database (via Supabase)
Tool-using Agent: Built with LangGraph’s agent framework for complex reasoning
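
To make the create, query, update, and delete operations concrete, here is a minimal sketch of the four operations. It is not the project's implementation: the project uses SQLAlchemy against PostgreSQL, while this sketch swaps in the standard-library sqlite3 module with an in-memory database so it runs without any setup. The table layout and all function names are hypothetical.

```python
import sqlite3
from datetime import date


def connect() -> sqlite3.Connection:
    """Open an in-memory database with a hypothetical expenses table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE expenses (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               description TEXT NOT NULL,
               category TEXT NOT NULL,
               amount REAL NOT NULL,
               spent_on TEXT NOT NULL
           )"""
    )
    return conn


def create_expense(conn, description: str, category: str, amount: float, spent_on: date) -> int:
    """Insert one expense and return its generated id."""
    cur = conn.execute(
        "INSERT INTO expenses (description, category, amount, spent_on) VALUES (?, ?, ?, ?)",
        (description, category, amount, spent_on.isoformat()),
    )
    conn.commit()
    return cur.lastrowid


def read_expenses(conn, category=None) -> list[tuple]:
    """Return all expenses, optionally filtered by category, ordered by id."""
    query = "SELECT id, description, category, amount, spent_on FROM expenses"
    params = ()
    if category is not None:
        query += " WHERE category = ?"
        params = (category,)
    return conn.execute(query + " ORDER BY id", params).fetchall()


def update_amount(conn, expense_id: int, amount: float) -> bool:
    """Change the amount of one expense; True if a row was updated."""
    cur = conn.execute("UPDATE expenses SET amount = ? WHERE id = ?", (amount, expense_id))
    conn.commit()
    return cur.rowcount == 1


def delete_expense(conn, expense_id: int) -> bool:
    """Delete one expense; True if a row was removed."""
    cur = conn.execute("DELETE FROM expenses WHERE id = ?", (expense_id,))
    conn.commit()
    return cur.rowcount == 1
```

In the real project, functions like these would sit behind MCP tools so the agent can call them in response to spoken requests.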

🛠️ Technology Stack

Backend

Python 3.13: Core language for the backend
LangGraph: Agent framework for building the conversational AI
OpenAI:
- Whisper API for speech-to-text
- GPT-4 Mini for the agent’s reasoning
- TTS API for text-to-speech responses
MCP (Model Context Protocol): For defining and using tools
SQLAlchemy: ORM for database interactions
Supabase: PostgreSQL database provider

Audio Processing

sounddevice: For capturing audio from microphone
scipy: For audio file processing
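
As a rough illustration of the audio file step, the sketch below encodes mono float samples as a 16-bit PCM WAV file in memory, which is a common format to hand to a speech-to-text API. It is an assumption-laden stand-in: it uses the standard-library wave module instead of scipy so it runs without extra packages, it generates a test tone instead of recording from a microphone, and the 16 kHz sample rate and both function names are hypothetical.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 16_000  # assumed value; the project's actual sample rate is not stated


def floats_to_wav_bytes(samples, sample_rate: int = SAMPLE_RATE) -> bytes:
    """Encode mono float samples in [-1.0, 1.0] as a 16-bit PCM WAV file."""
    pcm = bytearray()
    for sample in samples:
        clipped = max(-1.0, min(1.0, sample))  # guard against out-of-range input
        pcm += struct.pack("<h", int(round(clipped * 32767)))
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(pcm))
    return buffer.getvalue()


def sine_wave(frequency_hz: float, seconds: float, sample_rate: int = SAMPLE_RATE) -> list[float]:
    """Generate a test tone to stand in for recorded microphone input."""
    count = int(seconds * sample_rate)
    return [math.sin(2 * math.pi * frequency_hz * i / sample_rate) for i in range(count)]
```

With a real recording, the sample list would come from the microphone capture instead of `sine_wave`.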

📋 Prerequisites

Python 3.13
OpenAI API key
Supabase account and database
Microphone and speakers

🚀 Getting Started

1. Clone the repository

git clone https://github.com/rosiefaulkner/langgraph-voice-agent.git
cd langgraph-voice-agent

2. Set up a virtual environment and install dependencies

(Recommended) Use uv for dependency management.

Set up the venv in your project directory and install all dependencies with one command:

uv sync

3. Set up environment variables

Create a .env file in the root directory with the following variables:

OPENAI_API_KEY=your_openai_api_key
SUPABASE_URI=postgresql://postgres:<password>@<host>:5432/postgres
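
Before starting the application it can help to confirm that both variables are set. The following is a minimal sketch of such a check, not code from the repository: `parse_env` and `missing_vars` are hypothetical helpers, and the project itself may load the .env file in a different way.

```python
REQUIRED_VARS = ("OPENAI_API_KEY", "SUPABASE_URI")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values such as URIs may contain "=".
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_vars(env: dict[str, str]) -> list[str]:
    """Return the required variable names that are absent or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

For example, `missing_vars(parse_env(open(".env").read()))` returns an empty list when the file defines both variables with non-empty values.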

4. Run the application

python main.py

🎤 Using Luna

Run the application
When prompted, speak your request (e.g., “Create a new expense for lunch today that cost $15”)
Press Enter to stop recording
Luna will process your request, interact with the database if needed, and respond verbally
Continue the conversation or say “exit” or “quit” to end the session
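
The steps above amount to a simple loop: transcribe, check for an exit word, ask the agent, speak the reply. The sketch below shows that control flow only. The speech-to-text, agent, and text-to-speech stages are passed in as plain callables, and the names `run_session` and `is_exit` are hypothetical and not taken from main.py.

```python
from typing import Callable, Iterable

EXIT_WORDS = {"exit", "quit"}


def is_exit(transcript: str) -> bool:
    """True when the whole utterance is an exit word, ignoring case and punctuation."""
    normalized = transcript.strip().lower().rstrip(".!?")
    return normalized in EXIT_WORDS


def run_session(
    utterances: Iterable[str],
    respond: Callable[[str], str],
    speak: Callable[[str], None],
) -> int:
    """Process transcribed utterances until an exit word; return completed turns."""
    turns = 0
    for transcript in utterances:
        if is_exit(transcript):
            break
        speak(respond(transcript))  # agent reply goes straight to text-to-speech
        turns += 1
    return turns
```

Keeping the stages as injected callables means the loop can be exercised with plain strings, without a microphone or API key.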

🧩 Project Structure

langgraph-voice-agent/
├── main.py                  # Main application entry point
├── assistant_graph.py       # LangGraph agent definition
├── state.py                 # State management for the agent
├── voice_utils.py           # Audio recording and playback utilities
├── mcps/                    # Model Context Protocol servers
│   ├── mcp_config.json      # MCP server configuration
│   └── local_servers/
│       └── db.py            # Database tools implementation
├── .env                     # Environment variables (not in repo)
├── .env.example             # Example environment variables
└── pyproject.toml           # Project dependencies

🔧 Customizing the Agent

Modifying the System Prompt

To change Luna’s personality or capabilities, edit the system_prompt in assistant_graph.py:

system_prompt = """You are Luna, the company's expense manager..."""

Adding New Tools

Create a new MCP server or add tools to the existing one in mcps/local_servers/
Register the server in mcps/mcp_config.json
The tools will be automatically available to the agent
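
As one possible shape for a new tool, the sketch below defines a plain function and registers it with the FastMCP helper from the official MCP Python SDK. Whether the project's existing db.py uses FastMCP is an assumption, and the tool `total_by_category`, the server name, and `build_server` are all hypothetical.

```python
from collections import defaultdict


def total_by_category(expenses: list[dict]) -> dict[str, float]:
    """Hypothetical tool: sum expense amounts per category."""
    totals: dict[str, float] = defaultdict(float)
    for expense in expenses:
        totals[expense["category"]] += float(expense["amount"])
    # Sort by category name and round so the output is stable and readable.
    return {category: round(total, 2) for category, total in sorted(totals.items())}


def build_server():
    """Register the function as an MCP tool (requires the `mcp` package)."""
    # Imported lazily so the helper above stays dependency-free.
    from mcp.server.fastmcp import FastMCP

    server = FastMCP("expense-reports")  # hypothetical server name
    server.tool()(total_by_category)
    return server
```

A server script would then call `build_server().run(transport="stdio")`, and the script would be registered in mcps/mcp_config.json like the existing server.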

Changing Voice Settings

Modify the TTS settings in voice_utils.py:

async def play_audio(message: str):
    # ...
    async with openai_async.audio.speech.with_streaming_response.create(
        model="gpt-4o-mini-tts",
        voice="fable",  # Change the voice here
        input=cleaned_message,
        instructions="Speak in a cheerful, helpful tone with a brisk pace.",  # Modify instructions
        response_format="pcm",
        speed=1.2,  # Adjust speed
    ) as response:
        # ...

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
