MCP Explorer

Web Voice Assistant

@sjmishra1976 · 10 months ago
2 · MIT
Free · Community
AI Systems
Web-based Voice Assistant that takes mic input, transcribes it using Whisper, feeds it to OpenAI (via your MCP agent), and responds using TTS (like Coqui or Google TTS).

Overview

What is Web Voice Assistant?

Web-voice-assistant is a web-based voice assistant that utilizes microphone input to capture voice, transcribes it using Whisper, processes it through OpenAI’s language model, and responds with synthesized speech using Text-to-Speech (TTS) technologies.

Use cases

Use cases include voice-activated customer support, interactive voice response systems, language learning applications, and accessibility tools for individuals with disabilities.

How to use

To use web-voice-assistant, you can either run it locally with Docker or build it from source. Once it is set up, speak into the microphone (or call in by phone) and the assistant responds with audio output.

Key features

Key features include real-time speech-to-text conversion using Whisper, integration with OpenAI’s language models for intelligent responses, and high-quality text-to-speech output using Coqui or Google TTS.

Where to use

Web-voice-assistant can be used in various fields such as customer service, virtual assistants, educational tools, and any application requiring voice interaction.

Content

High-Level Voice Bot Flow:

User speaks into mic / phone call

Voice stream → Speech-to-Text (STT)

Transcribed text → LLM (e.g., GPT or agentic model)

LLM response → Text-to-Speech (TTS)

Send synthesized voice stream back to user
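The flow above can be sketched as a single async pipeline. This is a minimal illustration with all three stages stubbed out; in the real project, `speechToText`, `queryLlm`, and `textToSpeech` (hypothetical names) would call Whisper, the OpenAI/Ollama API, and Coqui TTS respectively.

```javascript
// Sketch of the STT → LLM → TTS pipeline. Each stage is a stub standing
// in for a real service call (Whisper, OpenAI/Ollama, Coqui TTS).

async function speechToText(audioBuffer) {
  // Placeholder for a Whisper transcription call.
  return "what is my account balance";
}

async function queryLlm(transcript) {
  // Placeholder for an OpenAI or Ollama chat-completion call.
  return `You asked: "${transcript}". Here is the assistant's reply.`;
}

async function textToSpeech(text) {
  // Placeholder for Coqui TTS synthesis; returns fake audio bytes.
  return Buffer.from(text, "utf8");
}

async function handleUtterance(audioBuffer) {
  const transcript = await speechToText(audioBuffer); // step 2: STT
  const reply = await queryLlm(transcript);           // step 3: LLM
  return textToSpeech(reply);                         // step 4: TTS
}
```

The synthesized audio buffer returned by `handleUtterance` is what gets streamed back to the user in step 5.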

🎙️ Dockerized Voice Assistant (STT + OpenAI + TTS)

This project uses:

  • 🗣️ Whisper (speech-to-text)
  • 🤖 OpenAI GPT (agentic assistant for fintech)
  • 🔊 Coqui TTS (text-to-speech)
  • ⚡ Node.js Express server + HTML frontend

🚀 Run with Docker (UI only; for full end-to-end operation, use Docker Compose).

```shell
cd web-va
docker build -t voice-assistant .
docker run -p 3000:3000 --name va voice-assistant
```

Then visit: http://localhost:3000

For a local build without Docker:

```shell
npm install
npm start
```
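The `docker build` step above implies a Dockerfile roughly like the following. This is a sketch, not the repository's actual file; the base image and the `npm start` entrypoint are assumptions based on the Node.js Express stack listed above.

```dockerfile
# Hypothetical Dockerfile for the web-va service (sketch only).
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
EXPOSE 3000
CMD ["npm", "start"]
```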

From the root folder, use Docker Compose to start both the project and Ollama (which runs the Ollama LLM image):

```shell
docker-compose up --build
```

Once the containers are up, pull a model into the Ollama server running inside its container:

```shell
docker exec -it ollama-va ollama pull llama3
```
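The two-service setup could be expressed in a `docker-compose.yml` along these lines. This is a guess at the layout: the service name `ollama-va` matches the `docker exec` command above, while the `OLLAMA_URL` environment variable is hypothetical.

```yaml
# Sketch of a possible docker-compose.yml (assumed layout, not the repo's file).
services:
  web-va:
    build: ./web-va
    ports:
      - "3000:3000"
    environment:
      - OLLAMA_URL=http://ollama-va:11434   # hypothetical variable name
    depends_on:
      - ollama-va
  ollama-va:
    image: ollama/ollama
    ports:
      - "11434:11434"
```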

Tools

No tools
