Talk To Your Slides

@KyuDan1 on 9 months ago
7 MIT
AI Systems
#agent #llm #mcp #powerpoint
PowerPoint Editing Agent

Overview

What is Talk To Your Slides

Talk-to-Your-Slides is a real-time PowerPoint editing agent that uses large language models to modify presentations in response to natural language commands.

Use cases

Typical uses are updating slide formats, changing text colors, correcting typos, and applying consistent styling across a presentation, each driven by a user command.

How to use

With PowerPoint open, users type commands in natural language. The agent processes each command and updates the open presentation accordingly.

Key features

Key features include real-time modification of PowerPoint presentations, natural language command processing, and a structured workflow that plans, parses, processes, applies changes, and reports back to the user.

Where to use

Talk-to-Your-Slides suits education, business presentations, and any other setting where PowerPoint decks need to be edited quickly.

Content


📜 Talk to Your Slides: Language-Driven Agents for Efficient Slide Editing


📄 Research Paper (arXiv preprint)


📖 Overview

Editing presentation slides remains one of the most common and time-consuming tasks faced by millions of users daily, despite significant advances in automated slide generation.

While GUI-based agents have demonstrated visual control capabilities, they often suffer from high computational cost and latency. To address this, we propose Talk-to-Your-Slides, an LLM-powered agent that edits slides in active PowerPoint sessions by leveraging structured object-level information—bypassing the need for visual pixel interaction.
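As a rough illustration of what "structured object-level information" can look like, the sketch below walks a slide's shapes and collects each shape's name and text. The property names (`Shapes`, `Name`, `HasTextFrame`, `TextFrame.HasText`, `TextFrame.TextRange.Text`) come from the PowerPoint object model. The helper names `describe_slide` and `fake_shape` are hypothetical and are not taken from the project's code. A stand-in slide is used so the example runs without PowerPoint.

```python
from types import SimpleNamespace


def describe_slide(slide):
    """Return one dict per shape with its name and text (None if it has no text)."""
    shapes = []
    for shape in slide.Shapes:
        text = None
        # HasTextFrame / HasText are truthy (msoTrue) or falsy (msoFalse) under COM.
        if shape.HasTextFrame and shape.TextFrame.HasText:
            text = shape.TextFrame.TextRange.Text
        shapes.append({"name": shape.Name, "text": text})
    return shapes


def fake_shape(name, text=None):
    """Hypothetical stand-in that mimics the attributes read above."""
    has_text = text is not None
    frame = SimpleNamespace(HasText=has_text,
                            TextRange=SimpleNamespace(Text=text or ""))
    return SimpleNamespace(Name=name, HasTextFrame=has_text, TextFrame=frame)


demo_slide = SimpleNamespace(Shapes=[fake_shape("Title 1", "Hello"),
                                     fake_shape("Picture 2")])
print(describe_slide(demo_slide))
```

On Windows with pywin32 installed and a deck open, a real slide could be passed in instead, for example `win32com.client.GetActiveObject("PowerPoint.Application").ActivePresentation.Slides(3)`. Whether the project reads slides in exactly this way is an assumption.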

Our system introduces a hierarchical editing design, separating high-level semantic planning from low-level object manipulation. This allows:

  • 🚀 34.02% faster execution
  • 🎯 34.76% better instruction adherence
  • 💸 87.42% cheaper operations
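The split between high-level planning and low-level object manipulation can be pictured with a small sketch. Everything here is hypothetical: `PlanStep`, `ObjectEdit`, and `expand` are illustrative names, and the plan is written by hand where the real system would presumably get it from an LLM. The point is only that a single semantic step fans out into one concrete edit per shape.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanStep:
    """High-level intent for one slide (hypothetical structure)."""
    slide: int
    intent: str  # e.g. "set_font_color"
    value: str   # e.g. "blue"


@dataclass(frozen=True)
class ObjectEdit:
    """Low-level change to one named shape (hypothetical structure)."""
    slide: int
    shape: str
    prop: str
    value: str


def expand(step, shapes_by_slide):
    """Turn one high-level step into one edit per shape on the target slide."""
    prop = step.intent[4:] if step.intent.startswith("set_") else step.intent
    return [ObjectEdit(step.slide, name, prop, step.value)
            for name in shapes_by_slide.get(step.slide, [])]


shapes = {3: ["Title 1", "Body 2"]}
for edit in expand(PlanStep(3, "set_font_color", "blue"), shapes):
    print(edit)
```

Keeping the plan free of shape names means the planner never needs to see the full object tree. Only the expansion step does.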

To evaluate slide editing performance, we present TSBench, a human-annotated benchmark with 379 diverse instructions spanning four major categories.


📚 TSBench Benchmark Dataset

📎 Download TSBench on Google Drive


🎬 Demo Videos

CamelCase
Prompt: “Please update all English on ppt slides number 7 to camelCase formatting.”
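For reference, the camelCase transformation requested in this prompt can be written as a few lines of plain Python. This is a minimal sketch of the text transformation only. In the demo the LLM decides what to rewrite, and `to_camel_case` is a hypothetical helper, not part of the project.

```python
import re


def to_camel_case(phrase: str) -> str:
    """Join the words of an English phrase in camelCase."""
    words = re.findall(r"[A-Za-z0-9]+", phrase)
    if not words:
        return phrase  # nothing to convert
    first, rest = words[0].lower(), words[1:]
    return first + "".join(w[:1].upper() + w[1:].lower() for w in rest)


print(to_camel_case("Quarterly sales report"))  # quarterlySalesReport
```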

Only English → Blue
Prompt: “Please change only English into blue color in slide number 3.”
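Recoloring only the English text comes down to finding the English runs inside mixed-language text. The sketch below locates those runs with a regular expression and converts each to a 1-based start and length, which is the form PowerPoint's `TextRange.Characters(Start, Length)` expects. The helper names are hypothetical, and how the project itself selects the text is not stated in the source.

```python
import re

# One or more ASCII-letter words separated by spaces or tabs.
ENGLISH_RUN = re.compile(r"[A-Za-z]+(?:[ \t]+[A-Za-z]+)*")


def english_spans(text):
    """Return (start, end) offsets of each English run, 0-based, end exclusive."""
    return [(m.start(), m.end()) for m in ENGLISH_RUN.finditer(text)]


def to_com_range(start, end):
    """Convert a 0-based span to the 1-based (Start, Length) pair used by COM.

    Assumes all characters are in the Basic Multilingual Plane; emoji and
    similar characters would shift the offsets.
    """
    return start + 1, end - start


text = "발표 Deep Learning 소개"
for start, end in english_spans(text):
    print(text[start:end], to_com_range(start, end))
```

With a live text range, each pair could then be used along the lines of `text_range.Characters(start, length).Font.Color.RGB = 0xFF0000`. PowerPoint stores colors as blue-green-red integers, so `0xFF0000` is blue.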

Typo Checking & Correction
Prompt: “Please check ppt slides number 4 for any typos or errors, correct them.”

Translate to English
Prompt: “Please translate ppt slides number 5 into English.”

Slide Notes Script
Prompt: “Please create a full script for ppt slides number 3 and add the script to the slide notes.”


🛠️ Installation Guide

🖥️ Recommended: Python on Windows

⚠️ To let Python control PowerPoint through the COM interface, you must enable VBA access:

  • Open PowerPoint

  • Go to File > Options > Trust Center > Trust Center Settings

  • In Macro Settings, check ✅ “Trust access to the VBA project object model”

  1. Install dependencies:
     pip install -r requirements.txt
  2. Create credentials.yml in the root directory:
     gpt-4.1-mini:
       api_key:  "YOUR_OPENAI_API_KEY"
       base_url: "https://api.openai.com/v1"

     gpt-4.1-nano:
       api_key:  "YOUR_OPENAI_API_KEY"
       base_url: "https://api.openai.com/v1"

     gemini-1.5-flash:
       api_key: "YOUR_GEMINI_API_KEY"
  3. Create .env in the pptagent/ directory:
     # Example .env content
     OPENAI_API_KEY=your_key_here
  4. Run the system:
     python pptagent/main.py
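Before running the agent, it can help to confirm that the placeholder key in .env has actually been replaced. The following is a minimal pre-flight sketch using only the standard library. `parse_env` and `missing_keys` are hypothetical helpers, and how the project itself loads .env is not stated here.

```python
def parse_env(text: str) -> dict:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: dict, required=("OPENAI_API_KEY",)) -> list:
    """Return required keys that are absent, empty, or still the placeholder."""
    return [k for k in required
            if not env.get(k) or env[k] == "your_key_here"]


example = "# Example .env content\nOPENAI_API_KEY=your_key_here\n"
print(missing_keys(parse_env(example)))  # the placeholder counts as missing
```

To check a real file, read it first, for example with `pathlib.Path("pptagent/.env").read_text()`, and pass the result to `parse_env`.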
