AWorld

@inclusionAI, 19 days ago
254 MIT
Free · Community
AI Systems
#agent-swarm #agentic-ai #computer-use #gym-environment #phone-use #world-model #mcp #mcp-server
Build, evaluate and run General Multi-Agent Assistance with ease

Overview

What is AWorld

AWorld is a framework designed to build, evaluate, and run General Multi-Agent Assistance systems with ease. It allows users to create real-world scenarios or automate tasks into agentic prototypes, facilitating the development of generic agents or teams of agents.

Use cases

Use cases for AWorld include automating repetitive tasks, developing intelligent agents for customer service, creating simulations for research purposes, and building collaborative systems where multiple agents work together to achieve complex goals.

How to use

To use AWorld, install it with Python 3.11 or higher using ‘python setup.py install’. Configure your environment by setting the required API keys for AI models like OpenAI and Anthropic Claude. You can then run predefined agents using demo code provided in the repository.

Key features

Key features of AWorld include the ability to create AI-powered agents that make autonomous decisions, define the topology of multi-agent systems through swarms, and support communication among agents and tools within a defined environment. It also allows for the execution of specific tasks with associated datasets and evaluation metrics.

Where to use

AWorld can be used in various fields such as robotics, automation, artificial intelligence research, and any domain requiring multi-agent systems for task execution and decision-making.

Content

AWorld: Advancing Agentic AI

License: MIT

News

  • 🦩 [2025/06/19] We have updated our score to 72.43 on the GAIA test. Additionally, we have introduced a new local running mode. See ./README-local.md for detailed instructions.
  • 🐳 [2025/05/22] For quick GAIA evaluation, MCP tools, AWorld, and models are now available in a single Docker image. See ./README-docker.md for instructions and the YouTube video for a demo.
  • 🥳 [2025/05/13] AWorld has updated its state management for browser use and enhanced the video processing MCP server, achieving a score of 77.58 on GAIA validation (Pass@1 = 61.8) and maintaining its position as the top-ranked open-source framework. Learn more: GAIA leaderboard
  • ✨ [2025/04/23] AWorld ranks 3rd on GAIA benchmark (69.7 avg) with impressive Pass@1 = 58.8, 1st among open-source frameworks. Reproduce with python examples/gaia/run.py

Introduction

To support self-improvement, AWorld (Agent World) is designed to achieve two primary objectives: (1) provide an efficient forward process, and (2) facilitate diverse backward processes, including but not limited to foundation model training and system design meta-learning.

Forward process

  1. Agent Construction

    • ✅ Support different model services
    • ✅ Support MCP tools
    • ✅ Support custom tools
  2. Topology Orchestration

    • ✅ Encapsulate protocol between models and tools
    • ✅ Encapsulate protocol among agents
  3. Environments

    • ✅ Encapsulate runtime state management
    • ✅ Support state tracing
    • ✅ Support distributed high-concurrency envs

Follow the instructions in ./README-local.md to run a forward process on the GAIA benchmark. Watch the demo on YouTube.

Incubated backward methods

Method Category | Description | Key Information
Foundation Model Training | Improving the function-calling ability of large language models | Dataset, Model, Paper, Blog, Code

Want to build your own multi-agent system? Check out the detailed tutorials below to get started! ⬇️⬇️⬇️

Installation

With Python>=3.11:

pip install aworld
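
Since the package requires Python 3.11 or later, it can help to verify the interpreter before installing. This is a minimal sketch; the helper name `python_is_supported` is hypothetical and not part of AWorld.

```python
import sys

MIN_PYTHON = (3, 11)  # minimum version stated in the installation instructions


def python_is_supported(version=None) -> bool:
    """Return True if `version` (default: the running interpreter) is new enough."""
    major_minor = tuple(sys.version_info[:2]) if version is None else tuple(version[:2])
    return major_minor >= MIN_PYTHON


if __name__ == "__main__":
    if python_is_supported():
        print("Python version OK; run: pip install aworld")
    else:
        print(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required, found {sys.version.split()[0]}")
```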

Usage

Quick Start

from aworld.config.conf import AgentConfig
from aworld.core.agent.base import Agent
from aworld.runner import Runners

if __name__ == '__main__':
    agent_config = AgentConfig(
        llm_provider="openai",
        llm_model_name="gpt-4o",

        # Set via environment variable or direct configuration
        # llm_api_key="YOUR_API_KEY", 
        # llm_base_url="https://api.openai.com/v1"
    )

    search = Agent(
        conf=agent_config,
        name="search_agent",
        system_prompt="You are a helpful agent.",
        mcp_servers=["amap-amap-sse"] # MCP server name for agent to use
    )

    # Run agent
    Runners.sync_run(input="Hotels within 1 kilometer of West Lake in Hangzhou",
                     agent=search)

Here is an MCP server config example.
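
The linked example defines the actual schema. As a rough illustration only, the sketch below builds a config in the shape commonly used by MCP clients: a top-level `mcpServers` mapping from server name to connection settings. The `url` value is a placeholder, and the `timeout` field and the helper name `build_mcp_config` are assumptions rather than AWorld's documented format.

```python
import json


def build_mcp_config(server_name: str, url: str, timeout: float = 5.0) -> dict:
    """Build a config dict with one URL-based MCP server entry (assumed layout)."""
    return {
        "mcpServers": {
            server_name: {
                "url": url,          # placeholder endpoint, replace with the real one
                "timeout": timeout,  # assumed field: connection timeout in seconds
            }
        }
    }


if __name__ == "__main__":
    # The server name matches the one passed to mcp_servers in the Quick Start.
    config = build_mcp_config("amap-amap-sse", "https://example.com/sse")
    print(json.dumps(config, indent=2))
```

Under this assumed layout, the key under `mcpServers` is what the agent would reference by name in `mcp_servers`.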

Running Pre-defined Agents (demo code)

Below are demonstration videos showcasing AWorld’s capabilities across different agent configurations and environments.

Mode | Type | Demo
Single Agent | Browser use | ▶️ Watch Browser Demo on YouTube
Single Agent | Phone use | ▶️ Watch Mobile Demo on YouTube
Multi Agent | Cooperative Teams | ▶️ Watch Travel Demo on YouTube
Multi Agent | Competitive Teams | ▶️ Watch Debate Arena on YouTube
Multi Agent | Mix of both Teams | Coming Soon 🚀

Creating Your Own Agents (Quick Start Tutorial)

Here is a multi-agent example that runs a Level 2 task from the GAIA benchmark:

from examples.plan_execute.agent import PlanAgent, ExecuteAgent
from examples.tools.common import Agents, Tools
from aworld.core.agent.swarm import Swarm
from aworld.core.task import Task
from aworld.config.conf import AgentConfig, TaskConfig
from aworld.dataset.mock import mock_dataset
from aworld.runner import Runners

import os

# An OpenAI API key is required; replace the placeholder with your own key
os.environ['OPENAI_API_KEY'] = "your key"
# Optional endpoint setting; defaults to `https://api.openai.com/v1`
# os.environ['OPENAI_ENDPOINT'] = "https://api.openai.com/v1"

# Load a single mock GAIA sample to use as the task input
test_sample = mock_dataset("gaia")

# Create agents
plan_config = AgentConfig(
    name=Agents.PLAN.value,
    llm_provider="openai",
    llm_model_name="gpt-4o",
)
agent1 = PlanAgent(conf=plan_config)

exec_config = AgentConfig(
    name=Agents.EXECUTE.value,
    llm_provider="openai",
    llm_model_name="gpt-4o",
)
agent2 = ExecuteAgent(conf=exec_config, tool_names=[Tools.DOCUMENT_ANALYSIS.value])

# Create a swarm for the multi-agent system.
# Each tuple defines a (head_node, tail_node) edge in the topology graph.
# NOTE: the order of the agents in the tuple matters.
swarm = Swarm((agent1, agent2), sequence=False)

# Define a task
task = Task(input=test_sample, swarm=swarm, conf=TaskConfig())

# Run task
result = Runners.sync_run_task(task=task)

print(f"Time cost: {result['time_cost']}")
print(f"Task Answer: {result['task_0']['answer']}")

Output:

Time cost: 26.431413888931274
Task Answer: Time-Parking 2: Parallel Universe
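
The comments in the example describe each agent pair as a directed `(head_node, tail_node)` edge. AWorld's actual `Swarm` implementation is not shown here, so the following is only one simple way to read such edges for an acyclic topology: collect them into an adjacency map and derive an order in which every head precedes its tail. The function name `execution_order` and the string node labels are hypothetical.

```python
from collections import defaultdict, deque


def execution_order(edges: list[tuple[str, str]]) -> list[str]:
    """Topologically sort nodes so each head_node comes before its tail_node."""
    successors: dict[str, list[str]] = defaultdict(list)
    indegree: dict[str, int] = {}
    for head, tail in edges:
        successors[head].append(tail)
        indegree.setdefault(head, 0)
        indegree[tail] = indegree.get(tail, 0) + 1

    # Start from nodes that no edge points to.
    ready = deque(node for node, degree in indegree.items() if degree == 0)
    order: list[str] = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    if len(order) != len(indegree):
        raise ValueError("topology contains a cycle")
    return order


if __name__ == "__main__":
    print(execution_order([("plan_agent", "execute_agent")]))
```

Swapping the two names in the tuple reverses the derived order, which is why the edge direction has to be written correctly.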

Framework Architecture

AWorld uses a client-server architecture with three main components:

  1. Client-Server Architecture: Similar to Ray, this architecture:

    • Decouples agents and environments for better scalability and flexibility
    • Provides a unified interaction protocol for all agent-environment interactions
  2. Agent/Actor:

    • Encapsulates system prompts, tools, MCP servers, and models, with the capability to hand off execution to other agents

    Field | Type | Description
    id | string | Unique identifier for the agent
    name | string | Name of the agent
    model_name | string | LLM model name of the agent
    _llm | object | LLM model instance based on model_name (e.g., “gpt-4”, “claude-3”)
    conf | BaseModel | Configuration inheriting from pydantic BaseModel
    trajectory | object | Memory for maintaining context across interactions
    tool_names | list | List of tools the agent can use
    mcp_servers | list | List of MCP servers the agent can use
    handoffs | list | Agent as tool; list of other agents the agent can delegate tasks to
    finished | bool | Flag indicating whether the agent has completed its task
  3. Environment/World Model: Various tools and models in the environment

    • MCP servers
    • Computer interfaces (browser, shell, functions, etc.)
    • World Model

    Tools | Description
    mcp Servers | AWorld seamlessly integrates a rich collection of MCP servers as agent tools
    browser | Controls web browsers for navigation, form filling, and interaction with web pages
    android | Manages Android device simulation for mobile app testing and automation
    shell | Executes shell commands for file operations and system interactions
    code | Runs code snippets in various languages for data processing and automation
    search | Performs web searches and returns structured results for information gathering and summary
    document | Handles file operations including reading, writing, and managing directories
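
To make the agent fields listed above more concrete, here is a minimal stand-in written as a plain dataclass. It is not AWorld's `Agent` class: the class name `AgentSketch`, the choice to store handoffs as agent names, and the list-of-strings trajectory are all assumptions made for illustration.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class AgentSketch:
    """Illustrative stand-in for the documented agent fields (not AWorld's class)."""

    name: str
    model_name: str
    tool_names: list[str] = field(default_factory=list)
    mcp_servers: list[str] = field(default_factory=list)
    # Assumption: handoffs are stored as the names of agents that may receive work.
    handoffs: list[str] = field(default_factory=list)
    # Assumption: the trajectory is a simple list of event strings.
    trajectory: list[str] = field(default_factory=list)
    finished: bool = False
    id: str = field(default_factory=lambda: uuid.uuid4().hex)

    def can_hand_off_to(self, other: "AgentSketch") -> bool:
        return other.name in self.handoffs

    def hand_off(self, other: "AgentSketch", task: str) -> None:
        """Record a delegation in both trajectories if it is permitted."""
        if not self.can_hand_off_to(other):
            raise PermissionError(f"{self.name} may not delegate to {other.name}")
        self.trajectory.append(f"handoff:{other.name}:{task}")
        other.trajectory.append(f"received:{self.name}:{task}")


if __name__ == "__main__":
    planner = AgentSketch("planner", "gpt-4o", handoffs=["executor"])
    executor = AgentSketch("executor", "gpt-4o", tool_names=["document"])
    planner.hand_off(executor, "summarize the attached report")
    print(planner.trajectory)
    print(executor.trajectory)
```

Treating handoffs as an explicit allow-list mirrors the "agent as tool" description: an agent can only delegate to agents it has been configured with.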

Dual Purpose Framework

AWorld serves two complementary purposes:

Agent Evaluation

  • Unified task definitions to run both customized and public benchmarks
  • Efficient and stable execution environment
  • Detailed test reports measuring efficiency (steps to completion), completion rates, token costs, etc.
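
The report metrics named above (steps to completion, completion rate, token cost) can be computed from per-task run records. The sketch below shows one plausible aggregation; the `RunRecord` structure, its field names, and the `summarize` function are hypothetical and do not reflect AWorld's report format.

```python
from dataclasses import dataclass


@dataclass
class RunRecord:
    """One evaluated task run (hypothetical structure)."""

    task_id: str
    completed: bool
    steps: int
    tokens: int


def summarize(records: list[RunRecord]) -> dict[str, float]:
    """Aggregate completion rate, average steps to completion, and average tokens."""
    if not records:
        raise ValueError("no run records to summarize")
    completed = [r for r in records if r.completed]
    # Steps to completion are averaged over completed runs only.
    avg_steps = sum(r.steps for r in completed) / len(completed) if completed else float("nan")
    return {
        "completion_rate": len(completed) / len(records),
        "avg_steps_to_completion": avg_steps,
        "avg_tokens": sum(r.tokens for r in records) / len(records),
    }


if __name__ == "__main__":
    runs = [
        RunRecord("t1", True, 4, 1000),
        RunRecord("t2", False, 10, 3000),
        RunRecord("t3", True, 6, 2000),
    ]
    for metric, value in summarize(runs).items():
        print(f"{metric}: {value:.2f}")
```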

Agent Training

  • Agent models improve to overcome challenges posed by the environment
  • World models (environments) evolve to present new, more complex scenarios

🔧 Key Features

  • MCP Servers as Tools - Powerful integration of MCP servers providing robust tooling capabilities

  • 🌐 Environment Multi-Tool Support:

    • [x] Default computer-use tools (browser, shell, code, APIs, file system, etc.)
    • [x] Android device simulation
    • [ ] Cloud sandbox for quick and stable deployment
    • [ ] Reward model as env simulation
  • 🤖 AI-Powered Agents:

    • [x] Agent initialization
    • [x] Delegation between multiple agents
    • [ ] Asynchronous delegation
    • [ ] Human delegation (e.g., for password entry)
    • [ ] Pre-deployed open source LLMs powered by state-of-the-art inference frameworks
  • 🎛️ Web Interface:

    • [ ] UI for execution visualization
    • [ ] Server configuration dashboard
    • [ ] Real-time monitoring tools
    • [ ] Performance reporting
  • 🧠 Benchmarks and Samples:

    • [ ] Support standardized benchmarks by default, e.g., GAIA, WebArena
    • [ ] Support customized benchmarks
    • [ ] Support generating training samples

Contributing

We warmly welcome developers to join us in building and improving AWorld! Whether you’re interested in enhancing the framework, fixing bugs, or adding new features, your contributions are valuable to us.

To cite AWorld in academic work or to contact us, please use the following BibTeX entry:

@software{aworld2025,
  author = {Agent Team at Ant Group},
  title = {AWorld: A Unified Agent Playground for Computer and Phone Use Tasks},
  year = {2025},
  url = {https://github.com/inclusionAI/AWorld},
  version = {0.1.0},
  publisher = {GitHub},
  email = {chenyi.zcy at antgroup.com}
}

License

This project is licensed under the MIT License - see the LICENSE file for details.
