- Explore MCP Servers
- selenium-mcp
Selenium Mcp
What is Selenium Mcp
selenium-mcp is a Java implementation of the Model Context Protocol (MCP) designed for Selenium WebDriver, enabling browser automation through standardized MCP clients.
Use cases
Use cases for selenium-mcp include automated testing of web applications, performing data extraction from websites, and executing browser-based tasks triggered by AI assistants.
How to use
To use selenium-mcp, clone the repository, build it using Maven, and start the MCP Selenium server from the command line. You can then send JSON commands to the server for browser automation tasks.
Key features
Key features of selenium-mcp include a standardized interface for browser automation, compatibility with AI assistants, and support for multiple browsers like Chrome and Firefox.
Where to use
selenium-mcp can be used in various fields such as software testing, web scraping, and automating repetitive browser tasks in web applications.
Clients Supporting MCP
The following are the main client software that supports the Model Context Protocol. Click the link to visit the official website for more information.
Overview
What is Selenium Mcp
selenium-mcp is a Java implementation of the Model Context Protocol (MCP) designed for Selenium WebDriver, enabling browser automation through standardized MCP clients.
Use cases
Use cases for selenium-mcp include automated testing of web applications, performing data extraction from websites, and executing browser-based tasks triggered by AI assistants.
How to use
To use selenium-mcp, clone the repository, build it using Maven, and start the MCP Selenium server from the command line. You can then send JSON commands to the server for browser automation tasks.
Key features
Key features of selenium-mcp include a standardized interface for browser automation, compatibility with AI assistants, and support for multiple browsers like Chrome and Firefox.
Where to use
selenium-mcp can be used in various fields such as software testing, web scraping, and automating repetitive browser tasks in web applications.
Clients Supporting MCP
The following are the main client software that supports the Model Context Protocol. Click the link to visit the official website for more information.
Content
MCP Selenium
A Java implementation of the Model Context Protocol (MCP) for Selenium WebDriver, enabling browser automation through standardized MCP clients.
Overview
MCP Selenium provides a bridge between the Model Context Protocol and Selenium WebDriver. It allows AI assistants and other MCP-compatible clients to perform browser automation tasks using a standardized set of tools.
Project Structure
selenium-mcp/ ├── src/ │ └── main/ │ └── java/ │ └── io/ │ └── github/ │ └── naveenautomation/ │ └── mcpselenium/ │ ├── McpSeleniumServer.java # The main MCP server implementation │ ├── McpSeleniumLauncher.java # Launcher script for the server │ └── AdvancedMcpClient.java # GUI client for testing ├── target/ │ └── mcp-selenium-0.1.0-jar-with-dependencies.jar # Compiled JAR with dependencies └── pom.xml # Maven project configuration
Prerequisites
- Java 11 or higher
- Maven (for building)
- Chrome or Firefox browser installed
Building the Project
# Clone the repository
git clone https://github.com/naveenautomation/selenium-mcp.git
cd selenium-mcp
# Build with Maven
mvn clean package
This will create the JAR file at target/mcp-selenium-0.1.0-jar-with-dependencies.jar.
Usage Options
Option 1: Command Line Usage
Starting the Server
Start the MCP Selenium server from the command line:
java -jar target/mcp-selenium-0.1.0-jar-with-dependencies.jar
The server will start and wait for commands on standard input. You’ll see the server information as initial output.
Sending Commands
You can now send JSON commands to the server. Type or paste the commands directly into the terminal, one per line.
Example Commands
- Start a Chrome browser:
{
"type": "tool_call",
"tool_call_id": "call-1",
"name": "start_browser",
"params": {
"browser": "chrome",
"options": {
"headless": false
}
}
}
- Navigate to a website:
{
"type": "tool_call",
"tool_call_id": "call-2",
"name": "navigate",
"params": {
"url": "https://www.example.com"
}
}
- Find an element:
{
"type": "tool_call",
"tool_call_id": "call-3",
"name": "find_element",
"params": {
"by": "css",
"value": "h1"
}
}
- Get element text:
{
"type": "tool_call",
"tool_call_id": "call-4",
"name": "get_element_text",
"params": {
"by": "css",
"value": "h1"
}
}
- Click an element:
{
"type": "tool_call",
"tool_call_id": "call-5",
"name": "click_element",
"params": {
"by": "id",
"value": "submit-button"
}
}
- Type text into an element:
{
"type": "tool_call",
"tool_call_id": "call-6",
"name": "send_keys",
"params": {
"by": "id",
"value": "search-box",
"text": "search query"
}
}
- Take a screenshot:
{
"type": "tool_call",
"tool_call_id": "call-7",
"name": "take_screenshot",
"params": {
"outputPath": "screenshot.png"
}
}
- Close the browser:
{
"type": "tool_call",
"tool_call_id": "call-8",
"name": "close_session",
"params": {}
}
Closing the Server
To stop the server, use Ctrl+C in the terminal.
Option 2: GUI Client
For a more user-friendly experience, you can use the included GUI client.
Starting the GUI Client
Compile and run the client:
# Compile the client
javac -d target/classes src/main/java/io/github/naveenautomation/mcpselenium/AdvancedMcpClient.java
# Run the client
java -cp target/classes io.github.naveenautomation.mcpselenium.AdvancedMcpClient
Using the GUI
The GUI client automatically starts the MCP Selenium server when launched. The interface includes:
- Command Selector: Dropdown menu to select the MCP command you want to execute.
- JSON Parameters: Text field to enter command parameters in JSON format.
- Quick Action Buttons:
- Start Chrome: Launches Chrome browser
- Navigate: Opens a dialog to enter a URL
- Screenshot: Takes a screenshot
- Close Browser: Closes the current session
- Log Area: Displays server responses and logs.
Examples of GUI Usage
Find Element
- Select “find_element” from the command dropdown.
- Enter parameters in the JSON field:
{
"by": "id",
"value": "login-button"
}
- Click “Send”.
Send Keys (Type Text)
- Select “send_keys” from the command dropdown.
- Enter parameters in the JSON field:
{
"by": "id",
"value": "username",
"text": "[email protected]"
}
- Click “Send”.
Click Element
- Select “click_element” from the command dropdown.
- Enter parameters in the JSON field:
{
"by": "css",
"value": "button.submit"
}
- Click “Send”.
Supported Commands
The MCP Selenium Server supports the following commands:
| Command | Description | Required Parameters | Optional Parameters |
|---|---|---|---|
start_browser |
Launches a browser | browser (“chrome” or “firefox”) |
options.headless, options.arguments |
navigate |
Navigates to a URL | url |
- |
find_element |
Finds an element | by, value |
timeout |
click_element |
Clicks an element | by, value |
timeout |
send_keys |
Types text into an element | by, value, text |
timeout |
get_element_text |
Gets text from an element | by, value |
timeout |
hover |
Hovers over an element | by, value |
timeout |
drag_and_drop |
Drags and drops an element | by, value, targetBy, targetValue |
timeout |
double_click |
Double-clicks an element | by, value |
timeout |
right_click |
Right-clicks an element | by, value |
timeout |
press_key |
Presses a keyboard key | key |
- |
upload_file |
Uploads a file | by, value, filePath |
timeout |
take_screenshot |
Takes a screenshot | - | outputPath |
close_session |
Closes the browser | - | - |
Locator Strategies
For commands that interact with elements, use these locator strategies in the by parameter:
id: Element IDcss: CSS selectorxpath: XPath expressionname: Element nametag: HTML tag nameclass: CSS class name
Integration with AI Systems
MCP Selenium is designed to be used with AI systems that support the Model Context Protocol. To integrate with an AI assistant like Claude:
- Start the MCP Selenium server
- Configure the AI system to connect to the server via stdin/stdout
- Send natural language commands to the AI, which will translate them to MCP commands
Troubleshooting
Common Issues
-
Browser Not Starting:
- Ensure you have Chrome or Firefox installed
- Try using
"headless":truein options - Check server logs for detailed error messages
-
Element Not Found:
- Verify your locator (by and value)
- Increase the timeout value
- Check if the element is in an iframe
-
Server Not Responding:
- Ensure the JSON format is correct
- Check that each command has a unique tool_call_id
- Restart the server if it becomes unresponsive
-
Screenshot Not Saving:
- Provide an absolute file path
- Ensure the directory exists
- Check file permissions
License
This project is licensed under the MIT License.
Acknowledgements
- Built on Selenium WebDriver for browser automation
- Implements the Model Context Protocol standard
Dev Tools Supporting MCP
The following are the main code editors that support the Model Context Protocol. Click the link to visit the official website for more information.










