
Secure-Hulk

Secure-Hulk is a security scanner for Model Context Protocol (MCP) servers and tools. It helps identify potential security vulnerabilities in MCP configurations, such as prompt injection, tool poisoning, cross-origin escalation, data exfiltration, and toxic agent flows.

Overview

What is Secure-Hulk?

Secure-Hulk is a security scanner designed specifically for Model Context Protocol (MCP) servers and tools. It identifies potential security vulnerabilities in MCP configurations, including issues like prompt injection, tool poisoning, cross-origin escalation, data exfiltration, and toxic agent flows.

Use cases

Use cases for Secure-Hulk include scanning configuration files for vulnerabilities before deployment, monitoring existing MCP servers for security threats, and generating reports for compliance and security audits. It can also call external moderation services (OpenAI and Hugging Face) for stronger harmful-content detection.

How to use

To use Secure-Hulk, install it via npm, then run scans on MCP configuration files using commands like ‘secure-hulk scan /path/to/config.json’. You can generate HTML reports, enable verbose output, and use the OpenAI Moderation API for enhanced harmful content detection.
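
For example, a first run might look like this (the npx invocation is an assumption about how the published CLI is exposed; the paths are illustrative):

# Install the scanner from npm, then scan a configuration file and write an HTML report
npm i secure-hulk
npx secure-hulk scan --html report.html /path/to/config.json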

Key features

Key features of Secure-Hulk include scanning for security vulnerabilities in MCP configurations, detecting prompt injection and tool poisoning, checking for cross-origin escalation risks, monitoring data exfiltration attempts, and identifying toxic agent flows. It also supports privilege escalation detection and generates HTML reports.

Where to use

Secure-Hulk is applicable in various fields where MCP servers are utilized, including software development, cybersecurity, and data management. It is particularly useful for organizations that need to ensure the security of their MCP configurations.

Content

Secure-Hulk Logo

Secure-Hulk

Security scanner for Model Context Protocol servers and tools.

Overview

Secure-Hulk is a security scanner for Model Context Protocol (MCP) servers and tools. It helps identify potential security vulnerabilities in MCP configurations, such as prompt injection, tool poisoning, cross-origin escalation, data exfiltration, and toxic agent flows.

Features

  • Scan MCP configurations for security vulnerabilities
  • Detect prompt injection attempts
  • Identify tool poisoning vulnerabilities
  • Check for cross-origin escalation risks
  • Monitor for data exfiltration attempts
  • Detect toxic agent flows - Multi-step attacks that manipulate agents into unintended actions
  • Privilege escalation detection - Identify attempts to escalate from public to private access
  • Cross-resource attack detection - Monitor suspicious access patterns across multiple resources
  • Indirect prompt injection detection - Catch attacks through external content processing
  • Generate HTML reports of scan results
  • Whitelist approved entities

Installation

# Install dependencies and build from source
npm install
npm run build
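
The usage examples below assume the secure-hulk command is on your PATH. If you built from source as above, one way to get there (an assumption, contingent on the package defining a bin entry in package.json) is npm link:

# Expose the locally built CLI on PATH
npm link

# Verify the command is available (--help is assumed, as with most Node CLIs)
secure-hulk --help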

Usage

Scanning MCP Configurations

# Scan well-known MCP configuration paths
secure-hulk scan

# Scan specific configuration files
secure-hulk scan /path/to/config.json

# Generate HTML report
secure-hulk scan --html report.html /path/to/config.json

# Enable verbose output
secure-hulk scan -v /path/to/config.json

# Output results in JSON format
secure-hulk scan -j /path/to/config.json
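
If you need a configuration file to point the scanner at, the sketch below uses the common Claude Desktop-style mcpServers layout; the file name, server entry, and paths are illustrative assumptions, not part of Secure-Hulk itself:

# Hypothetical MCP client configuration in the common "mcpServers" layout
cat > ./mcp-config.json <<'EOF'
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/home/user/project"]
    }
  }
}
EOF

# Scan it and write a human-readable report
secure-hulk scan --html report.html ./mcp-config.json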

Using OpenAI Moderation API for Harmful Content Detection

Secure-Hulk now supports using OpenAI’s Moderation API to detect harmful content in entity descriptions. This provides a more robust detection mechanism for identifying potentially harmful, unsafe, or unethical content.

To use the OpenAI Moderation API:

secure-hulk scan --use-openai-moderation --openai-api-key YOUR_API_KEY /path/to/config.json

Options:

  • --use-openai-moderation: Enable OpenAI Moderation API for prompt injection detection
  • --openai-api-key <key>: Your OpenAI API key
  • --openai-moderation-model <model>: OpenAI Moderation model to use (default: ‘omni-moderation-latest’)
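
Putting these options together might look like the following (reading the key from an environment variable is a shell convenience, not a Secure-Hulk feature; the model value is the documented default):

# Enable OpenAI moderation, passing the key from the environment and pinning the model
secure-hulk scan \
  --use-openai-moderation \
  --openai-api-key "$OPENAI_API_KEY" \
  --openai-moderation-model omni-moderation-latest \
  /path/to/config.json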

The OpenAI Moderation API provides several advantages:

  1. More accurate detection: The API uses trained moderation models that can catch subtle harmful content that pattern matching might miss.
  2. Categorized results: The API provides detailed categories for flagged content (hate, harassment, self-harm, sexual content, violence, etc.), helping you understand the specific type of harmful content detected.
  3. Confidence scores: Each category includes a confidence score, allowing you to set appropriate thresholds for your use case.
  4. Regular updates: The API is regularly updated to detect new types of harmful content as OpenAI’s policies evolve.

The API can detect content in these categories:

  • Hate speech
  • Harassment
  • Self-harm
  • Sexual content
  • Violence
  • Illegal activities
  • Deception

If the OpenAI Moderation API check fails for any reason, Secure-Hulk will automatically fall back to pattern-based detection for prompt injection vulnerabilities.

Using Hugging Face Safety Models for Content Detection

Secure-Hulk now supports Hugging Face safety models for advanced AI-powered content moderation. This provides additional options beyond OpenAI’s Moderation API, including open-source models and specialized toxicity detection.

To use Hugging Face safety models:

secure-hulk scan --use-huggingface-guardrails --huggingface-api-token YOUR_HF_TOKEN /path/to/config.json

Options:

  • --use-huggingface-guardrails: Enable Hugging Face safety models for content detection
  • --huggingface-api-token <token>: Your Hugging Face API token
  • --huggingface-model <model>: Specific model to use (default: ‘unitary/toxic-bert’)
  • --huggingface-threshold <threshold>: Confidence threshold for flagging content (default: 0.5)
  • --huggingface-preset <preset>: Use preset configurations: ‘toxicity’, ‘hate-speech’, ‘multilingual’, ‘strict’
  • --huggingface-timeout <milliseconds>: Timeout for API calls (default: 10000)

Available models include:

  • unitary/toxic-bert: General toxicity detection (recommended default)
  • s-nlp/roberta_toxicity_classifier: High-sensitivity toxicity detection
  • unitary/unbiased-toxic-roberta: Bias-reduced toxicity detection

Preset configurations:

  • toxicity: General purpose toxicity detection
  • strict: High sensitivity for maximum safety
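
A single-guardrail run composed from the options above might look like this (the threshold value is illustrative, and whether a preset and an explicit threshold can be combined is an assumption):

# High-sensitivity Hugging Face check with a custom confidence threshold
secure-hulk scan \
  --use-huggingface-guardrails \
  --huggingface-preset strict \
  --huggingface-threshold 0.7 \
  --huggingface-api-token "$HF_TOKEN" \
  /path/to/config.json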

Example with multiple guardrails:

secure-hulk scan \
  --use-openai-moderation --openai-api-key YOUR_OPENAI_KEY \
  --use-huggingface-guardrails --huggingface-preset toxicity --huggingface-api-token YOUR_HF_TOKEN \
  --use-nemo-guardrails --nemo-guardrails-config-path ./guardrails-config \
  /path/to/config.json

The Hugging Face integration provides several advantages:

  1. Model diversity: Choose from multiple specialized safety models
  2. Open-source options: Use community-developed models
  3. Customizable thresholds: Fine-tune sensitivity for your use case
  4. Specialized detection: Models focused on specific types of harmful content
  5. Cost flexibility: Various pricing options including free tiers

If the Hugging Face API check fails for any reason, Secure-Hulk will log the error and continue with other security checks.

Inspecting MCP Configurations

secure-hulk inspect /path/to/config.json

Managing the Whitelist

# Add an entity to the whitelist
secure-hulk whitelist tool "Calculator" abc123

# Print the whitelist
secure-hulk whitelist

# Reset the whitelist
secure-hulk whitelist --reset
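
A typical loop is to whitelist an entity flagged in a previous scan and then re-run the scan. The entity type, name, and hash below are illustrative, --local-only is described under Whitelist Options, and its placement after the positional arguments is assumed:

# Whitelist a reviewed tool without contributing to the global whitelist, then re-scan
secure-hulk whitelist tool "Calculator" abc123 --local-only
secure-hulk scan /path/to/config.json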

Configuration

Scan Options

  • --json, -j: Output results in JSON format
  • --verbose, -v: Enable verbose output
  • --html <path>: Generate HTML report and save to specified path
  • --storage-file <path>: Path to store scan results and whitelist information
  • --server-timeout <seconds>: Seconds to wait before timing out server connections
  • --checks-per-server <number>: Number of times to check each server
  • --suppress-mcpserver-io <boolean>: Suppress stdout/stderr from MCP servers
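
As a sketch, several of these options combined in one invocation (all values are illustrative):

# Repeated checks with a longer timeout, quieter server output, and a custom results store
secure-hulk scan \
  --storage-file ~/.secure-hulk/storage.json \
  --server-timeout 30 \
  --checks-per-server 2 \
  --suppress-mcpserver-io true \
  /path/to/config.json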

Whitelist Options

  • --storage-file <path>: Path to store scan results and whitelist information
  • --reset: Reset the entire whitelist
  • --local-only: Only update local whitelist, don’t contribute to global whitelist

Sponsors

Proudly sponsored by LambdaTest

LambdaTest Logo

License

MIT
