ostruct
What is ostruct
ostruct is a command-line interface (CLI) tool designed to generate structured JSON from unstructured text inputs using OpenAI models. It utilizes dynamic templates powered by Jinja2 and integrates various tools for enhanced functionality.
Use cases
Use cases for ostruct include automating data extraction from logs, generating structured reports from raw data, and facilitating real-time data access and analysis through integrated tools.
How to use
To use ostruct, you input unstructured data such as text files or CSVs, along with any necessary variables. The tool processes this data using dynamic prompt templates and outputs structured JSON according to a specified JSON schema.
Key features
Key features of ostruct include multi-tool integration (Code Interpreter, File Search, Web Search, MCP), dynamic template support, schema validation, and the ability to handle diverse data formats without the need for custom parsers.
Where to use
ostruct can be used in data analysis, software development, and any field that requires transforming unstructured data into structured formats for better usability and analysis.
02:07 A.M. CI Failure
"Build failed. Again. The third time this week our data extraction pipeline broke because someone changed the log format, and our regex-based parser couldn’t handle the new structure. Sarah’s on vacation, Mike’s parsing code is unreadable, and the client wants their analytics dashboard working by morning.
There has to be a better way to turn messy data into structured JSON without writing custom parsers for every format change…"

ostruct transforms unstructured inputs into structured, usable JSON output using OpenAI APIs with multi-tool integration
The better way you’ve been looking for.
ostruct-cli
ostruct processes unstructured data (text files, code, CSVs, etc.), input variables, and dynamic prompt templates to produce structured JSON output defined by a JSON schema. With enhanced multi-tool integration, ostruct now supports Code Interpreter for data analysis, File Search for document retrieval, Web Search for real-time information access, and MCP (Model Context Protocol) servers for extended capabilities.

Why ostruct?
LLMs are powerful, but getting consistent, structured output from them can be challenging. ostruct solves this problem by providing a streamlined approach to transform unstructured data into reliable JSON structures. The motivation behind creating ostruct was to:
- Bridge the gap between freeform LLM capabilities and structured data needs in production systems
- Simplify integration of AI into existing workflows and applications that expect consistent data formats
- Ensure reliability and validate output against a defined schema to avoid unexpected formats or missing data
- Reduce development time by providing a standardized way to interact with OpenAI models for structured outputs
- Enable non-developers to leverage AI capabilities through a simple CLI interface with templates
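The schema-validation guarantee in the bullets above is the heart of the value proposition: output either matches the contract or the run fails, never silently drifts. As an illustration only (this helper is hypothetical and stdlib-only, not part of ostruct, which validates against full JSON Schema), a minimal sketch of what checking model output against a schema-style contract looks like:

```python
import json

def check_structured_output(raw: str, schema: dict) -> dict:
    """Parse a model response and verify it against a minimal subset of
    JSON Schema: required keys and primitive types. Raises on mismatch."""
    data = json.loads(raw)  # fails fast on non-JSON output
    type_map = {"object": dict, "array": list, "string": str,
                "integer": int, "number": (int, float), "boolean": bool}
    for key in schema.get("required", []):
        if key not in data:
            raise ValueError(f"missing required key: {key}")
    for key, spec in schema.get("properties", {}).items():
        if key in data and not isinstance(data[key], type_map[spec["type"]]):
            raise TypeError(f"{key}: expected {spec['type']}")
    return data

schema = {"type": "object", "required": ["name"],
          "properties": {"name": {"type": "string"},
                         "age": {"type": "integer"}}}
person = check_structured_output('{"name": "Ada", "age": 36}', schema)
print(person["name"])  # Ada
```

Downstream code that receives a validated dict never needs defensive `if key in data` checks, which is exactly what "consistent data formats in production systems" buys you.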
Real-World Use Cases
ostruct can be used for various scenarios, including:
Automated Code Review with Multi-Tool Analysis
# Traditional pattern matching
ostruct run prompts/task.j2 schemas/code_review.json -p source "examples/security/*.py"
# Enhanced with Code Interpreter for deeper analysis
ostruct run prompts/task.j2 schemas/code_review.json -fc examples/security/ -fs documentation/
Analyze code for security vulnerabilities, style issues, and performance problems. The enhanced version uses Code Interpreter for execution analysis and File Search for documentation context.
Security Vulnerability Scanning
# Budget-friendly static analysis (recommended for most projects)
ostruct run prompts/static_analysis.j2 schemas/scan_result.json \
-d code examples -R --sys-file prompts/system.txt
# Professional security analysis with Code Interpreter (best balance)
ostruct run prompts/code_interpreter.j2 schemas/scan_result.json \
-dc examples --sys-file prompts/system.txt
# Comprehensive hybrid analysis for critical applications
ostruct run prompts/hybrid_analysis.j2 schemas/scan_result.json \
-d code examples -R -dc examples --sys-file prompts/system.txt
Three optimized approaches for automated security vulnerability scanning:
- Static Analysis: $0.18 cost, fast processing, comprehensive vulnerability detection
- Code Interpreter: $0.18 cost (same!), superior analysis quality with evidence-based findings
- Hybrid Analysis: $0.20 cost (+13%), maximum depth with cross-validation
Each approach finds the same core vulnerabilities but with different levels of detail and analysis quality. Directory-based analysis provides comprehensive project coverage in a single scan.
Data Analysis with Code Interpreter
# Upload data for analysis and visualization
ostruct run analysis.j2 schemas/analysis_result.json \
-fc sales_data.csv -fc customer_data.json \
-fs reports/ -ft config.yaml
Perform sophisticated data analysis using Python execution, generate visualizations, and create comprehensive reports with document context.
Configuration Validation & Analysis
# Traditional file comparison
ostruct run prompts/task.j2 schemas/validation_result.json \
-f dev examples/basic/dev.yaml -f prod examples/basic/prod.yaml
# Enhanced with environment context
ostruct run prompts/task.j2 schemas/validation_result.json \
-ft dev.yaml -ft prod.yaml -fs infrastructure_docs/
Validate configuration files across environments with documentation context for better analysis and recommendations.
Oh, and also, among endless other use cases:
Etymology Analysis
ostruct run prompts/task.j2 schemas/etymology.json -ft examples/scientific.txt
Break down words into their components, showing their origins, meanings, and hierarchical relationships. Useful for linguistics, educational tools, and understanding terminology in specialized fields.
Features
Core Capabilities
- Generate structured JSON output defined by dynamic prompts using OpenAI models and JSON schemas
- Rich template system for defining prompts (Jinja2-based)
- Automatic token counting and context window management
- Streaming support for real-time output
- Secure handling of sensitive data with comprehensive path validation
- Automatic prompt optimization and token management
Multi-Tool Integration
- Code Interpreter: Upload and analyze data files, execute Python code, generate visualizations
- File Search: Vector-based document search and retrieval from uploaded files
- Web Search: Real-time information retrieval and current data access via OpenAI’s web search tool
- MCP Servers: Connect to Model Context Protocol servers for extended functionality
- Explicit File Routing: Route different files to specific tools for optimized processing
Advanced Features
- Configuration System: YAML-based configuration with environment variable support
- Unattended Operation: Designed for CI/CD and automation scenarios
- Progress Reporting: Real-time progress updates with clear, user-friendly messaging
- Model Registry: Dynamic model management with support for latest OpenAI models
Requirements
- Python 3.10 or higher
Installation
We provide multiple installation methods to suit different user needs. Choose the one that’s right for you.
Recommended: pipx (Python Users)
For users who have Python installed, pipx is the recommended installation method. It installs ostruct in an isolated environment, preventing conflicts with other Python packages.
- Install pipx:

  python3 -m pip install --user pipx
  python3 -m pipx ensurepath

  (Restart your terminal after running ensurepath to update your PATH.)

- Install ostruct-cli:

  pipx install ostruct-cli
macOS: Homebrew
If you’re on macOS and use Homebrew, you can install ostruct with a single command:
brew install yaniv-golan/ostruct/ostruct-cli
Standalone Binaries (No Python Required)
We provide pre-compiled .zip archives for macOS, Windows, and Linux that do not require Python to be installed.
- Go to the Latest Release page.
- Download the .zip file for your operating system (e.g., ostruct-macos-latest.zip, ostruct-windows-latest.zip, ostruct-ubuntu-latest.zip).
- Extract the .zip file. This will create a folder (e.g., ostruct-macos-amd64).
- On macOS/Linux, make the executable inside the extracted folder runnable:

  chmod +x /path/to/ostruct-macos-amd64/ostruct

- Run the executable from within the extracted folder, as it depends on bundled libraries in the same directory.
Docker
If you prefer to use Docker, you can run ostruct from our official container image available on the GitHub Container Registry.
docker run -it --rm \
-v "$(pwd)":/app \
-w /app \
ghcr.io/yaniv-golan/ostruct:latest \
run template.j2 schema.json -ft input.txt
This command mounts the current directory into the container and runs ostruct.
Uninstallation
To uninstall ostruct, use the method corresponding to how you installed it:
- pipx: pipx uninstall ostruct-cli
- Homebrew: brew uninstall ostruct-cli
- Binaries: Simply delete the binary file.
- Docker: No uninstallation is needed for the image itself, but you can remove it with docker rmi ghcr.io/yaniv-golan/ostruct:latest.
Manual Installation
For Users
To install the latest stable version from PyPI:
pip install ostruct-cli
Note: If the ostruct command isn’t found after installation, you may need to add Python’s user bin directory to your PATH. See the troubleshooting guide for details.
For Developers
If you plan to contribute to the project, see the Development Setup section below for instructions on setting up the development environment with Poetry.
Environment Variables
ostruct-cli respects the following environment variables:
- OPENAI_API_KEY: Your OpenAI API key (required unless provided via command line)
- OPENAI_API_BASE: Custom API base URL (optional)
- OPENAI_API_VERSION: API version to use (optional)
- OPENAI_API_TYPE: API type (e.g., "azure") (optional)
- OSTRUCT_DISABLE_REGISTRY_UPDATE_CHECKS: Set to "1", "true", or "yes" to disable automatic registry update checks
- MCP_<NAME>_URL: Custom MCP server URLs (e.g., MCP_STRIPE_URL=https://mcp.stripe.com)
💡 Tip: ostruct automatically loads .env files from the current directory. Environment variables take precedence over .env file values.
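That precedence can be pictured with a small stdlib-only sketch (resolve_setting is a hypothetical illustration of the documented rule, not ostruct's actual loader):

```python
import os

def resolve_setting(name: str, dotenv: dict, default=None):
    """Look up a setting with the documented precedence:
    real environment variables win over values parsed from a .env file."""
    return os.environ.get(name, dotenv.get(name, default))

# Stand-in for a parsed .env file in the current directory
dotenv_values = {"OPENAI_API_KEY": "sk-from-dotenv",
                 "OPENAI_API_BASE": "https://alt.example"}

os.environ["OPENAI_API_KEY"] = "sk-from-env"
os.environ.pop("OPENAI_API_BASE", None)  # ensure it is unset for the demo

print(resolve_setting("OPENAI_API_KEY", dotenv_values))   # sk-from-env (env wins)
print(resolve_setting("OPENAI_API_BASE", dotenv_values))  # https://alt.example (.env fallback)
```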
Shell Completion Setup (Click to expand)
ostruct-cli supports shell completion for Bash, Zsh, and Fish shells. To enable it:
Bash
Add this to your ~/.bashrc:
eval "$(_OSTRUCT_COMPLETE=bash_source ostruct)"
Zsh
Add this to your ~/.zshrc:
eval "$(_OSTRUCT_COMPLETE=zsh_source ostruct)"
Fish
Add this to your ~/.config/fish/completions/ostruct.fish:
eval (env _OSTRUCT_COMPLETE=fish_source ostruct)
After adding the appropriate line, restart your shell or source the configuration file.
Shell completion will help you with:
- Command options and their arguments
- File paths for template and schema files
- Directory paths for -d and --base-dir options
- And more!
Enhanced CLI with Multi-Tool Integration
Migration Notice
ostruct now includes powerful multi-tool integration while maintaining full backward compatibility. All existing commands continue to work exactly as before, but you can now take advantage of:
- Code Interpreter for data analysis and visualization
- File Search for document retrieval
- Web Search for real-time information access
- MCP Servers for extended functionality
- Explicit File Routing for optimized processing
New File Routing Options (Click to expand)
Basic File Routing (Explicit Tool Assignment)
# Template access only (config files, small data)
ostruct run template.j2 schema.json -ft config.yaml
# Code Interpreter (data analysis, code execution)
ostruct run analysis.j2 schema.json -fc data.csv
# File Search (document retrieval)
ostruct run search.j2 schema.json -fs documentation.pdf
# Web Search (real-time information)
ostruct run research.j2 schema.json --enable-tool web-search -V topic="latest AI developments"
# Multiple tools with one file
ostruct run template.j2 schema.json --file-for code-interpreter shared.json --file-for file-search shared.json
Directory Routing
ostruct provides two directory routing patterns to match different use cases:
Auto-Naming Pattern (for known directory structures):
# Variables are auto-generated from directory contents
ostruct run template.j2 schema.json -dt ./config -dc ./datasets -ds ./docs
# Creates variables like: config_yaml, datasets_csv, docs_pdf (based on actual files)
Alias Pattern (for generic, reusable templates):
# Create stable variable names regardless of directory contents
ostruct run template.j2 schema.json --dta app_config ./config --dca data ./datasets --dsa knowledge ./docs
# Creates stable variables: app_config, data, knowledge (always these names)
When to Use Each Pattern:
- Use auto-naming (-dt, -dc, -ds) when your template knows the specific directory structure
- Use alias syntax (--dta, --dca, --dsa) when your template is generic and needs stable variable names
Template Example:
{# Works with alias pattern - variables are predictable #}
{% for file in app_config %}
Configuration: {{ file.name }} = {{ file.content }}
{% endfor %}
{# Analysis data from stable variable name #}
{% for file in data %}
Processing: {{ file.path }}
{% endfor %}
This design pattern makes templates reusable across different projects while maintaining full backward compatibility.
MCP Server Integration
# Connect to MCP servers for extended capabilities
ostruct run template.j2 schema.json --mcp-server deepwiki@https://mcp.deepwiki.com/sse
Configuration System
Create an ostruct.yaml file for persistent settings:
models:
default: gpt-4o
tools:
code_interpreter:
auto_download: true
output_directory: "./output"
download_strategy: "two_pass_sentinel" # Enable reliable file downloads
mcp:
custom_server: "https://my-mcp-server.com"
limits:
max_cost_per_run: 10.00
Load custom configuration:
ostruct --config my-config.yaml run template.j2 schema.json
Code Interpreter File Downloads
Important: If you’re using Code Interpreter with structured output (JSON schemas), you may need to enable the two-pass download strategy to ensure files are downloaded reliably.
Option 1: CLI Flags (Recommended for one-off usage)
# Enable reliable file downloads for this run
ostruct run template.j2 schema.json -fc data.csv --enable-feature ci-download-hack
# Force single-pass mode (override config)
ostruct run template.j2 schema.json -fc data.csv --disable-feature ci-download-hack
Option 2: Configuration File (Recommended for persistent settings)
# ostruct.yaml
tools:
code_interpreter:
download_strategy: "two_pass_sentinel" # Enables reliable file downloads
auto_download: true
output_directory: "./downloads"
Why this is needed: OpenAI’s structured output mode can prevent file download annotations from being generated. The two-pass strategy works around this by making two API calls: one to generate files (without structured output), then another to ensure schema compliance. For detailed technical information, see docs/known-issues/2025-06-responses-ci-file-output.md.
Performance: The two-pass strategy approximately doubles token usage but ensures reliable file downloads when using structured output with Code Interpreter.
Get Started Quickly
🚀 New to ostruct? Follow our step-by-step quickstart guide featuring Juno the beagle for a hands-on introduction.
📝 Template Scripting: Learn ostruct’s templating capabilities with the template scripting guide - no prior Jinja2 knowledge required!
📖 Full Documentation: https://ostruct.readthedocs.io/
Quick Start
- Set your OpenAI API key:
# Environment variable
export OPENAI_API_KEY=your-api-key
# Or create a .env file
echo 'OPENAI_API_KEY=your-api-key' > .env
Example 1: Basic Text Extraction (Simplest)
- Create a template file extract_person.j2:
Extract information about the person from this text: {{ stdin }}
- Create a schema file schema.json:
{
"type": "object",
"properties": {
"person": {
"type": "object",
"properties": {
"name": {
"type": "string",
"description": "The person's full name"
},
"age": {
"type": "integer",
"description": "The person's age"
},
"occupation": {
"type": "string",
"description": "The person's job"
}
},
"required": [
"name",
"age",
"occupation"
],
"additionalProperties": false
}
},
"required": [
"person"
],
"additionalProperties": false
}
- Run the CLI:
# Basic usage
echo "John Smith is a 35-year-old software engineer" | ostruct run extract_person.j2 schema.json
# With enhanced options
echo "John Smith is a 35-year-old software engineer" | \
ostruct run extract_person.j2 schema.json \
--model gpt-4o \
--temperature 0.7
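Because the output is constrained by schema.json, downstream code can consume it without defensive parsing: all three fields are required, so their presence and types are guaranteed. A sketch of such a consumer (the raw string stands in for a real run's output, not an actual API call):

```python
import json

# Output shaped like Example 1's schema (illustrative sample, not a real run)
raw = '''{"person": {"name": "John Smith", "age": 35,
                     "occupation": "software engineer"}}'''

result = json.loads(raw)
person = result["person"]

# The schema marks all three fields as required and forbids extras,
# so this code can index directly instead of checking key by key.
assert {"name", "age", "occupation"} <= person.keys()
print(f'{person["name"]} ({person["age"]}) - {person["occupation"]}')
# John Smith (35) - software engineer
```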
Example 2: Multi-Tool Data Analysis
- Create an analysis template analysis_template.j2:
Analyze the following data sources:
{% if sales_data_csv is defined %}
Sales Data: {{ sales_data_csv.name }} ({{ sales_data_csv.size }} bytes)
{% endif %}
{% if customer_data_json is defined %}
Customer Data: {{ customer_data_json.name }} ({{ customer_data_json.size }} bytes)
{% endif %}
{% if market_reports_pdf is defined %}
Market Reports: {{ market_reports_pdf.name }} ({{ market_reports_pdf.size }} bytes)
{% endif %}
{% if config_yaml is defined %}
Configuration: {{ config_yaml.content }}
{% endif %}
Provide comprehensive analysis and actionable insights.
- Create an analysis schema analysis_schema.json:
{
"type": "object",
"properties": {
"analysis": {
"type": "object",
"properties": {
"insights": {
"type": "string",
"description": "Key insights from the data"
},
"recommendations": {
"type": "array",
"items": {
"type": "string"
},
"description": "Actionable recommendations"
},
"data_quality": {
"type": "string",
"description": "Assessment of data quality"
}
},
"required": [
"insights",
"recommendations"
],
"additionalProperties": false
}
},
"required": [
"analysis"
],
"additionalProperties": false
}
- For more complex scenarios, use explicit file routing with flexible syntax options:
# Auto-naming (fastest for one-off analysis)
ostruct run analysis_template.j2 analysis_schema.json \
-fc sales_data.csv \
-fc customer_data.json \
-fs market_reports.pdf \
-ft config.yaml
# Mixed syntax with custom variable names
ostruct run analysis_template.j2 analysis_schema.json \
-fc sales_data.csv \
-fc customers customer_data.json \
--fsa reports market_reports.pdf \
--fta app_config config.yaml
# Alias syntax for reusable templates (best tab completion)
ostruct run reusable_analysis.j2 analysis_schema.json \
--fca sales_data sales_data.csv \
--fca customer_data customer_data.json \
--fsa market_reports market_reports.pdf \
--fta config config.yaml
# Code review with stable variable names
ostruct run code_review.j2 review_schema.json \
--fca source_code source_code/ \
--fsa documentation docs/ \
--fta eslint_config .eslintrc.json
Example 3: Legacy Compatibility
All existing commands continue to work unchanged:
# Traditional usage (fully supported)
ostruct run extract_from_file.j2 schema.json -f text input.txt -d configs
ostruct run template.j2 schema.json -p "*.py" source -V env=prod
System Prompt Handling (Click to expand)
ostruct-cli provides several ways to specify a system prompt, with a clear precedence order:

- Command-line option (--sys-prompt or --sys-file):

  # Direct string
  ostruct run template.j2 schema.json --sys-prompt "You are an expert analyst"

  # From file
  ostruct run template.j2 schema.json --sys-file system_prompt.txt

- Template frontmatter:

  ---
  system_prompt: You are an expert analyst
  ---
  Extract information from: {{ text }}

- Shared system prompts (with template frontmatter):

  ---
  include_system: shared/base_analyst.txt
  system_prompt: Focus on financial metrics
  ---
  Extract information from: {{ text }}

- Default system prompt (built into the CLI)
Precedence Rules
When multiple system prompts are provided, they are resolved in this order:

- Command-line options take highest precedence:
  - If both --sys-prompt and --sys-file are provided, --sys-prompt wins
  - Use --ignore-task-sysprompt to ignore template frontmatter
- Template frontmatter is used if:
  - No command-line options are provided
  - --ignore-task-sysprompt is not set
- Default system prompt is used only if no other prompts are provided
Example combining multiple sources:
# Command-line prompt will override template frontmatter
ostruct run template.j2 schema.json --sys-prompt "Override prompt"
# Ignore template frontmatter and use default
ostruct run template.j2 schema.json --ignore-task-sysprompt
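The precedence rules above amount to a simple resolution function. A hypothetical sketch of that logic (function and parameter names invented for illustration; this is not ostruct's internal code):

```python
def resolve_system_prompt(sys_prompt=None, sys_file_text=None,
                          frontmatter=None, ignore_task_sysprompt=False,
                          default="(built-in default prompt)"):
    """Apply the documented precedence: --sys-prompt beats --sys-file,
    CLI options beat template frontmatter, frontmatter beats the default."""
    if sys_prompt is not None:
        return sys_prompt        # --sys-prompt wins over everything
    if sys_file_text is not None:
        return sys_file_text     # --sys-file wins over frontmatter
    if frontmatter is not None and not ignore_task_sysprompt:
        return frontmatter       # frontmatter, unless explicitly ignored
    return default

print(resolve_system_prompt(sys_prompt="Override prompt",
                            frontmatter="From template"))
# Override prompt
print(resolve_system_prompt(frontmatter="From template",
                            ignore_task_sysprompt=True))
# (built-in default prompt)
```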
Model Registry Management
ostruct-cli maintains a registry of OpenAI models and their capabilities, which includes:
- Context window sizes for each model
- Maximum output token limits
- Supported parameters and their constraints
- Model version information
To ensure you’re using the latest models and features, you can update the registry:
# Update from the official repository
ostruct update-registry
# Update from a custom URL
ostruct update-registry --url https://example.com/models.yml
# Force an update even if the registry is current
ostruct update-registry --force
This is especially useful when:
- New OpenAI models are released
- Model capabilities or parameters change
- You need to work with custom model configurations
The registry file is stored at ~/.openai_structured/config/models.yml and is automatically referenced when validating model parameters and token limits.
The update command uses HTTP conditional requests (If-Modified-Since headers) to check if the remote registry has changed before downloading, ensuring efficient updates.
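A conditional request of that kind can be sketched with the standard library (the URL and helper name are illustrative, not ostruct internals; a real client would cache the response's Last-Modified and handle the 304 reply):

```python
import urllib.request
from email.utils import formatdate

def conditional_registry_request(url, last_fetch_unix):
    """Build a GET carrying If-Modified-Since so the server can answer
    304 Not Modified instead of resending an unchanged registry file."""
    req = urllib.request.Request(url)
    req.add_header("If-Modified-Since",
                   formatdate(last_fetch_unix, usegmt=True))
    return req

req = conditional_registry_request("https://example.com/models.yml", 0.0)
# urllib stores header names capitalized, hence the lookup spelling below
print(req.get_header("If-modified-since"))  # Thu, 01 Jan 1970 00:00:00 GMT
```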
Testing
Running Tests
The test suite is divided into two categories:
Regular Tests (Default)
# Run all tests (skips live tests by default)
pytest
# Run specific test file
pytest tests/test_config.py
# Run with verbose output
pytest -v
Live Tests
Live tests make real API calls to OpenAI and require a valid API key. They are skipped by default.
# Run only live tests (requires OPENAI_API_KEY)
pytest -m live
# Run all tests including live tests
pytest -m "live or not live"
# Run specific live test
pytest tests/test_responses_annotations.py -m live
Live tests include:
- Tests that make actual OpenAI API calls
- Tests that run ostruct commands via subprocess
- Tests that verify real API behavior and file downloads
Requirements for live tests:
- Valid OPENAI_API_KEY environment variable
- Internet connection
- May incur API costs
Test Markers
- @pytest.mark.live - Tests that make real API calls or run actual commands
- @pytest.mark.no_fs - Tests that need real filesystem (not pyfakefs)
- @pytest.mark.slow - Performance/stress tests
- @pytest.mark.flaky - Tests that may need reruns
- @pytest.mark.mock_openai - Tests using mocked OpenAI client