
Zen

by github.com/beehiveinnovations · Devops

0.0 · 0 reviews
0 installs · 18 tools

Bridges multiple AI models and CLIs, enabling orchestrated workflows across Claude Code, Gemini CLI, Codex CLI, and other AI development tools.

PAL MCP: Many Workflows. One Context.

Your AI's PAL – a Provider Abstraction Layer
Formerly known as Zen MCP

[PAL in action](https://github.com/user-attachments/assets/0d26061e-5f21-4ab1-b7d0-f883ddc2c3da)

👉 **[Watch more examples](#-watch-tools-in-action)**

### Your CLI + Multiple Models = Your AI Dev Team

**Use the 🤖 CLI you love:** [Claude Code](https://www.anthropic.com/claude-code) · [Gemini CLI](https://github.com/google-gemini/gemini-cli) · [Codex CLI](https://github.com/openai/codex) · [Qwen Code CLI](https://qwenlm.github.io/qwen-code-docs/) · [Cursor](https://cursor.com) · _and more_

**With multiple models within a single prompt:** Gemini · OpenAI · Anthropic · Grok · Azure · Ollama · OpenRouter · DIAL · On-Device Model

🆕 Now with CLI-to-CLI Bridge

The new clink (CLI + Link) tool connects external AI CLIs directly into your workflow:

  • Connect external CLIs like Gemini CLI, Codex CLI, and Claude Code directly into your workflow
  • CLI Subagents - Launch isolated CLI instances from within your current CLI! Claude Code can spawn Codex subagents, Codex can spawn Gemini CLI subagents, etc. Offload heavy tasks (code reviews, bug hunting) to fresh contexts while your main session's context window remains unpolluted. Each subagent returns only final results.
  • Context Isolation - Run separate investigations without polluting your primary workspace
  • Role Specialization - Spawn planner, codereviewer, or custom role agents with specialized system prompts
  • Full CLI Capabilities - Web search, file inspection, MCP tool access, latest documentation lookups
  • Seamless Continuity - Sub-CLIs participate as first-class members with full conversation context between tools
# Codex spawns Codex subagent for isolated code review in fresh context
clink with codex codereviewer to audit auth module for security issues
# The subagent reviews in isolation, reading each file and walking the directory structure in its own context, then returns only the final report without cluttering yours

# Consensus from different AI models → Implementation handoff with full context preservation between tools
Use consensus with gpt-5 and gemini-pro to decide: dark mode or offline support next
Continue with clink gemini - implement the recommended feature
# Gemini receives full debate context and starts coding immediately

👉 Learn more about clink


Why PAL MCP?

Why rely on one AI model when you can orchestrate them all?

A Model Context Protocol server that supercharges tools like Claude Code, Codex CLI, and IDE clients such as Cursor or the Claude Dev VS Code extension. PAL MCP connects your favorite AI tool to multiple AI models for enhanced code analysis, problem-solving, and collaborative development.

True AI Collaboration with Conversation Continuity

PAL supports conversation threading so your CLI can discuss ideas with multiple AI models, exchange reasoning, get second opinions, and even run collaborative debates between models to help you reach deeper insights and better solutions.

Your CLI always stays in control but gets perspectives from the best AI for each subtask. Context carries forward seamlessly across tools and models, enabling complex workflows like: code reviews with multiple models → automated planning → implementation → pre-commit validation.

You're in control. Your CLI of choice orchestrates the AI team, but you decide the workflow. Craft powerful prompts that bring in Gemini Pro, GPT-5, Flash, or local offline models exactly when needed.

Reasons to Use PAL MCP

A typical workflow, with Claude Code as an example:

1. **Multi-Model Orchestration** - Claude coordinates with Gemini Pro, O3, GPT-5, and 50+ other models to get the best analysis for each task
2. **Context Revival Magic** - Even after Claude's context resets, continue conversations seamlessly by having other models "remind" Claude of the discussion
3. **Guided Workflows** - Enforces systematic investigation phases that prevent rushed analysis and ensure thorough code examination
4. **Extended Context Windows** - Break Claude's limits by delegating to Gemini (1M tokens) or O3 (200K tokens) for massive codebases
5. **True Conversation Continuity** - Full context flows across tools and models - Gemini remembers what O3 said 10 steps ago
6. **Model-Specific Strengths** - Extended thinking with Gemini Pro, blazing speed with Flash, strong reasoning with O3, privacy with local Ollama
7. **Professional Code Reviews** - Multi-pass analysis with severity levels, actionable feedback, and consensus from multiple AI experts
8. **Smart Debugging Assistant** - Systematic root cause analysis with hypothesis tracking and confidence levels
9. **Automatic Model Selection** - Claude intelligently picks the right model for each subtask (or you can specify)
10. **Vision Capabilities** - Analyze screenshots, diagrams, and visual content with vision-enabled models
11. **Local Model Support** - Run Llama, Mistral, or other models locally for complete privacy and zero API costs
12. **Bypass MCP Token Limits** - Automatically works around MCP's 25K limit for large prompts and responses

**The Killer Feature:** When Claude's context resets, just ask to "continue with O3" - the other model's response magically revives Claude's understanding without re-ingesting documents!

#### Example: Multi-Model Code Review Workflow

1. `Perform a codereview using gemini pro and o3 and use planner to generate a detailed plan, implement the fixes and do a final precommit check by continuing from the previous codereview`
2. This triggers a [`codereview`](docs/tools/codereview.md) workflow where Claude walks the code, looking for all kinds of issues
3. After multiple passes, Claude collects relevant code and makes note of issues along the way
4. Maintains a `confidence` level between `exploring`, `low`, `medium`, `high` and `certain` to track how confidently it has been able to find and identify issues
5. Generates a detailed list of critical -> low issues
6. Shares the relevant files, findings, etc. with **Gemini Pro** to perform a deep dive for a second [`codereview`](docs/tools/codereview.md)
7. Comes back with a response and then does the same with O3, adding to the prompt if a new discovery comes to light
8. When done, Claude takes in all the feedback and compiles a single list of all critical -> low issues, including good patterns in your code. The final list includes new findings or revisions in case Claude misunderstood or missed something crucial and one of the other models pointed this out
9. It then uses the [`planner`](docs/tools/planner.md) workflow to break the work down into simpler steps if a major refactor is required
10. Claude then performs the actual work of fixing highlighted issues
11. When done, Claude returns to Gemini Pro for a [`precommit`](docs/tools/precommit.md) review

All within a single conversation thread! Gemini Pro in step 11 _knows_ what was recommended by O3 in step 7, and takes that context and review into consideration to aid its final pre-commit review.

**Think of it as Claude Code _for_ Claude Code.** This MCP isn't magic. It's just **super-glue**.

> **Remember:** Claude stays in full control — but **YOU** call the shots.
> PAL is designed to have Claude engage other models only when needed — and to follow through with meaningful back-and-forth.
> **You're** the one who crafts the powerful prompt that makes Claude bring in Gemini, Flash, O3 — or fly solo.
> You're the guide. The prompter. The puppeteer.
>
> #### You are the AI - **Actually Intelligent**.

Recommended AI Stack

For Claude Code Users

For best results when using [Claude Code](https://claude.ai/code):

- **Sonnet 4.5** - All agentic work and orchestration
- **Gemini 3.0 Pro** OR **GPT-5.2 / Pro** - Deep thinking, additional code reviews, debugging and validations, pre-commit analysis

For Codex Users

For best results when using [Codex CLI](https://developers.openai.com/codex/cli):

- **GPT-5.2 Codex Medium** - All agentic work and orchestration
- **Gemini 3.0 Pro** OR **GPT-5.2-Pro** - Deep thinking, additional code reviews, debugging and validations, pre-commit analysis

Quick Start (5 minutes)

Prerequisites: Python 3.10+, Git, uv installed

1. Get API Keys (choose one or more):
   - OpenRouter - Access multiple models with one API
   - Gemini - Google's latest models
   - OpenAI - O3, GPT-5 series
   - Azure OpenAI - Enterprise deployments of GPT-4o, GPT-4.1, GPT-5 family
   - X.AI - Grok models
   - DIAL - Vendor-agnostic model access
   - Ollama - Local models (free)

2. Install (choose one):

Option A: Clone and Automatic Setup (recommended)

git clone https://github.com/BeehiveInnovations/pal-mcp-server.git
cd pal-mcp-server

# Handles everything: setup, config, API keys from system environment. 
# Auto-configures Claude Desktop, Claude Code, Gemini CLI, Codex CLI, Qwen CLI
# Enable / disable additional settings in .env
./run-server.sh  

Option B: Instant Setup with uvx

// Add to ~/.claude/settings.json or .mcp.json
// Don't forget to add your API keys under env
{
  "mcpServers": {
    "pal": {
      "command": "bash",
      "args": ["-c", "for p in $(which uvx 2>/dev/null) $HOME/.local/bin/uvx /opt/homebrew/bin/uvx /usr/local/bin/uvx uvx; do [ -x \"$p\" ] && exec \"$p\" --from git+https://github.com/BeehiveInnovations/pal-mcp-server.git pal-mcp-server; done; echo 'uvx not found' >&2; exit 1"],
      "env": {
        "PATH": "/usr/local/bin:/usr/bin:/bin:/opt/homebrew/bin:~/.local/bin",
        "GEMINI_API_KEY": "your-key-here",
        "DISABLED_TOOLS": "analyze,refactor,testgen,secaudit,docgen,tracer",
        "DEFAULT_MODEL": "auto"
      }
    }
  }
}

3. Start Using!

"Use pal to analyze this code for security issues with gemini pro"
"Debug this error with o3 and then get flash to suggest optimizations"
"Plan the migration strategy with pal, get consensus from multiple models"
"clink with cli_name=\"gemini\" role=\"planner\" to draft a phased rollout plan"

👉 Complete Setup Guide with detailed installation, configuration for Gemini / Codex / Qwen, and troubleshooting

👉 Cursor & VS Code Setup for IDE integration instructions

📺 Watch tools in action to see real-world examples

Provider Configuration

PAL activates any provider that has credentials in your .env. See .env.example for deeper customization.
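As a minimal sketch of such a .env (the variable names below match those used in the MCP settings examples elsewhere in this README; the key values are placeholders, and any provider without a key simply stays inactive):

```shell
# .env - set keys only for the providers you want active
GEMINI_API_KEY=your-gemini-key          # activates the Gemini provider
OPENAI_API_KEY=your-openai-key          # activates OpenAI models
OPENROUTER_API_KEY=your-openrouter-key  # activates OpenRouter access
DEFAULT_MODEL=auto                      # let your CLI pick a model per task
```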

Core Tools

Note: Each tool comes with its own multi-step workflow, parameters, and descriptions that consume valuable context window space even when not in use. To optimize performance, some tools are disabled by default. See Tool Configuration below to enable them.

Collaboration & Planning (Enabled by default)

- clink - Bridge requests to external AI CLIs (Gemini planner, codereviewer, etc.)
- chat - Brainstorm ideas, get second opinions, validate approaches. With capable models (GPT-5.2 Pro, Gemini 3.0 Pro), generates complete code / implementation
- thinkdeep - Extended reasoning, edge case analysis, alternative perspectives
- planner - Break down complex projects into structured, actionable plans
- consensus - Get expert opinions from multiple AI models with stance steering

Code Analysis & Quality

- debug - Systematic investigation and root cause analysis
- precommit - Validate changes before committing, prevent regressions
- codereview - Professional reviews with severity levels and actionable feedback
- analyze (disabled by default - enable) - Understand architecture, patterns, dependencies across entire codebases

Development Tools (Disabled by default - enable)

- refactor - Intelligent code refactoring with decomposition focus
- testgen - Comprehensive test generation with edge cases
- secaudit - Security audits with OWASP Top 10 analysis
- docgen - Generate documentation with complexity analysis

Utilities

- apilookup - Forces current-year API/SDK documentation lookups in a sub-process (saves tokens within the current context window), preventing responses based on outdated training data
- challenge - Prevent "You're absolutely right!" responses with critical analysis
- tracer (disabled by default - enable) - Static analysis prompts for call-flow mapping

👉 Tool Configuration

### Default Configuration

To optimize context window usage, only essential tools are enabled by default:

**Enabled by default:**

- `chat`, `thinkdeep`, `planner`, `consensus` - Core collaboration tools
- `codereview`, `precommit`, `debug` - Essential code quality tools
- `apilookup` - Rapid API/SDK information lookup
- `challenge` - Critical thinking utility

**Disabled by default:**

- `analyze`, `refactor`, `testgen`, `secaudit`, `docgen`, `tracer`

### Enabling Additional Tools

To enable additional tools, remove them from the `DISABLED_TOOLS` list:

**Option 1: Edit your .env file**
# Default configuration (from .env.example)
DISABLED_TOOLS=analyze,refactor,testgen,secaudit,docgen,tracer

# To enable specific tools, remove them from the list
# Example: Enable analyze tool
DISABLED_TOOLS=refactor,testgen,secaudit,docgen,tracer

# To enable ALL tools
DISABLED_TOOLS=
**Option 2: Configure in MCP settings**
// In ~/.claude/settings.json or .mcp.json
{
  "mcpServers": {
    "pal": {
      "env": {
        // Tool configuration
        "DISABLED_TOOLS": "refactor,testgen,secaudit,docgen,tracer",
        "DEFAULT_MODEL": "pro",
        "DEFAULT_THINKING_MODE_THINKDEEP": "high",

        // API configuration
        "GEMINI_API_KEY": "your-gemini-key",
        "OPENAI_API_KEY": "your-openai-key",
        "OPENROUTER_API_KEY": "your-openrouter-key",

        // Logging and performance
        "LOG_LEVEL": "INFO",
        "CONVERSATION_TIMEOUT_HOURS": "6",
        "MAX_CONVERSATION_TURNS": "50"
      }
    }
  }
}
**Option 3: Enable all tools**
// Remove or empty the DISABLED_TOOLS to enable everything
{
  "mcpServers": {
    "pal": {
      "env": {
        "DISABLED_TOOLS": ""
      }
    }
  }
}
**Note:**

- Essential tools (`version`, `listmodels`) cannot be disabled
- After changing tool configuration, restart your Claude session for changes to take effect
- Each tool adds to context window usage, so only enable what you need

📺 Watch Tools In Action

Chat Tool - Collaborative decision making and multi-turn conversations

**Picking Redis vs Memcached:** [Chat Redis or Memcached_web.webm](https://github.com/user-attachments/assets/41076cfe-dd49-4dfc-82f5-d7461b34705d)

**Multi-turn conversation with continuation:** [Chat With Gemini_web.webm](https://github.com/user-attachments/assets/37bd57ca-e8a6-42f7-b5fb-11de271e95db)

Consensus Tool - Multi-model debate and decision making

**Multi-model consensus debate:** [PAL Consensus Debate](https://github.com/user-attachments/assets/76a23dd5-887a-4382-9cf0-642f5cf6219e)

PreCommit Tool - Comprehensive change validation

**Pre-commit validation workflow:**

API Lookup Tool - Current vs outdated API documentation

**Without PAL - outdated APIs:** [API without PAL](https://github.com/user-attachments/assets/01a79dc9-ad16-4264-9ce1-76a56c3580ee)

**With PAL - current APIs:** [API with PAL](https://github.com/user-attachments/assets/5c847326-4b66-41f7-8f30-f380453dce22)

Challenge Tool - Critical thinking vs reflexive agreement

**Without PAL:** ![without_pal@2x](https://github.com/user-attachments/assets/64f3c9fb-7ca9-4876-b687-25e847edfd87)

**With PAL:** ![with_pal@2x](https://github.com/user-attachments/assets/9d72f444-ba53-4ab1-83e5-250062c6ee70)

Key Features

AI Orchestration

- Auto model selection - Claude picks the right AI for each task
- Multi-model workflows - Chain different models in single conversations
- Conversation continuity - Context preserved across tools and models
- Context revival - Continue conversations even after context resets

Model Support

- Multiple providers - Gemini, OpenAI, Azure, X.AI, OpenRouter, DIAL, Ollama
- Latest models - GPT-5, Gemini 3.0 Pro, O3, Grok-4, local Llama
- Thinking modes - Control reasoning depth vs cost
- Vision support - Analyze images, diagrams, screenshots

Developer Experience

- Guided workflows - Systematic investigation prevents rushed analysis
- Smart file handling - Auto-expand directories, manage token limits
- Web search integration - Access current documentation and best practices
- Large prompt support - Bypass MCP's 25K token limit

Example Workflows

Multi-model Code Review:

"Perform a codereview using gemini pro and o3, then use planner to create a fix strategy"

→ Claude reviews code systematically → Consults Gemini Pro → Gets O3's perspective → Creates unified action plan

Collaborative Debugging:

"Debug this race condition with max thinking mode, then validate the fix with precommit"

→ Deep investigation → Expert analysis → Solution implementation → Pre-commit validation

Architecture Planning:

"Plan our microservices migration, get consensus from pro and o3 on the approach"

→ Structured planning → Multiple expert opinions → Consensus building → Implementation roadmap

👉 Advanced Usage Guide for complex workflows, model configuration, and power-user features

Quick Links

📖 Documentation

- Docs Overview - High-level map of major guides
- Getting Started - Complete setup guide
- Tools Reference - All tools with examples
- Advanced Usage - Power user features
- Configuration - Environment variables, restrictions
- Adding Providers - Provider-specific setup (OpenAI, Azure, custom gateways)
- Model Ranking Guide - How intelligence scores drive auto-mode suggestions

🔧 Setup & Support

- WSL Setup - Windows users
- Troubleshooting - Common issues
- Contributing - Code standards, PR process

License

Apache 2.0 License - see LICENSE file for details.

Acknowledgments

Built with the power of Multi-Model AI collaboration 🤝

- Actual Intelligence by real Humans
- MCP (Model Context Protocol)
- Codex CLI
- Claude Code
- Gemini
- OpenAI
- Azure OpenAI

Star History

Star History Chart

chat General chat and collaborative thinking partner for brainstorming, development discussion, getting second opinions, and exploring ideas. Use for ideas, validations, questions, and thoughtful explanations.

Parameters

prompt Your question or idea for collaborative thinking to be sent to the external model. Provide detailed context, including your goal, what you've tried, and any specific challenges. WARNING: Large inline code must NOT be shared in prompt. Provide full-path to files on disk as separate parameter. required
absolute_file_paths Full, absolute file paths to relevant code to share with the external model
images Image paths (absolute) or base64 strings for optional visual context.
working_directory_absolute_path Absolute path to an existing directory where generated code artifacts can be saved. required
model Currently in auto model selection mode. CRITICAL: When the user names a model, you MUST use that exact name unless the server rejects it. If no model is provided, you may use the `listmodels` tool to review options and select an appropriate match. Top models: gpt-5.2 (score 100, 400K ctx, thinking, code-gen); gpt-5.1-codex (score 100, 400K ctx, thinking, code-gen); gemini-2.5-pro (score 100, 1.0M ctx, thinking, code-gen); gemini-3-pro-preview (score 100, 1.0M ctx, thinking, code-gen); gpt-5.2-pro (score 100, 400K ctx, thinking, code-gen); +26 more via `listmodels`. required
temperature 0 = deterministic · 1 = creative.
thinking_mode Reasoning depth: minimal, low, medium, high, or max.
continuation_id Unique thread continuation ID for multi-turn conversations. Works across different tools. ALWAYS reuse the last continuation_id you were given—this preserves full conversation context, files, and findings so the agent can resume seamlessly.
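Pulled together from the parameter list above, a chat invocation's arguments might look like the following sketch (the prompt, file path, and directory are illustrative placeholders; only `prompt`, `working_directory_absolute_path`, and `model` are required):

```json
{
  "prompt": "Compare Redis and Memcached for our session cache; we need TTL eviction and roughly 50K ops/sec.",
  "absolute_file_paths": ["/home/dev/project/src/cache/session_store.py"],
  "working_directory_absolute_path": "/home/dev/project",
  "model": "gemini-2.5-pro",
  "temperature": 0.4,
  "thinking_mode": "medium"
}
```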
clink Link a request to an external AI CLI (Gemini CLI, Qwen CLI, etc.) through PAL MCP to reuse their capabilities inside existing workflows.

Parameters

prompt User request forwarded to the CLI (conversation context is pre-applied). required
cli_name Configured CLI client name (from conf/cli_clients). Available: claude, codex, gemini required
role Optional role preset defined for the selected CLI (defaults to 'default'). Roles per CLI: claude: codereviewer, default, planner; codex: codereviewer, default, planner; gemini: codereviewer, default, planner
absolute_file_paths Full paths to relevant code
images Optional absolute image paths or base64 blobs for visual context.
continuation_id Unique thread continuation ID for multi-turn conversations. Works across different tools. ALWAYS reuse the last continuation_id you were given—this preserves full conversation context, files, and findings so the agent can resume seamlessly.
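Mirroring the clink prompt example earlier in this README, the corresponding tool arguments might look like this sketch (the prompt text is illustrative):

```json
{
  "prompt": "Draft a phased rollout plan for the new billing service.",
  "cli_name": "gemini",
  "role": "planner"
}
```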
thinkdeep Performs multi-stage investigation and reasoning for complex problem analysis. Use for architecture decisions, complex bugs, performance challenges, and security analysis. Provides systematic hypothesis testing, evidence-based investigation, and expert validation.

Parameters

step Current work step content and findings from your overall work required
step_number Current step number in work sequence (starts at 1) required
total_steps Estimated total steps needed to complete work required
next_step_required Whether another work step is needed. When false, aim to reduce total_steps to match step_number to avoid mismatch. required
findings Important findings, evidence and insights discovered in this step required
files_checked List of files examined during this work step
relevant_files Files identified as relevant to issue/goal (FULL absolute paths to real files/folders - DO NOT SHORTEN)
relevant_context Methods/functions identified as involved in the issue
issues_found Issues identified with severity levels during work
confidence Confidence level: exploring (just starting), low (early investigation), medium (some evidence), high (strong evidence), very_high (comprehensive understanding), almost_certain (near complete confidence), certain (100% confidence locally - no external validation needed)
hypothesis Current theory about issue/goal based on work
use_assistant_model Use assistant model for expert analysis after workflow steps. False skips expert analysis, relies solely on your personal investigation. Defaults to True for comprehensive validation.
temperature 0 = deterministic · 1 = creative.
thinking_mode Reasoning depth: minimal, low, medium, high, or max.
continuation_id Unique thread continuation ID for multi-turn conversations. Works across different tools. ALWAYS reuse the last continuation_id you were given—this preserves full conversation context, files, and findings so the agent can resume seamlessly.
images Optional absolute image paths or base64 blobs for visual context.
model Currently in auto model selection mode. CRITICAL: When the user names a model, you MUST use that exact name unless the server rejects it. If no model is provided, you may use the `listmodels` tool to review options and select an appropriate match. Top models: gpt-5.2 (score 100, 400K ctx, thinking, code-gen); gpt-5.1-codex (score 100, 400K ctx, thinking, code-gen); gemini-2.5-pro (score 100, 1.0M ctx, thinking, code-gen); gemini-3-pro-preview (score 100, 1.0M ctx, thinking, code-gen); gpt-5.2-pro (score 100, 400K ctx, thinking, code-gen); +26 more via `listmodels`. required
problem_context Additional context about problem/goal. Be expressive.
focus_areas Focus aspects (architecture, performance, security, etc.)
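Assembled from the parameters above, an opening thinkdeep step might look like this sketch (the investigation details and findings are hypothetical):

```json
{
  "step": "Reproduce the intermittent 502s and survey the proxy and upstream timeout settings.",
  "step_number": 1,
  "total_steps": 3,
  "next_step_required": true,
  "findings": "502s cluster around deploy windows; upstream keepalive is shorter than the proxy idle timeout.",
  "hypothesis": "Connection reuse races the upstream keepalive close.",
  "confidence": "low",
  "model": "gemini-2.5-pro"
}
```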
planner Breaks down complex tasks through interactive, sequential planning with revision and branching capabilities. Use for complex project planning, system design, migration strategies, and architectural decisions. Builds plans incrementally with deep reflection for complex scenarios.

Parameters

step Planning content for this step. Step 1: describe the task, problem and scope. Later steps: capture updates, revisions, branches, or open questions that shape the plan. required
step_number Current step number in work sequence (starts at 1) required
total_steps Estimated total steps needed to complete work required
next_step_required Whether another work step is needed. When false, aim to reduce total_steps to match step_number to avoid mismatch. required
use_assistant_model Use assistant model for expert analysis after workflow steps. False skips expert analysis, relies solely on your personal investigation. Defaults to True for comprehensive validation.
continuation_id Unique thread continuation ID for multi-turn conversations. Works across different tools. ALWAYS reuse the last continuation_id you were given—this preserves full conversation context, files, and findings so the agent can resume seamlessly.
model Currently in auto model selection mode. CRITICAL: When the user names a model, you MUST use that exact name unless the server rejects it. If no model is provided, you may use the `listmodels` tool to review options and select an appropriate match. Top models: gpt-5.2 (score 100, 400K ctx, thinking, code-gen); gpt-5.1-codex (score 100, 400K ctx, thinking, code-gen); gemini-2.5-pro (score 100, 1.0M ctx, thinking, code-gen); gemini-3-pro-preview (score 100, 1.0M ctx, thinking, code-gen); gpt-5.2-pro (score 100, 400K ctx, thinking, code-gen); +26 more via `listmodels`. required
is_step_revision Set true when you are replacing a previously recorded step.
revises_step_number Step number being replaced when revising.
is_branch_point True when this step creates a new branch to explore an alternative path.
branch_from_step If branching, the step number that this branch starts from.
branch_id Name for this branch (e.g. 'approach-A', 'migration-path').
more_steps_needed True when you now expect to add additional steps beyond the prior estimate.
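The branching fields combine as in this hypothetical mid-plan step, which forks an alternative path from step 2 (the step text and branch name are placeholders):

```json
{
  "step": "Explore an alternative: migrate the auth service first instead of billing.",
  "step_number": 4,
  "total_steps": 6,
  "next_step_required": true,
  "is_branch_point": true,
  "branch_from_step": 2,
  "branch_id": "auth-first",
  "model": "gemini-2.5-pro"
}
```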
consensus Builds multi-model consensus through systematic analysis and structured debate. Use for complex decisions, architectural choices, feature proposals, and technology evaluations. Consults multiple models with different stances to synthesize comprehensive recommendations.

Parameters

step Consensus prompt. Step 1: write the exact proposal/question every model will see (use 'Evaluate…', not meta commentary). Steps 2+: capture internal notes about the latest model response—these notes are NOT sent to other models. required
step_number Current step index (starts at 1). Step 1 is your analysis; steps 2+ handle each model response. required
total_steps Total steps = number of models consulted plus the final synthesis step. required
next_step_required True if more model consultations remain; set false when ready to synthesize. required
findings Step 1: your independent analysis for later synthesis (not shared with other models). Steps 2+: summarize the newest model response. required
relevant_files Optional supporting files that help the consensus analysis. Must be absolute full, non-abbreviated paths.
use_assistant_model Use assistant model for expert analysis after workflow steps. False skips expert analysis, relies solely on your personal investigation. Defaults to True for comprehensive validation.
continuation_id Unique thread continuation ID for multi-turn conversations. Works across different tools. ALWAYS reuse the last continuation_id you were given—this preserves full conversation context, files, and findings so the agent can resume seamlessly.
images Optional absolute image paths or base64 references that add helpful visual context.
models User-specified roster of models to consult (provide at least two entries). Each entry may include model, stance (for/against/neutral), and stance_prompt. Each (model, stance) pair must be unique, e.g. [{'model':'gpt5','stance':'for'}, {'model':'pro','stance':'against'}]. When the user names a model, you MUST use that exact value or report the provider error—never swap in another option. Use the `listmodels` tool for the full roster. Top models: gpt-5.2 (score 100, 400K ctx, thinking, code-gen); gpt-5.1-codex (score 100, 400K ctx, thinking, code-gen); gemini-2.5-pro (score 100, 1.0M ctx, thinking, code-gen); gemini-3-pro-preview (score 100, 1.0M ctx, thinking, code-gen); gpt-5.2-pro (score 100, 400K ctx, thinking, code-gen); +26 more via `listmodels`.
current_model_index 0-based index of the next model to consult (managed internally).
model_responses Internal log of responses gathered so far.
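The uniqueness rule on (model, stance) pairs described above can be sketched as a small validator (a hypothetical helper for illustration, not part of the server's API):

```python
def validate_consensus_roster(models):
    """Check a consensus `models` roster: at least two entries,
    stance one of for/against/neutral (defaulting to neutral),
    and each (model, stance) pair unique."""
    if len(models) < 2:
        raise ValueError("provide at least two entries")
    seen = set()
    for entry in models:
        stance = entry.get("stance", "neutral")
        if stance not in {"for", "against", "neutral"}:
            raise ValueError(f"invalid stance: {stance}")
        pair = (entry["model"], stance)
        if pair in seen:
            raise ValueError(f"duplicate (model, stance) pair: {pair}")
        seen.add(pair)
    return True

# The same model may appear twice as long as the stances differ:
roster = [
    {"model": "gpt5", "stance": "for"},
    {"model": "pro", "stance": "against"},
    {"model": "gpt5", "stance": "against"},
]
```

A roster repeating an identical (model, stance) pair would be rejected, while the one above passes.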
codereview Performs systematic, step-by-step code review with expert validation. Use for comprehensive analysis covering quality, security, performance, and architecture. Guides through structured investigation to ensure thoroughness.

Parameters

step Review narrative. Step 1: outline the review strategy. Later steps: report findings. MUST cover quality, security, performance, and architecture. Reference code via `relevant_files`; avoid dumping large snippets. required
step_number Current review step (starts at 1) – each step should build on the last. required
total_steps Number of review steps planned. External validation: two steps (analysis + summary). Internal validation: one step. Use the same limits when continuing an existing review via continuation_id. required
next_step_required True when another review step follows. External validation: step 1 → True, step 2 → False. Internal validation: set False immediately. Apply the same rule on continuation flows. required
findings Capture findings (positive and negative) across quality, security, performance, and architecture; update each step. required
files_checked Absolute paths of every file reviewed, including those ruled out.
relevant_files Step 1: list all files/dirs under review. Must be absolute full non-abbreviated paths. Final step: narrow to files tied to key findings.
relevant_context Methods/functions identified as involved in the issue
issues_found Issues with severity (critical/high/medium/low) and descriptions.
confidence Confidence level: exploring (just starting), low (early investigation), medium (some evidence), high (strong evidence), very_high (comprehensive understanding), almost_certain (near complete confidence), certain (100% confidence locally - no external validation needed)
hypothesis Current theory about issue/goal based on work
use_assistant_model Use assistant model for expert analysis after workflow steps. False skips expert analysis, relies solely on your personal investigation. Defaults to True for comprehensive validation.
temperature 0 = deterministic · 1 = creative.
thinking_mode Reasoning depth: minimal, low, medium, high, or max.
continuation_id Unique thread continuation ID for multi-turn conversations. Works across different tools. ALWAYS reuse the last continuation_id you were given—this preserves full conversation context, files, and findings so the agent can resume seamlessly.
images Optional diagram or screenshot paths that clarify review context.
model Currently in auto model selection mode. CRITICAL: When the user names a model, you MUST use that exact name unless the server rejects it. If no model is provided, you may use the `listmodels` tool to review options and select an appropriate match. Top models: gpt-5.2 (score 100, 400K ctx, thinking, code-gen); gpt-5.1-codex (score 100, 400K ctx, thinking, code-gen); gemini-2.5-pro (score 100, 1.0M ctx, thinking, code-gen); gemini-3-pro-preview (score 100, 1.0M ctx, thinking, code-gen); gpt-5.2-pro (score 100, 400K ctx, thinking, code-gen); +26 more via `listmodels`. required
review_validation_type Set 'external' (default) for expert follow-up or 'internal' for local-only review.
review_type Review focus: full, security, performance, or quick.
focus_on Optional note on areas to emphasise (e.g. 'threading', 'auth flow').
standards Coding standards or style guides to enforce.
severity_filter Lowest severity to include when reporting issues (critical/high/medium/low/all).
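Following the two-step external-validation shape described above, an opening codereview step might look like this sketch (the path, findings, and focus are placeholders):

```json
{
  "step": "Map the review scope: auth module entry points, session handling, and input validation.",
  "step_number": 1,
  "total_steps": 2,
  "next_step_required": true,
  "findings": "Initial pass pending; covering quality, security, performance, and architecture.",
  "relevant_files": ["/home/dev/project/src/auth"],
  "review_type": "security",
  "model": "gpt-5.2"
}
```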
precommit Validates git changes and repository state before committing with systematic analysis. Use for multi-repository validation, security review, change impact assessment, and completeness verification. Guides through structured investigation with expert analysis.

Parameters

step Step 1: outline how you'll validate the git changes. Later steps: report findings. Review diffs and impacts, use `relevant_files`, and avoid pasting large snippets. required
step_number Current pre-commit step number (starts at 1). required
total_steps Planned number of validation steps. External validation: use at most three (analysis → follow-ups → summary). Internal validation: a single step. Honor these limits when resuming via continuation_id. required
next_step_required True to continue with another step, False when validation is complete. CRITICAL: if total_steps >= 3 or `precommit_type` is 'external', keep this True until the final step. When continuation_id is provided, follow the same validation rules based on precommit_type. required
findings Record git diff insights, risks, missing tests, security concerns, and positives; update previous notes as you go. required
files_checked Absolute paths for every file examined, including ruled-out candidates.
relevant_files Absolute paths of files involved in the change or validation (code, configs, tests, docs). Must be absolute full non-abbreviated paths.
relevant_context Methods/functions identified as involved in the issue
issues_found List issues with severity (critical/high/medium/low) plus descriptions (bugs, security, performance, coverage).
confidence Confidence level: exploring (just starting), low (early investigation), medium (some evidence), high (strong evidence), very_high (comprehensive understanding), almost_certain (near complete confidence), certain (100% confidence locally - no external validation needed)
hypothesis Current theory about issue/goal based on work
use_assistant_model Use assistant model for expert analysis after workflow steps. False skips expert analysis, relies solely on your personal investigation. Defaults to True for comprehensive validation.
temperature 0 = deterministic · 1 = creative.
thinking_mode Reasoning depth: minimal, low, medium, high, or max.
continuation_id Unique thread continuation ID for multi-turn conversations. Works across different tools. ALWAYS reuse the last continuation_id you were given—this preserves full conversation context, files, and findings so the agent can resume seamlessly.
images Optional absolute paths to screenshots or diagrams that aid validation.
model Currently in auto model selection mode. CRITICAL: When the user names a model, you MUST use that exact name unless the server rejects it. If no model is provided, you may use the `listmodels` tool to review options and select an appropriate match. Top models: gpt-5.2 (score 100, 400K ctx, thinking, code-gen); gpt-5.1-codex (score 100, 400K ctx, thinking, code-gen); gemini-2.5-pro (score 100, 1.0M ctx, thinking, code-gen); gemini-3-pro-preview (score 100, 1.0M ctx, thinking, code-gen); gpt-5.2-pro (score 100, 400K ctx, thinking, code-gen); +26 more via `listmodels`. required
precommit_type 'external' (default, triggers expert model) or 'internal' (local-only validation).
path Absolute path to the repository root. Required in step 1.
compare_to Optional git ref (branch/tag/commit) to diff against; falls back to staged/unstaged changes.
include_staged Whether to inspect staged changes (ignored when `compare_to` is set).
include_unstaged Whether to inspect unstaged changes (ignored when `compare_to` is set).
focus_on Optional emphasis areas such as security, performance, or test coverage.
severity_filter Lowest severity to include when reporting issues.
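The staged/unstaged flags interact with `compare_to` as documented above: when a git ref is supplied, both flags are ignored. A minimal sketch of that precedence (a hypothetical helper, not the server's actual code):

```python
def resolve_diff_source(compare_to=None, include_staged=True, include_unstaged=True):
    """Decide what to diff, mirroring the documented precedence:
    an explicit `compare_to` ref wins; otherwise the staged/unstaged flags apply."""
    if compare_to:
        # include_staged / include_unstaged are ignored when compare_to is set
        return {"mode": "ref", "ref": compare_to}
    return {"mode": "worktree", "staged": include_staged, "unstaged": include_unstaged}

# With a ref, the flags are irrelevant.
assert resolve_diff_source("main", include_staged=False) == {"mode": "ref", "ref": "main"}
# Without one, the flags drive what is inspected.
assert resolve_diff_source()["staged"] is True
```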
debug Performs systematic debugging and root cause analysis for any type of issue. Use for complex bugs, mysterious errors, performance issues, race conditions, memory leaks, and integration problems. Guides through structured investigation with hypothesis testing and expert analysis.

Parameters

step Investigation step. Step 1: state the issue and your investigative direction. Symptoms may be misleading; 'no bug found' is a valid outcome. Trace dependencies and verify hypotheses. Use `relevant_files` for code references; keep this field text-only. required
step_number Current step index (starts at 1). Build upon previous steps. required
total_steps Estimated total steps needed to complete the investigation. Adjust as new findings emerge. IMPORTANT: When continuation_id is provided (continuing a previous conversation), set this to 1 as we're not starting a new multi-step investigation. required
next_step_required True if you plan to continue the investigation with another step. False means root cause is known or investigation is complete. IMPORTANT: When continuation_id is provided (continuing a previous conversation), set this to False to immediately proceed with expert analysis. required
findings Discoveries: clues, code/log evidence, disproven theories. Be specific. If no bug is found, document that clearly as a valid outcome. required
files_checked All examined files (absolute paths), including ruled-out ones.
relevant_files Files directly relevant to issue (absolute paths). Cause, trigger, or manifestation locations.
relevant_context Methods/functions identified as involved in the issue
issues_found Issues identified with severity levels during work
confidence Your confidence in the hypothesis: exploring (starting out), low (early idea), medium (some evidence), high (strong evidence), very_high (very strong evidence), almost_certain (nearly confirmed), certain (100% confidence - root cause and fix are both confirmed locally with no need for external validation). WARNING: Do NOT use 'certain' unless the issue can be fully resolved with a fix, use 'very_high' or 'almost_certain' instead when not 100% sure. Using 'certain' means you have ABSOLUTE confidence locally and PREVENTS external model validation.
hypothesis Concrete root cause theory from evidence. Can revise. Valid: 'No bug found - user misunderstanding' or 'Symptoms unrelated to code' if supported.
use_assistant_model Use assistant model for expert analysis after workflow steps. False skips expert analysis, relies solely on your personal investigation. Defaults to True for comprehensive validation.
temperature 0 = deterministic · 1 = creative.
thinking_mode Reasoning depth: minimal, low, medium, high, or max.
continuation_id Unique thread continuation ID for multi-turn conversations. Works across different tools. ALWAYS reuse the last continuation_id you were given—this preserves full conversation context, files, and findings so the agent can resume seamlessly.
images Optional screenshots/visuals clarifying issue (absolute paths).
model Currently in auto model selection mode. CRITICAL: When the user names a model, you MUST use that exact name unless the server rejects it. If no model is provided, you may use the `listmodels` tool to review options and select an appropriate match. Top models: gpt-5.2 (score 100, 400K ctx, thinking, code-gen); gpt-5.1-codex (score 100, 400K ctx, thinking, code-gen); gemini-2.5-pro (score 100, 1.0M ctx, thinking, code-gen); gemini-3-pro-preview (score 100, 1.0M ctx, thinking, code-gen); gpt-5.2-pro (score 100, 400K ctx, thinking, code-gen); +26 more via `listmodels`. required
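Two of the fields above change meaning when resuming a conversation: with a `continuation_id`, the schema says to set `total_steps` to 1 and `next_step_required` to False so expert analysis proceeds immediately. A hypothetical payload builder encoding that rule:

```python
def build_debug_step(step, findings, continuation_id=None, total_steps=3, more_steps=True):
    """Assemble a debug-tool payload; a hypothetical helper illustrating the
    documented continuation rule (total_steps=1, next_step_required=False)."""
    if continuation_id is not None:
        total_steps, more_steps = 1, False  # resume: go straight to expert analysis
    return {
        "step": step,
        "step_number": 1,
        "total_steps": total_steps,
        "next_step_required": more_steps,
        "findings": findings,
        "continuation_id": continuation_id,
    }

resumed = build_debug_step("Re-check race in worker pool", "Lock ordering suspect",
                           continuation_id="thread-123")
assert resumed["total_steps"] == 1 and resumed["next_step_required"] is False
```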
secaudit Performs comprehensive security audit with systematic vulnerability assessment. Use for OWASP Top 10 analysis, compliance evaluation, threat modeling, and security architecture review. Guides through structured security investigation with expert validation.

Parameters

step Step 1: outline the audit strategy (OWASP Top 10, auth, validation, etc.). Later steps: report findings. MANDATORY: use `relevant_files` for code references and avoid large snippets. required
step_number Current security-audit step number (starts at 1). required
total_steps Expected number of audit steps; adjust as new risks surface. required
next_step_required True while additional threat analysis remains; set False once you are ready to hand off for validation. required
findings Summarize vulnerabilities, auth issues, validation gaps, compliance notes, and positives; update prior findings as needed. required
files_checked Absolute paths for every file inspected, including rejected candidates.
relevant_files Absolute paths for security-relevant files (auth modules, configs, sensitive code).
relevant_context Methods/functions identified as involved in the issue
issues_found Security issues with severity (critical/high/medium/low) and descriptions (vulns, auth flaws, injection, crypto, config).
confidence exploring/low/medium/high/very_high/almost_certain/certain. 'certain' blocks external validation—use only when fully complete.
hypothesis Current theory about issue/goal based on work
use_assistant_model Use assistant model for expert analysis after workflow steps. False skips expert analysis, relies solely on your personal investigation. Defaults to True for comprehensive validation.
temperature 0 = deterministic · 1 = creative.
thinking_mode Reasoning depth: minimal, low, medium, high, or max.
continuation_id Unique thread continuation ID for multi-turn conversations. Works across different tools. ALWAYS reuse the last continuation_id you were given—this preserves full conversation context, files, and findings so the agent can resume seamlessly.
images Optional absolute paths to diagrams or threat models that inform the audit.
model Currently in auto model selection mode. CRITICAL: When the user names a model, you MUST use that exact name unless the server rejects it. If no model is provided, you may use the `listmodels` tool to review options and select an appropriate match. Top models: gpt-5.2 (score 100, 400K ctx, thinking, code-gen); gpt-5.1-codex (score 100, 400K ctx, thinking, code-gen); gemini-2.5-pro (score 100, 1.0M ctx, thinking, code-gen); gemini-3-pro-preview (score 100, 1.0M ctx, thinking, code-gen); gpt-5.2-pro (score 100, 400K ctx, thinking, code-gen); +26 more via `listmodels`. required
security_scope Security context (web, mobile, API, cloud, etc.) including stack, user types, data sensitivity, and threat landscape.
threat_level Assess the threat level: low (internal/low-risk), medium (customer-facing/business data), high (regulated or sensitive), critical (financial/healthcare/PII).
compliance_requirements Applicable compliance frameworks or standards (SOC2, PCI DSS, HIPAA, GDPR, ISO 27001, NIST, etc.).
audit_focus Primary focus area: owasp, compliance, infrastructure, dependencies, or comprehensive.
severity_filter Minimum severity to include when reporting security issues.
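A hypothetical opening step for a security audit, using only the parameter names documented above; the scope and compliance values are illustrative:

```python
# Hypothetical first-step payload for the secaudit workflow.
audit_step = {
    "step": "Audit strategy: OWASP Top 10 sweep, then auth and input validation.",
    "step_number": 1,
    "total_steps": 4,
    "next_step_required": True,
    "findings": "Scoping only; no vulnerabilities assessed yet.",
    "security_scope": "public REST API, Python backend, handles customer PII",
    "threat_level": "high",    # low | medium | high | critical
    "compliance_requirements": ["SOC2", "GDPR"],
    "audit_focus": "owasp",    # owasp | compliance | infrastructure | dependencies | comprehensive
    "severity_filter": "medium",
}

assert audit_step["threat_level"] in {"low", "medium", "high", "critical"}
assert audit_step["audit_focus"] in {
    "owasp", "compliance", "infrastructure", "dependencies", "comprehensive"
}
```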
docgen Generates comprehensive code documentation with systematic analysis of functions, classes, and complexity. Use for documentation generation, code analysis, complexity assessment, and API documentation. Analyzes code structure and patterns to create thorough documentation.

Parameters

step Current work step content and findings from your overall work required
step_number Current step number in work sequence (starts at 1) required
total_steps Estimated total steps needed to complete work required
next_step_required Whether another work step is needed. When false, set total_steps equal to step_number to avoid a mismatch. required
findings Important findings, evidence and insights discovered in this step required
relevant_files Files identified as relevant to issue/goal (FULL absolute paths to real files/folders - DO NOT SHORTEN)
relevant_context Methods/functions identified as involved in the issue
issues_found Issues identified with severity levels during work
use_assistant_model Use assistant model for expert analysis after workflow steps. False skips expert analysis, relies solely on your personal investigation. Defaults to True for comprehensive validation.
continuation_id Unique thread continuation ID for multi-turn conversations. Works across different tools. ALWAYS reuse the last continuation_id you were given—this preserves full conversation context, files, and findings so the agent can resume seamlessly.
document_complexity Include algorithmic complexity (Big O) analysis when True (default). required
document_flow Include call flow/dependency notes when True (default). required
update_existing True (default) to polish inaccurate or outdated docs instead of leaving them untouched. required
comments_on_complex_logic True (default) to add inline comments around non-obvious logic. required
num_files_documented Count of files finished so far. Increment only when a file is fully documented. required
total_files_to_document Total files identified in discovery; completion requires matching this count. required
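The two counters above define the completion condition: documentation is finished only when every file identified during discovery has been fully documented. A minimal sketch of that invariant:

```python
def docgen_complete(num_files_documented, total_files_to_document):
    """Completion check implied by the counters above: the run can finish
    only when every discovered file has been fully documented."""
    return num_files_documented == total_files_to_document

assert not docgen_complete(3, 5)   # two files still undocumented
assert docgen_complete(5, 5)       # counts match: run can finish
```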
analyze Performs comprehensive code analysis with systematic investigation and expert validation. Use for architecture, performance, maintainability, and pattern analysis. Guides through structured code review and strategic planning.

Parameters

step The analysis plan. Step 1: State your strategy, including how you will map the codebase structure, understand business logic, and assess code quality, performance implications, and architectural patterns. Later steps: Report findings and adapt the approach as new insights emerge. required
step_number The index of the current step in the analysis sequence, beginning at 1. Each step should build upon or revise the previous one. required
total_steps Your current estimate for how many steps will be needed to complete the analysis. Adjust as new findings emerge. required
next_step_required Set to true if you plan to continue the investigation with another step. False means you believe the analysis is complete and ready for expert validation. required
findings Summary of discoveries from this step, including architectural patterns, tech stack assessment, scalability characteristics, performance implications, maintainability factors, and strategic improvement opportunities. IMPORTANT: Document both strengths (good patterns, solid architecture) and concerns (tech debt, overengineering, unnecessary complexity). In later steps, confirm or update past findings with additional evidence. required
files_checked List all files examined (absolute paths). Include even ruled-out files to track exploration path.
relevant_files Subset of files_checked directly relevant to analysis findings (absolute paths). Include files with significant patterns, architectural decisions, or strategic improvement opportunities.
relevant_context Methods/functions identified as involved in the issue
issues_found Issues or concerns identified during analysis, each with severity level (critical, high, medium, low)
use_assistant_model Use assistant model for expert analysis after workflow steps. False skips expert analysis, relies solely on your personal investigation. Defaults to True for comprehensive validation.
temperature 0 = deterministic · 1 = creative.
thinking_mode Reasoning depth: minimal, low, medium, high, or max.
continuation_id Unique thread continuation ID for multi-turn conversations. Works across different tools. ALWAYS reuse the last continuation_id you were given—this preserves full conversation context, files, and findings so the agent can resume seamlessly.
images Optional absolute paths to architecture diagrams or visual references that help with analysis context.
model Currently in auto model selection mode. CRITICAL: When the user names a model, you MUST use that exact name unless the server rejects it. If no model is provided, you may use the `listmodels` tool to review options and select an appropriate match. Top models: gpt-5.2 (score 100, 400K ctx, thinking, code-gen); gpt-5.1-codex (score 100, 400K ctx, thinking, code-gen); gemini-2.5-pro (score 100, 1.0M ctx, thinking, code-gen); gemini-3-pro-preview (score 100, 1.0M ctx, thinking, code-gen); gpt-5.2-pro (score 100, 400K ctx, thinking, code-gen); +26 more via `listmodels`. required
confidence Your confidence in the analysis: exploring, low, medium, high, very_high, almost_certain, or certain. 'certain' indicates the analysis is complete and ready for validation.
analysis_type Type of analysis to perform (architecture, performance, security, quality, general)
output_format How to format the output (summary, detailed, actionable)
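An illustrative analyze-tool call, with the enumerations mirroring the `analysis_type` and `output_format` values documented above (all values hypothetical):

```python
# Enumerations taken from the schema above.
ANALYSIS_TYPES = {"architecture", "performance", "security", "quality", "general"}
OUTPUT_FORMATS = {"summary", "detailed", "actionable"}

analyze_step = {
    "step": "Map module boundaries and data flow before assessing scalability.",
    "step_number": 1,
    "total_steps": 2,
    "next_step_required": True,
    "findings": "Monolith with a clean service layer; ORM access is scattered.",
    "analysis_type": "architecture",
    "output_format": "actionable",
}

assert analyze_step["analysis_type"] in ANALYSIS_TYPES
assert analyze_step["output_format"] in OUTPUT_FORMATS
```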
refactor Analyzes code for refactoring opportunities with systematic investigation. Use for code smell detection, decomposition planning, modernization, and maintainability improvements. Guides through structured analysis with expert validation.

Parameters

step The refactoring plan. Step 1: State strategy. Later steps: Report findings. CRITICAL: Examine code for smells, and opportunities for decomposition, modernization, and organization. Use 'relevant_files' for code. FORBIDDEN: Large code snippets. required
step_number The index of the current step in the refactoring investigation sequence, beginning at 1. Each step should build upon or revise the previous one. required
total_steps Your current estimate for how many steps will be needed to complete the refactoring investigation. Adjust as new opportunities emerge. required
next_step_required Set to true if you plan to continue the investigation with another step. False means you believe the refactoring analysis is complete and ready for expert validation. required
findings Summary of discoveries from this step, including code smells and opportunities for decomposition, modernization, or organization. Document both strengths and weaknesses. In later steps, confirm or update past findings. required
files_checked List all files examined (absolute paths). Include even ruled-out files to track exploration path.
relevant_files Subset of files_checked with code requiring refactoring (absolute paths). Include files with code smells, decomposition needs, or improvement opportunities.
relevant_context Methods/functions identified as involved in the issue
issues_found Refactoring opportunities as dictionaries with 'severity' (critical/high/medium/low), 'type' (codesmells/decompose/modernize/organization), and 'description'. Include all improvement opportunities found.
confidence Your confidence in refactoring analysis: exploring (starting), incomplete (significant work remaining), partial (some opportunities found, more analysis needed), complete (comprehensive analysis finished, all major opportunities identified). WARNING: Use 'complete' ONLY when fully analyzed and can provide recommendations without expert help. 'complete' PREVENTS expert validation. Use 'partial' for large files or uncertain analysis.
hypothesis Current theory about issue/goal based on work
use_assistant_model Use assistant model for expert analysis after workflow steps. False skips expert analysis, relies solely on your personal investigation. Defaults to True for comprehensive validation.
temperature 0 = deterministic · 1 = creative.
thinking_mode Reasoning depth: minimal, low, medium, high, or max.
continuation_id Unique thread continuation ID for multi-turn conversations. Works across different tools. ALWAYS reuse the last continuation_id you were given—this preserves full conversation context, files, and findings so the agent can resume seamlessly.
images Optional list of absolute paths to architecture diagrams, UI mockups, design documents, or visual references that help with refactoring context. Only include if they materially assist understanding or assessment.
model Currently in auto model selection mode. CRITICAL: When the user names a model, you MUST use that exact name unless the server rejects it. If no model is provided, you may use the `listmodels` tool to review options and select an appropriate match. Top models: gpt-5.2 (score 100, 400K ctx, thinking, code-gen); gpt-5.1-codex (score 100, 400K ctx, thinking, code-gen); gemini-2.5-pro (score 100, 1.0M ctx, thinking, code-gen); gemini-3-pro-preview (score 100, 1.0M ctx, thinking, code-gen); gpt-5.2-pro (score 100, 400K ctx, thinking, code-gen); +26 more via `listmodels`. required
refactor_type Type of refactoring analysis to perform (codesmells, decompose, modernize, organization)
focus_areas Specific areas to focus on (e.g., 'performance', 'readability', 'maintainability', 'security')
style_guide_examples Optional existing code files to use as style/pattern reference (must be FULL absolute paths to real files / folders - DO NOT SHORTEN). These files represent the target coding style and patterns for the project.
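The `issues_found` entries for this tool have a specific shape, as documented above: dictionaries carrying 'severity', 'type', and 'description'. An example with hypothetical findings:

```python
# issues_found entries shaped per the schema above; contents are illustrative.
issues_found = [
    {"severity": "high", "type": "decompose",
     "description": "OrderService mixes billing, shipping, and notification logic."},
    {"severity": "low", "type": "modernize",
     "description": "String formatting uses % operators instead of f-strings."},
]

VALID_SEVERITIES = {"critical", "high", "medium", "low"}
VALID_TYPES = {"codesmells", "decompose", "modernize", "organization"}
for issue in issues_found:
    assert issue["severity"] in VALID_SEVERITIES
    assert issue["type"] in VALID_TYPES
    assert issue["description"]  # a description is always expected
```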
tracer Performs systematic code tracing with modes for execution flow or dependency mapping. Use for method execution analysis, call chain tracing, dependency mapping, and architectural understanding. Supports precision mode (execution flow) and dependencies mode (structural relationships).

Parameters

step Current work step content and findings from your overall work required
step_number Current step number in work sequence (starts at 1) required
total_steps Estimated total steps needed to complete work required
next_step_required Whether another work step is needed. When false, set total_steps equal to step_number to avoid a mismatch. required
findings Important findings, evidence and insights discovered in this step required
files_checked List of files examined during this work step
relevant_files Files identified as relevant to issue/goal (FULL absolute paths to real files/folders - DO NOT SHORTEN)
relevant_context Methods/functions identified as involved in the issue
confidence Confidence level: exploring (just starting), low (early investigation), medium (some evidence), high (strong evidence), very_high (comprehensive understanding), almost_certain (near complete confidence), certain (100% confidence locally - no external validation needed)
use_assistant_model Use assistant model for expert analysis after workflow steps. False skips expert analysis, relies solely on your personal investigation. Defaults to True for comprehensive validation.
continuation_id Unique thread continuation ID for multi-turn conversations. Works across different tools. ALWAYS reuse the last continuation_id you were given—this preserves full conversation context, files, and findings so the agent can resume seamlessly.
images Optional paths to architecture diagrams or flow charts that help understand the tracing context.
model Currently in auto model selection mode. CRITICAL: When the user names a model, you MUST use that exact name unless the server rejects it. If no model is provided, you may use the `listmodels` tool to review options and select an appropriate match. Top models: gpt-5.2 (score 100, 400K ctx, thinking, code-gen); gpt-5.1-codex (score 100, 400K ctx, thinking, code-gen); gemini-2.5-pro (score 100, 1.0M ctx, thinking, code-gen); gemini-3-pro-preview (score 100, 1.0M ctx, thinking, code-gen); gpt-5.2-pro (score 100, 400K ctx, thinking, code-gen); +26 more via `listmodels`. required
trace_mode Type of tracing: 'ask' (default - prompts user to choose mode), 'precision' (execution flow) or 'dependencies' (structural relationships) required
target_description Description of what to trace and WHY. Include context about what you're trying to understand or analyze. required
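The `trace_mode` values form a small enumeration, with 'ask' as the documented default. A hypothetical validator for that field:

```python
def pick_trace_mode(requested="ask"):
    """Validate the documented trace_mode values; 'ask' (the default)
    defers the precision-vs-dependencies choice to the user."""
    valid = {"ask", "precision", "dependencies"}
    if requested not in valid:
        raise ValueError(f"trace_mode must be one of {sorted(valid)}")
    return requested

assert pick_trace_mode() == "ask"
assert pick_trace_mode("precision") == "precision"
```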
testgen Creates comprehensive test suites with edge case coverage for specific functions, classes, or modules. Analyzes code paths, identifies failure modes, and generates framework-specific tests. Be specific about scope - target particular components rather than testing everything.

Parameters

step Test plan for this step. Step 1: outline how you'll analyse structure, business logic, critical paths, and edge cases. Later steps: record findings and new scenarios as they emerge. required
step_number Current test-generation step (starts at 1) — each step should build on prior work. required
total_steps Estimated number of steps needed for test planning; adjust as new scenarios appear. required
next_step_required True while more investigation or planning remains; set False when test planning is ready for expert validation. required
findings Summarize functionality, critical paths, edge cases, boundary conditions, error handling, and existing test patterns. Cover both happy and failure paths. required
files_checked Absolute paths of every file examined, including those ruled out.
relevant_files Absolute paths of code that requires new or updated tests (implementation, dependencies, existing test fixtures).
relevant_context Methods/functions identified as involved in the issue
issues_found Issues identified with severity levels during work
confidence Indicate your current confidence in the test generation assessment. Use: 'exploring' (starting analysis), 'low' (early investigation), 'medium' (some patterns identified), 'high' (strong understanding), 'very_high' (very strong understanding), 'almost_certain' (nearly complete test plan), 'certain' (100% confidence - test plan is thoroughly complete and all test scenarios are identified with no need for external model validation). Do NOT use 'certain' unless the test generation analysis is comprehensively complete, use 'very_high' or 'almost_certain' instead if not 100% sure. Using 'certain' means you have complete confidence locally and prevents external model validation.
hypothesis Current theory about issue/goal based on work
use_assistant_model Use assistant model for expert analysis after workflow steps. False skips expert analysis, relies solely on your personal investigation. Defaults to True for comprehensive validation.
temperature 0 = deterministic · 1 = creative.
thinking_mode Reasoning depth: minimal, low, medium, high, or max.
continuation_id Unique thread continuation ID for multi-turn conversations. Works across different tools. ALWAYS reuse the last continuation_id you were given—this preserves full conversation context, files, and findings so the agent can resume seamlessly.
images Optional absolute paths to diagrams or visuals that clarify the system under test.
model Currently in auto model selection mode. CRITICAL: When the user names a model, you MUST use that exact name unless the server rejects it. If no model is provided, you may use the `listmodels` tool to review options and select an appropriate match. Top models: gpt-5.2 (score 100, 400K ctx, thinking, code-gen); gpt-5.1-codex (score 100, 400K ctx, thinking, code-gen); gemini-2.5-pro (score 100, 1.0M ctx, thinking, code-gen); gemini-3-pro-preview (score 100, 1.0M ctx, thinking, code-gen); gpt-5.2-pro (score 100, 400K ctx, thinking, code-gen); +26 more via `listmodels`. required
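The confidence field above carries a notable warning: 'certain' suppresses external model validation, so it should be reported only when the test plan is thoroughly complete. A hypothetical guard encoding that guidance (not the server's actual logic):

```python
def choose_confidence(level, plan_is_fully_complete=False):
    """Guard mirroring the documented warning: report 'certain' only when the
    test plan is thoroughly complete; otherwise fall back to 'very_high'."""
    if level == "certain" and not plan_is_fully_complete:
        return "very_high"  # 'certain' would suppress external validation
    return level

assert choose_confidence("certain") == "very_high"
assert choose_confidence("certain", plan_is_fully_complete=True) == "certain"
```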
challenge Prevents reflexive agreement by forcing critical thinking and reasoned analysis when a statement is challenged. Trigger automatically when a user critically questions, disagrees or appears to push back on earlier answers, and use it manually to sanity-check contentious claims.

Parameters

prompt Statement to scrutinize. If you invoke `challenge` manually, strip the word 'challenge' and pass just the statement. Automatic invocations send the full user message as-is; do not modify it. required
apilookup Use this tool automatically when you need current API/SDK documentation, latest version info, breaking changes, deprecations, migration guides, or official release notes. This tool searches authoritative sources (official docs, GitHub, package registries) to ensure up-to-date accuracy.

Parameters

prompt The API, SDK, library, framework, or technology you need current documentation, version info, breaking changes, or migration guidance for. required
listmodels Shows which AI model providers are configured, available model names, their aliases and capabilities.
version Get server version, configuration details, and list of available tools.
Security Tier: Reject · Score: 35 / 100
Scanned by Orcorus Security Scanner · Mar 13, 2026

Security Review

Integration: Zen
Repository: https://github.com/beehiveinnovations/zen-mcp-server
Commit: latest
Scan Date: 2026-03-13 13:03 UTC

Security Score

35 / 100

Tier Classification

Reject

OWASP Alignment

OWASP Rubric

  • Standard: OWASP Top 10 (2021) aligned review
  • Core methodology: architecture context, trust boundaries, data-flow tracing, threat modeling, control verification, and evidence-backed validation
  • Key characteristics considered: exploitability, impact, likelihood, attacker preconditions, and business context

OWASP Security Category Mapping

  • A01 Broken Access Control: none
  • A02 Cryptographic Failures: 4 finding(s)
  • A03 Injection: 1 finding(s)
  • A04 Insecure Design: none
  • A05 Security Misconfiguration: 21 finding(s)
  • A06 Vulnerable and Outdated Components: 1 finding(s)
  • A07 Identification and Authentication Failures: none
  • A08 Software and Data Integrity Failures: none
  • A09 Security Logging and Monitoring Failures: 87 finding(s)
  • A10 Server-Side Request Forgery: none

Static Analysis Findings (Bandit)

High Severity

  • Use of weak MD5 hash for security. Consider usedforsecurity=False in tests/http_transport_recorder.py:326 (confidence: HIGH)
  • Use of weak MD5 hash for security. Consider usedforsecurity=False in tests/http_transport_recorder.py:389 (confidence: HIGH)
  • Use of weak MD5 hash for security. Consider usedforsecurity=False in tests/test_cassette_semantic_matching.py:75 (confidence: HIGH)
  • Use of weak MD5 hash for security. Consider usedforsecurity=False in tests/test_cassette_semantic_matching.py:76 (confidence: HIGH)

Medium Severity

  • Probable insecure usage of temp file/directory. in tests/conftest.py:17 (confidence: MEDIUM)
  • Possible binding to all interfaces. in tests/pii_sanitizer.py:98 (confidence: MEDIUM)
  • Probable insecure usage of temp file/directory. in tests/test_auto_mode.py:211 (confidence: MEDIUM)
  • Probable insecure usage of temp file/directory. in tests/test_chat_simple.py:62 (confidence: MEDIUM)
  • Probable insecure usage of temp file/directory. in tests/test_chat_simple.py:71 (confidence: MEDIUM)
  • Probable insecure usage of temp file/directory. in tests/test_chat_simple.py:79 (confidence: MEDIUM)
  • Probable insecure usage of temp file/directory. in tests/test_chat_simple.py:109 (confidence: MEDIUM)
  • Probable insecure usage of temp file/directory. in tests/test_chat_simple.py:127 (confidence: MEDIUM)
  • Probable insecure usage of temp file/directory. in tests/test_chat_simple.py:306 (confidence: MEDIUM)
  • Probable insecure usage of temp file/directory. in tests/test_chat_simple.py:316 (confidence: MEDIUM)
  • Probable insecure usage of temp file/directory. in tests/test_docker_claude_desktop_integration.py:190 (confidence: MEDIUM)
  • Probable insecure usage of temp file/directory. in tests/test_path_traversal_security.py:51 (confidence: MEDIUM)
  • Probable insecure usage of temp file/directory. in tests/test_path_traversal_security.py:52 (confidence: MEDIUM)
  • Probable insecure usage of temp file/directory. in tests/test_path_traversal_security.py:57 (confidence: MEDIUM)
  • Probable insecure usage of temp file/directory. in tests/test_path_traversal_security.py:58 (confidence: MEDIUM)
  • Possible binding to all interfaces. in tests/test_pii_sanitizer.py:96 (confidence: MEDIUM)
  • Possible SQL injection vector through string-based query construction. in tools/docgen.py:348 (confidence: LOW)
  • Audit url open for permitted schemes. Allowing use of file:/ or custom schemes is often unexpected. in tools/version.py:97 (confidence: HIGH)

Low Severity

  • Consider possible security implications associated with the subprocess module. in communication_simulator_test.py:74 (confidence: HIGH)
  • subprocess call - check for execution of untrusted input. in communication_simulator_test.py:448 (confidence: HIGH)
  • Consider possible security implications associated with the subprocess module. in docker/scripts/healthcheck.py:7 (confidence: HIGH)
  • Starting a process with a partial executable path in docker/scripts/healthcheck.py:22 (confidence: HIGH)
  • subprocess call - check for execution of untrusted input. in docker/scripts/healthcheck.py:22 (confidence: HIGH)
  • Try, Except, Pass detected. in providers/gemini.py:401 (confidence: HIGH)
  • Try, Except, Continue detected. in providers/openai_compatible.py:84 (confidence: HIGH)
  • Try, Except, Pass detected. in providers/openai_compatible.py:797 (confidence: HIGH)
  • Try, Except, Pass detected. in server.py:580 (confidence: HIGH)
  • Try, Except, Pass detected. in server.py:583 (confidence: HIGH)
  • Try, Except, Pass detected. in server.py:662 (confidence: HIGH)
  • Try, Except, Pass detected. in server.py:756 (confidence: HIGH)
  • Try, Except, Pass detected. in server.py:772 (confidence: HIGH)
  • Try, Except, Pass detected. in server.py:872 (confidence: HIGH)
  • Try, Except, Pass detected. in server.py:1062 (confidence: HIGH)
  • Try, Except, Pass detected. in server.py:1285 (confidence: HIGH)
  • Consider possible security implications associated with the subprocess module. in simulator_tests/base_test.py:11 (confidence: HIGH)
  • subprocess call - check for execution of untrusted input. in simulator_tests/base_test.py:169 (confidence: HIGH)
  • subprocess call - check for execution of untrusted input. in simulator_tests/base_test.py:276 (confidence: HIGH)
  • Consider possible security implications associated with the subprocess module. in simulator_tests/log_utils.py:10 (confidence: HIGH)

Hardcoded Secrets

3 potential hardcoded secret(s) detected.

Build Status

SKIPPED

Build step was skipped to avoid running untrusted build commands by default.

Tests

Detected (pytest)

Documentation

README: Present
Dependency file: Present

AI Security Review

Security Code Review Report for repository: Zen

1) OWASP Review Methodology Applied
- Orientation: I inspected repository layout, the main server entry (server.py), providers, tools, clink agents, and utilities. I reviewed static analysis notes and prioritized files flagged by the scanner.
- Entry Points: I examined server.py (MCP stdio server & tool registry), tools (SimpleTool / BaseTool), provider implementations (providers/openai_compatible.py, providers/custom.py), clink agent execution (clink/agents/base.py and clink/registry.py), and file access & path validation utilities (utils/file_utils.py and utils/security_config.py).
- Data flows: Traced user-supplied/external inputs (MCP tool arguments including absolute_file_paths, CUSTOM_API_URL, CLI client config files) through validation and into sinks (file I/O, subprocess execution, network calls).
- Trust boundaries & entry points: MCP stdio messages -> server.call_tool -> tool code (SimpleTool/BaseTool) -> provider/client resolvers -> provider network calls (OpenAI/OpenRouter/Custom) and clink -> subprocess exec of configured CLIs.
- Threat modelling: Focused on path traversal, arbitrary command execution, SSRF, secret leakage in logs, insecure configuration loading, and unsafe deserialization in tests.
- Verification: Confirmed behavior by reading critical code implementing validation, path handling, provider base URL validation, CLI command execution, and logging sanitization.

2) OWASP Top 10 2021 Category Mapping
- A01: Broken Access Control: clink registry/agent executing configured local commands (clink/registry.py, clink/agents/base.py)
- A02: Cryptographic Failures: not directly observed.
- A03: Injection: potential command execution / CLI injection based on configured commands (clink).
- A04: Insecure Design: permissive acceptance of absolute paths in various config resolution functions; design choices allow operators to configure execution of arbitrary local commands.
- A05: Security Misconfiguration: .env override (utils/env.py) and logging configuration may leak sensitive information if mishandled; default debug logging enabled.
- A06: Vulnerable and Outdated Components: dependency review was not exhaustive here, but the code uses httpx and the OpenAI SDK; pinned versions should be validated in pyproject/requirements.
- A07: Identification and Authentication Failures: not prominent in code reviewed (relying on environment-provided API keys), but operator-configured keys may be accidentally exposed in logs if not sanitized.
- A08: Software and Data Integrity Failures: no runtime plugin signing / integrity verification for custom provider endpoints; ModelProviderRegistry allows custom provider factories.
- A09: Security Logging and Monitoring Failures: some try/except:pass swallowing in critical shutdown/cleanup code that could hide failures (server.py cleanup_providers); however proper mcp_activity logging exists.
- A10: Server-Side Request Forgery (SSRF): provider base_url (_validate_base_url/_is_localhost_url) validation is limited; custom endpoints (CUSTOM_API_URL) can point to internal hosts and will be used (providers/openai_compatible.py, providers/custom.py).

3) Critical Vulnerabilities (RCE, injection, auth bypass, unsafe deserialization)
- No immediate unauthenticated remote RCE was found in code executed on the MCP server directly from untrusted network inputs. The critical risk is configuration-driven local command execution, which can run arbitrary local programs when CLI clients are configured.
- Unsafe deserialization: I found test code using pickle (simulator_tests/test_secaudit_validation.py), but only in tests. No production code unpickling untrusted data was found.

4) High Severity Issues
1. Arbitrary local command execution via clink configuration (potential local RCE / command injection)
- Files: clink/agents/base.py (create_subprocess_exec usage) and clink/registry.py (configuration parsing)
- Evidence: clink/registry.py -> _resolve_executable returns shlex.split(command) (no whitelist or strict validation) and configs are loaded from conf/cli_clients and user config directories (ClinkRegistry._iter_config_files). clink/agents/base.py then resolves the executable via shutil.which and executes the full command via asyncio.create_subprocess_exec (safe from shell=True injection, but will run whatever the configured executable+args are). See clink/agents/base.py at the process.launch call (search result: clink/agents/base.py:111) and clink/registry.py:_resolve_executable (search result reference).
- Severity: High (A01/A03)
- Exploitability: High if attacker can influence config files (e.g., user config dir or environment that points to config) or if operator config contains malicious/untrusted values. An attacker who can write a config JSON can cause arbitrary local execution with operator privileges.
- Remediation:
- Restrict CLIs that can be executed to a configured allow-list in code or config (whitelist of allowed executables/paths), or require executables to be absolute paths under an allowed directory.
- Validate and canonicalize executable paths and arguments during config load; disallow dangerous flags or redirections and disallow arbitrary output flag templates that may write to arbitrary paths without checks.
- Require operator confirmation / secure deployment process for CLI client definitions and treat them as high privilege.
- Consider running CLI agents in a sandboxed process / chroot or under reduced privileges.
- Suggested code changes:
- clink/registry.py::_resolve_executable: validate against a whitelist and force absolute/realpath checks. E.g. replace shlex.split(command) with parsing + validation. Add logging when config overrides occur.
- clink/agents/base.py: before executing, re-validate resolved_executable is under a safe directory and that role.args/config_args are within allowed set. (clink/agents/base.py around create_subprocess_exec call at line ~111)
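The registry-side validation suggested above could be sketched as follows. This is an illustrative sketch only: the function name, the allow-list contents, and the directory list are assumptions, not the project's actual code.

```python
# Hypothetical hardening sketch for clink/registry.py::_resolve_executable:
# reject any configured command whose executable is not on an operator allow-list
# and whose resolved path is outside a set of trusted directories.
import shlex
import shutil
from pathlib import Path

# Illustrative operator-defined allow-list; adjust per deployment.
ALLOWED_EXECUTABLES = {"gemini", "codex", "claude"}
ALLOWED_DIRS = (Path("/usr/bin"), Path("/usr/local/bin"))

def resolve_executable(command: str) -> list[str]:
    """Split a configured command and reject anything outside the allow-list."""
    parts = shlex.split(command)
    if not parts:
        raise ValueError("empty clink command")
    name = parts[0]
    if Path(name).name not in ALLOWED_EXECUTABLES:
        raise ValueError(f"executable {name!r} is not on the clink allow-list")
    located = shutil.which(name)
    if located is None:
        raise FileNotFoundError(name)
    resolved = Path(located).resolve()
    if not any(resolved.is_relative_to(d) for d in ALLOWED_DIRS):
        raise ValueError(f"{resolved} is outside the allowed directories")
    return [str(resolved), *parts[1:]]
```

Failing closed at config-load time (rather than at execution time) also surfaces misconfigurations immediately instead of at first tool use.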

2. Server allowed to call arbitrary CUSTOM_API_URL (SSRF-like / internal network access)
- Files: providers/openai_compatible.py (base_url validation uses urlparse but does not perform DNS resolution to ban internal addresses), providers/custom.py (initialization), and server.py (configure_providers accepts CUSTOM_API_URL from env). Specifically, _validate_base_url (providers/openai_compatible.py) checks scheme/hostname/port only; _is_localhost_url detects localhost/private IPs but does not block them.
- Evidence: providers/openai_compatible.py: _validate_base_url only checks scheme, hostname, and port (search match). Clients are then created with base_url assigned to the OpenAI client (client_kwargs['base_url']). CUSTOM_API_URL is used without network isolation. Search results: providers/openai_compatible.py:_validate_base_url and _is_localhost_url.
- Severity: High (A10 SSRF)
- Exploitability: Medium - requires attacker control of the CUSTOM_API_URL environment variable (or a multi-tenant deployment where an attacker can influence it). If that is possible, an attacker can route model calls to internal services or exfiltrate data.
- Remediation:
- Harden URL validation: perform DNS resolution and block internal/private IP ranges by default unless explicitly whitelisted. Validate against e.g. ip.is_private, ip.is_loopback, and link-local ranges, and also disallow IPv6 internal ranges unless explicitly allowed.
- Add an explicit operator opt-in to allow local/private addresses for CUSTOM_API_URL, and log/alert when such addresses are configured.
- Consider adding an allowlist of safe hostnames or require HTTPS with certificate verification for remote endpoints.
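The DNS-resolution check described above might look like this minimal sketch. The function name and the allow_private opt-in (mirroring a hypothetical CUSTOM_API_ALLOW_PRIVATE setting) are assumptions for illustration, not the provider's actual API.

```python
# Sketch: resolve the base URL's hostname and reject private, loopback,
# and link-local addresses unless the operator has explicitly opted in.
import ipaddress
import socket
from urllib.parse import urlparse

def validate_base_url(url: str, allow_private: bool = False) -> None:
    """Raise ValueError for non-HTTP(S) URLs or internal addresses."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        raise ValueError(f"invalid base URL: {url!r}")
    # getaddrinfo covers both IPv4 and IPv6 results for the hostname.
    for info in socket.getaddrinfo(parsed.hostname, None):
        ip = ipaddress.ip_address(info[4][0])
        if (ip.is_private or ip.is_loopback or ip.is_link_local) and not allow_private:
            raise ValueError(
                f"{parsed.hostname} resolves to internal address {ip}; "
                "enable the private-address opt-in to permit this endpoint"
            )
```

Note that DNS answers can change between validation and use (rebinding), so a stricter deployment would pin the resolved address when constructing the HTTP client.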

5) Medium Severity Issues
1. Potential log leakage of sensitive data
- Files: server.py logging calls; providers/openai_compatible.py _sanitize_for_logging mitigates API key logging, but other fields may leak (server.py uses debug/info extensively); the env override loads .env and enables overriding the system env (utils/env.py). There are many spots where large prompt content or client_info is logged. Example: server.py logs incoming client info to mcp_activity.
- Severity: Medium (A09/A05)
- Remediation:
- Ensure all logs strip or mask potential API keys or sensitive tokens beyond 'api_key' and 'authorization' keys. Consider a centralized sanitizer for any dict logged.
- Default log level to INFO in production (server code uses LOG_LEVEL env that defaults to DEBUG) and document safe settings in README. Ensure log files are created with secure file permissions (600) and rotate/lock files appropriately.
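A centralized sanitizer along the lines suggested above might look like this. The key list and function name are illustrative assumptions, not the project's existing _sanitize_for_logging.

```python
# Sketch: recursively mask any dict value whose key looks like a credential,
# so every logged payload passes through one sanitizer instead of ad-hoc checks.
SENSITIVE_KEYS = {"api_key", "authorization", "token", "secret", "password"}

def sanitize_for_logging(payload):
    """Return a copy of payload with credential-like values masked."""
    if isinstance(payload, dict):
        return {
            k: "***REDACTED***" if k.lower() in SENSITIVE_KEYS
            else sanitize_for_logging(v)
            for k, v in payload.items()
        }
    if isinstance(payload, list):
        return [sanitize_for_logging(v) for v in payload]
    return payload
```

Routing all structured log calls through one such function makes the redaction policy auditable in a single place.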

2. Path resolution trusts absolute paths in config and prompt files
- Files: clink/registry.py:_resolve_prompt_path/_resolve_path allows absolute candidate paths to be returned directly (no dangerous-path checks). The server uses BaseTool.get_input_schema, and tools call read_file_content, which enforces absolute paths and checks with resolve_and_validate_path.
- Evidence: clink/registry.py:_resolve_prompt_path -> _resolve_path simply returns an absolute Path directly; there is no cross-check to disallow a prompt_path pointing to sensitive system files. (clink/registry.py:_resolve_prompt_path/_resolve_path documented in file.)
- Severity: Medium (A04/A01)
- Remediation:
- Validate that prompt_path is within expected configuration directories or ensure it doesn't point to system-critical files. When accepting absolute paths from config, do explicit allow-listing or canonicalization checks.

3. Greedy JSON extraction from exception strings and parsing heuristics
- Files: providers/openai_compatible.py around line ~770: code searches exception text via re.search(r"{.*}", str(error)), uses ast.literal_eval(json_like_str), and falls back to replacing single quotes with double quotes and calling json.loads. The regex is greedy and may capture trailing content; literal_eval is safer than eval, but feeding an arbitrary untrusted string from remote model/provider exceptions may still cause unexpected parsing failures.
- Severity: Medium-Low (A03/A09)
- Remediation:
- Use a non-greedy regex and robust JSON extraction (e.g., a small parser or balanced-brace matching), and prefer json.loads with strict validation. If literal_eval is retained, ensure the string is strictly validated to be a literal.
- Add exception handling and avoid depending on heuristics that may silently mask the root error.
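The balanced-brace extraction recommended above can be sketched as follows. The function name is an assumption, and this sketch does not handle braces inside JSON string literals, which a production version would need to address.

```python
# Sketch: find the first balanced {...} span that parses as JSON, instead of
# the greedy re.search(r"{.*}") + ast.literal_eval heuristic.
import json

def extract_first_json_object(text: str):
    """Return the first balanced JSON object found in text, or None."""
    start = text.find("{")
    while start != -1:
        depth = 0
        for i in range(start, len(text)):
            if text[i] == "{":
                depth += 1
            elif text[i] == "}":
                depth -= 1
                if depth == 0:
                    try:
                        return json.loads(text[start : i + 1])
                    except json.JSONDecodeError:
                        break  # candidate was not valid JSON; try next brace
        start = text.find("{", start + 1)
    return None
```

Because each candidate is validated with json.loads before being returned, malformed fragments fail loudly rather than being coerced via quote substitution.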

6) Low Severity Issues / Best-practice gaps
1. Swallowed exceptions and “except: pass” in cleanup code
- Files: server.py cleanup_providers (atexit handler) and various try/except: pass patterns reported by static analysis. Swallowing errors at shutdown can hide resource closure issues.
- Severity: Low (A09)
- Remediation: Log exceptions at debug level instead of silently passing.

2. Test-only insecure code flagged by static analysis (subprocess usage, pickle)
- Files: simulator_tests and tests contain subprocess usage and pickle.loads in test code. These are test-only and not part of production; confirm they remain confined to test suites and are not used in production endpoints.
- Severity: Low (test-only)
- Remediation: Keep these in test suites and do not enable them in production.
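The cleanup-logging fix for the swallowed-exception item above might look like this minimal sketch. cleanup_providers and the provider interface here are illustrative stand-ins for server.py's actual code.

```python
# Sketch: log swallowed cleanup failures at DEBUG instead of `except: pass`,
# so one bad provider neither aborts shutdown nor fails silently.
import logging

logger = logging.getLogger(__name__)

def cleanup_providers(providers) -> None:
    """Close every provider, recording (not hiding) individual failures."""
    for provider in providers:
        try:
            provider.close()
        except Exception:
            # exc_info=True preserves the stack trace for troubleshooting.
            logger.debug("cleanup failed for %r", provider, exc_info=True)
```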

7) Key Risk Characteristics (Exploitability, Impact, Likelihood, Preconditions)
- Arbitrary local CLI execution (clink): Exploitability: High if attacker can modify CLI config or place files in the user config path. Impact: High (local code execution as service user, exfiltrate secrets, modify workspace). Likelihood: Medium in an environment where multiple users can drop files into user config directories; Low in single-operator deployments. Preconditions: Ability to write or modify CLI client config JSON (USER_CONFIG_DIR, conf/cli_clients or environment override path).
- SSRF via CUSTOM_API_URL: Exploitability: Medium (requires ability to set env var or influence env). Impact: Moderate-High (information disclosure from internal services, lateral movement). Likelihood: Low in secure deployments; Higher in ephemeral/containerized CI or misconfigured deployments that pull env from untrusted sources. Preconditions: Ability to set CUSTOM_API_URL in environment or .env file.
- Log leakage: Exploitability: Medium (an attacker who can read logs). Impact: Moderate (exposure of API keys, prompts). Likelihood: Medium (debug logs default). Preconditions: Access to logs or ability to craft data that gets logged.
- Path access from configs: Exploitability: Medium-Low (requires config modification). Impact: Moderate (leakage of system files used as prompts). Preconditions: ability to specify absolute prompt files in CLI config or to place prompt files in registries.

8) Positive Security Practices Observed
- File access hardening: utils/file_utils.resolve_and_validate_path enforces absolute paths, forbids dangerous system roots and home-root scanning, resolves symlinks, and checks against DANGEROUS_PATHS (utils/security_config.py). This is a strong defense in depth for file access from MCP tool requests. (utils/file_utils.py: resolve_and_validate_path, utils/security_config.py: is_dangerous_path)
- Logging sanitization: providers/openai_compatible.py implements _sanitize_for_logging to remove api_key and authorization entries and truncate long text before logging. This reduces risk of credential leakage in many API call logs.
- Timeout and proxy hardening: OpenAI-compatible provider avoids proxy env vars when creating HTTP client and configures reasonable timeouts, reducing some SSRF/proxy abuse risk.
- Prompt size validation: BaseTool._validate_token_limit enforces MCP_PROMPT_SIZE_LIMIT for user content crossing MCP boundary.

9) Recommendations (concrete fixes with file:line references)
NOTE: Line numbers are approximate and come from code locations discovered during review; follow references by file and function names below.

Critical / High priority fixes
- Harden clink command execution
- Files: clink/registry.py::_resolve_executable (where shlex.split is used); clink/agents/base.py (process creation at asyncio.create_subprocess_exec near line ~111).
- Fix: Implement a whitelist of allowed executables or require absolute path and validate it against a safe directory. Validate and sanitize arguments in registry load instead of executing them blindly. Example: on registry load, validate resolved_executable = Path(shutil.which(executable_name)).resolve(); ensure it is under /usr/bin or an operator-defined safe list; otherwise reject config with explicit error.
- OWASP mapping: A01 (Broken Access Control), A03 (Injection)

- Harden provider base_url handling (SSRF)
- Files: providers/openai_compatible.py:_validate_base_url and _is_localhost_url; server.py configure_providers (CUSTOM_API_URL handling at server.py:~479).
- Fix: Extend _validate_base_url to perform DNS resolution and reject addresses in private / link-local / loopback ranges by default, unless an explicit opt-in is set (e.g., CUSTOM_API_ALLOW_PRIVATE=true). Example: resolve the hostname to its IP(s) and for each ip check ipaddress.ip_address(ip).is_private or is_loopback; if so, require the opt-in setting.
- OWASP mapping: A10 (SSRF)

Medium priority fixes
- Improve logging sanitization and default log level
- Files: server.py (logging setup), providers/openai_compatible.py:_sanitize_for_logging
- Fix: Ensure all logged dictionaries pass through a sanitizer that strips common secrets (API tokens, Authorization headers, environment secrets) and avoid logging full prompts or user-provided files at DEBUG in production. Default LOG_LEVEL to INFO in production or detect CI.
- OWASP mapping: A05 / A09

- Avoid fragile JSON extraction from exception text
- File: providers/openai_compatible.py (regex extraction around line ~770)
- Fix: Replace the greedy re.search(r"{.*}", ...) with a robust parser: try json.loads directly on candidate substrings, use stack-based brace matching to find the first balanced JSON object, and do not fall back to ast.literal_eval unless absolutely necessary. Surround with try/except and log parsing failures instead of silently converting malformed text.
- OWASP mapping: A03

Low priority fixes / best-practices
- Replace silent except: pass with logged debug exceptions in server cleanup
- Files: server.py cleanup_providers and other swallowed-exception sites
- Fix: Log exception stacktrace at DEBUG when cleanup fails to aid troubleshooting.

- Document operator responsibilities for CLI config and .env
- Files: README.md, SECURITY.md
- Fix: Add explicit warnings that CLI client configurations are powerful and must be managed as high-privilege config; document secure defaults for LOG_LEVEL and file permissions for logs and .env.

10) Next Tier Upgrade Plan (integration security posture)
- Current likely tier: Silver
- Rationale: The codebase demonstrates many strong security practices (robust file path validation, prompt-size checks, logging sanitization hooks, timeout/proxy hardening). However, the ability to execute configured CLIs without whitelisting and permissive handling of custom provider endpoints and config-sourced absolute paths are significant configuration-driven risks.
- Target next tier: Gold
- Required prioritized actions to reach Gold (highest priority first):
1. Harden the clink execution path (whitelist executables, validate args, sandbox execution). (High priority)
2. Harden CUSTOM_API_URL and provider base_url validation (DNS resolution, reject internal ranges by default, opt-in for localhost). (High priority)
3. Centralize logging sanitization and default to INFO in production; ensure logs are created with secure permissions. (Medium)
4. Validate configuration file paths and disallow using absolute system file paths as prompts or CLI role files unless explicitly allowed. (Medium)
5. Add operational documentation and deployment security checks (CI scanning of env and config files). (Low)

Summary of concrete file:line remediation pointers (as discovered during review):
- clink/agents/base.py (around line ~111): validate resolved_executable and sanitize arguments before asyncio.create_subprocess_exec. Implement whitelist and sandboxing.
- clink/registry.py::_resolve_executable (function): do not accept arbitrary commands via shlex.split without validation; enforce absolute paths or whitelisted names.
- providers/openai_compatible.py (around lines ~752-780): replace greedy JSON extraction, avoid ast.literal_eval heuristics; improve _validate_base_url to resolve hostnames and block internal IPs by default.
- utils/file_utils.py: resolve_and_validate_path (start at function def around utils/file_utils.py:282) is a strong control — ensure all code paths that read files call this function (clink registry when resolving prompt paths should call resolve_and_validate_path or similar check).
- server.py cleanup_providers: remove silent suppression of exceptions; log at debug level.

Final notes & actionable next steps for maintainers
- Short-term (1-2 days): Implement quick hardening steps: (a) prevent CLI configs from referencing absolute prompt files outside config directories; (b) default LOG_LEVEL to INFO and ensure API keys are removed from logs.
- Medium-term (1-2 weeks): Implement CLIAgent allow-list or sandboxing; implement DNS-based validation for CUSTOM_API_URL and flag/risk when internal addresses are configured.
- Long-term (1-2 months): Perform dependency CVE scan (pyproject/requirements), add runtime tests for SSRF and clink configuration safety, adopt signed configuration or RBAC for config editing in multi-user contexts.

If you want, I can produce small code patches / diff suggestions for the highest-priority items (clink command validation, provider base_url DNS checks, logging sanitization) referencing exact lines and proposed code.

-- End of review --

Summary

Security Score: 35/100 (Reject)
Static analysis found 4 high, 18 medium, and 2851 low severity issues.
Build step skipped for safety.
Tests detected.


Configuration

OPENROUTER_API_KEY required 🔒 password
OPENROUTER_API_KEY

At least one API key is required for Zen to connect to AI model providers

GEMINI_API_KEY required 🔒 password
GEMINI_API_KEY

At least one API key is required for Zen to connect to AI model providers

OPENAI_API_KEY required 🔒 password
OPENAI_API_KEY

At least one API key is required for Zen to connect to AI model providers

XAI_API_KEY required 🔒 password
XAI_API_KEY

At least one API key is required for Zen to connect to AI model providers

Docker Image

Docker Hub
mcp/zen

Published by github.com/beehiveinnovations