# PAL MCP: Many Workflows. One Context.

_Formerly known as Zen MCP_

[PAL in action](https://github.com/user-attachments/assets/0d26061e-5f21-4ab1-b7d0-f883ddc2c3da)

👉 **[Watch more examples](#-watch-tools-in-action)**

### Your CLI + Multiple Models = Your AI Dev Team

**Use the 🤖 CLI you love:** [Claude Code](https://www.anthropic.com/claude-code) · [Gemini CLI](https://github.com/google-gemini/gemini-cli) · [Codex CLI](https://github.com/openai/codex) · [Qwen Code CLI](https://qwenlm.github.io/qwen-code-docs/) · [Cursor](https://cursor.com) · _and more_

**With multiple models within a single prompt:** Gemini · OpenAI · Anthropic · Grok · Azure · Ollama · OpenRouter · DIAL · On-Device Model
## 🆕 Now with CLI-to-CLI Bridge

The new `clink` (CLI + Link) tool connects external AI CLIs directly into your workflow:
- **External CLIs** - Connect CLIs like Gemini CLI, Codex CLI, and Claude Code directly into your workflow
- **CLI Subagents** - Launch isolated CLI instances from within your current CLI. Claude Code can spawn Codex subagents, Codex can spawn Gemini CLI subagents, and so on. Offload heavy tasks (code reviews, bug hunting) to fresh contexts while your main session's context window stays unpolluted. Each subagent returns only final results.
- **Context Isolation** - Run separate investigations without polluting your primary workspace
- **Role Specialization** - Spawn `planner`, `codereviewer`, or custom role agents with specialized system prompts
- **Full CLI Capabilities** - Web search, file inspection, MCP tool access, latest documentation lookups
- **Seamless Continuity** - Sub-CLIs participate as first-class members with full conversation context between tools
```text
# Codex spawns a Codex subagent for an isolated code review in a fresh context
clink with codex codereviewer to audit auth module for security issues
# The subagent reviews in isolation and returns a final report, so your context
# stays clean while codex reads each file and walks the directory structure
```

```text
# Consensus from different AI models → implementation handoff with full context preserved between tools
Use consensus with gpt-5 and gemini-pro to decide: dark mode or offline support next
Continue with clink gemini - implement the recommended feature
# Gemini receives the full debate context and starts coding immediately
```
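The isolation pattern behind these subagents can be sketched in plain Python: the parent session shells out to a separate process and keeps only its final output, so everything the child reads along the way never enters the parent's context. This is an illustrative sketch, not clink's actual implementation; the `echo` command stands in for a real CLI invocation.

```python
import subprocess

def run_subagent(cmd: list[str]) -> str:
    """Run an isolated subprocess and return only its final output.

    Intermediate work (file reads, directory walks) happens in the
    child's context; the parent keeps just the end result.
    """
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()

# Stand-in for a real subagent invocation such as a codex review run;
# a shell echo keeps the sketch runnable anywhere.
report = run_subagent(["echo", "FINAL REPORT: no critical issues found"])
print(report)
```

The key property is that `capture_output=True` confines the child's chatter to the child: the parent only ever sees the summarized result.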
## Why PAL MCP?

**Why rely on one AI model when you can orchestrate them all?**
A Model Context Protocol server that supercharges tools like Claude Code, Codex CLI, and IDE clients such as Cursor or the Claude Dev VS Code extension. PAL MCP connects your favorite AI tool to multiple AI models for enhanced code analysis, problem-solving, and collaborative development.
### True AI Collaboration with Conversation Continuity
PAL supports conversation threading so your CLI can discuss ideas with multiple AI models, exchange reasoning, get second opinions, and even run collaborative debates between models to help you reach deeper insights and better solutions.
Your CLI always stays in control but gets perspectives from the best AI for each subtask. Context carries forward seamlessly across tools and models, enabling complex workflows like: code reviews with multiple models → automated planning → implementation → pre-commit validation.
You're in control. Your CLI of choice orchestrates the AI team, but you decide the workflow. Craft powerful prompts that bring in Gemini Pro, GPT-5, Flash, or local offline models exactly when needed.
## Reasons to Use PAL MCP
A typical workflow, with Claude Code as an example:

1. **Multi-Model Orchestration** - Claude coordinates with Gemini Pro, O3, GPT-5, and 50+ other models to get the best analysis for each task
2. **Context Revival Magic** - Even after Claude's context resets, continue conversations seamlessly by having other models "remind" Claude of the discussion
3. **Guided Workflows** - Enforces systematic investigation phases that prevent rushed analysis and ensure thorough code examination
4. **Extended Context Windows** - Break Claude's limits by delegating to Gemini (1M tokens) or O3 (200K tokens) for massive codebases
5. **True Conversation Continuity** - Full context flows across tools and models - Gemini remembers what O3 said 10 steps ago
6. **Model-Specific Strengths** - Extended thinking with Gemini Pro, blazing speed with Flash, strong reasoning with O3, privacy with local Ollama
7. **Professional Code Reviews** - Multi-pass analysis with severity levels, actionable feedback, and consensus from multiple AI experts
8. **Smart Debugging Assistant** - Systematic root cause analysis with hypothesis tracking and confidence levels
9. **Automatic Model Selection** - Claude intelligently picks the right model for each subtask (or you can specify)
10. **Vision Capabilities** - Analyze screenshots, diagrams, and visual content with vision-enabled models
11. **Local Model Support** - Run Llama, Mistral, or other models locally for complete privacy and zero API costs
12. **Bypass MCP Token Limits** - Automatically works around MCP's 25K limit for large prompts and responses

**The Killer Feature:** When Claude's context resets, just ask to "continue with O3" - the other model's response magically revives Claude's understanding without re-ingesting documents!

#### Example: Multi-Model Code Review Workflow

1. `Perform a codereview using gemini pro and o3 and use planner to generate a detailed plan, implement the fixes and do a final precommit check by continuing from the previous codereview`
2. This triggers a [`codereview`](docs/tools/codereview.md) workflow where Claude walks the code, looking for all kinds of issues
3. After multiple passes, Claude collects relevant code and notes issues along the way
4. It maintains a `confidence` level (`exploring`, `low`, `medium`, `high`, `certain`) to track how confidently it has been able to find and identify issues
5. It generates a detailed list of issues, critical → low
6. It shares the relevant files and findings with **Gemini Pro** for a second, deeper [`codereview`](docs/tools/codereview.md)
7. Claude takes Gemini's response and then does the same with O3, extending the prompt if a new discovery comes to light
8. When done, Claude combines all the feedback into a single list of critical → low issues, including good patterns in your code. The final list includes new findings or revisions in case Claude misunderstood or missed something crucial and one of the other models pointed this out
9. It then uses the [`planner`](docs/tools/planner.md) workflow to break the work down into simpler steps if a major refactor is required
10. Claude performs the actual work of fixing the highlighted issues
11. When done, Claude returns to Gemini Pro for a [`precommit`](docs/tools/precommit.md) review

All within a single conversation thread! Gemini Pro in step 11 _knows_ what was recommended by O3 in step 7, and takes that context and review into consideration to aid its final pre-commit review.

**Think of it as Claude Code _for_ Claude Code.** This MCP isn't magic. It's just **super-glue**.

> **Remember:** Claude stays in full control — but **YOU** call the shots.
> PAL is designed to have Claude engage other models only when needed — and to follow through with meaningful back-and-forth.
> **You're** the one who crafts the powerful prompt that makes Claude bring in Gemini, Flash, O3 — or fly solo.
> You're the guide. The prompter. The puppeteer.
>
> #### You are the AI - **Actually Intelligent**

## Recommended AI Stack
### For Claude Code Users

For best results when using [Claude Code](https://claude.ai/code):

- **Sonnet 4.5** - All agentic work and orchestration
- **Gemini 3.0 Pro** OR **GPT-5.2 / Pro** - Deep thinking, additional code reviews, debugging and validations, pre-commit analysis

### For Codex Users

For best results when using [Codex CLI](https://developers.openai.com/codex/cli):

- **GPT-5.2 Codex Medium** - All agentic work and orchestration
- **Gemini 3.0 Pro** OR **GPT-5.2-Pro** - Deep thinking, additional code reviews, debugging and validations, pre-commit analysis

## Quick Start (5 minutes)
**Prerequisites:** Python 3.10+, Git, and `uv` installed
**1. Get API Keys** (choose one or more):

- **OpenRouter** - Access multiple models with one API
- **Gemini** - Google's latest models
- **OpenAI** - O3, GPT-5 series
- **Azure OpenAI** - Enterprise deployments of GPT-4o, GPT-4.1, GPT-5 family
- **X.AI** - Grok models
- **DIAL** - Vendor-agnostic model access
- **Ollama** - Local models (free)
**2. Install** (choose one):

**Option A: Clone and Automatic Setup** (recommended)

```shell
git clone https://github.com/BeehiveInnovations/pal-mcp-server.git
cd pal-mcp-server

# Handles everything: setup, config, API keys from system environment.
# Auto-configures Claude Desktop, Claude Code, Gemini CLI, Codex CLI, Qwen CLI
# Enable / disable additional settings in .env
./run-server.sh
```
**Option B: Instant Setup with uvx**

```json
// Add to ~/.claude/settings.json or .mcp.json
// Don't forget to add your API keys under "env"
{
  "mcpServers": {
    "pal": {
      "command": "bash",
      "args": ["-c", "for p in $(which uvx 2>/dev/null) $HOME/.local/bin/uvx /opt/homebrew/bin/uvx /usr/local/bin/uvx uvx; do [ -x \"$p\" ] && exec \"$p\" --from git+https://github.com/BeehiveInnovations/pal-mcp-server.git pal-mcp-server; done; echo 'uvx not found' >&2; exit 1"],
      "env": {
        "PATH": "/usr/local/bin:/usr/bin:/bin:/opt/homebrew/bin:~/.local/bin",
        "GEMINI_API_KEY": "your-key-here",
        "DISABLED_TOOLS": "analyze,refactor,testgen,secaudit,docgen,tracer",
        "DEFAULT_MODEL": "auto"
      }
    }
  }
}
```
**3. Start Using!**

```text
"Use pal to analyze this code for security issues with gemini pro"
"Debug this error with o3 and then get flash to suggest optimizations"
"Plan the migration strategy with pal, get consensus from multiple models"
"clink with cli_name=\"gemini\" role=\"planner\" to draft a phased rollout plan"
```
👉 **Complete Setup Guide** - detailed installation, configuration for Gemini / Codex / Qwen, and troubleshooting

👉 **Cursor & VS Code Setup** - IDE integration instructions

📺 **Watch tools in action** - real-world examples
## Provider Configuration
PAL activates any provider that has credentials in your `.env`. See `.env.example` for deeper customization.
## Core Tools
**Note:** Each tool comes with its own multi-step workflow, parameters, and descriptions that consume valuable context window space even when not in use. To optimize performance, some tools are disabled by default. See Tool Configuration below to enable them.
### Collaboration & Planning (Enabled by default)
- clink - Bridge requests to external AI CLIs (Gemini planner, codereviewer, etc.)
- chat - Brainstorm ideas, get second opinions, validate approaches. With capable models (GPT-5.2 Pro, Gemini 3.0 Pro), generates complete code / implementation
- thinkdeep - Extended reasoning, edge case analysis, alternative perspectives
- planner - Break down complex projects into structured, actionable plans
- consensus - Get expert opinions from multiple AI models with stance steering
### Code Analysis & Quality
- debug - Systematic investigation and root cause analysis
- precommit - Validate changes before committing, prevent regressions
- codereview - Professional reviews with severity levels and actionable feedback
- analyze (disabled by default) - Understand architecture, patterns, dependencies across entire codebases
### Development Tools (Disabled by default)
- refactor - Intelligent code refactoring with decomposition focus
- testgen - Comprehensive test generation with edge cases
- secaudit - Security audits with OWASP Top 10 analysis
- docgen - Generate documentation with complexity analysis
### Utilities
- apilookup - Forces current-year API/SDK documentation lookups in a sub-process (saves tokens within the current context window), prevents outdated training data responses
- challenge - Prevent "You're absolutely right!" responses with critical analysis
- tracer (disabled by default) - Static analysis prompts for call-flow mapping
👉 Tool Configuration
### Default Configuration

To optimize context window usage, only essential tools are enabled by default.

**Enabled by default:**

- `chat`, `thinkdeep`, `planner`, `consensus` - Core collaboration tools
- `codereview`, `precommit`, `debug` - Essential code quality tools
- `apilookup` - Rapid API/SDK information lookup
- `challenge` - Critical thinking utility

**Disabled by default:**

- `analyze`, `refactor`, `testgen`, `secaudit`, `docgen`, `tracer`

### Enabling Additional Tools

To enable additional tools, remove them from the `DISABLED_TOOLS` list:

**Option 1: Edit your .env file**

```shell
# Default configuration (from .env.example)
DISABLED_TOOLS=analyze,refactor,testgen,secaudit,docgen,tracer

# To enable specific tools, remove them from the list
# Example: enable the analyze tool
DISABLED_TOOLS=refactor,testgen,secaudit,docgen,tracer

# To enable ALL tools
DISABLED_TOOLS=
```
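Conceptually, the server just subtracts the comma-separated `DISABLED_TOOLS` value from its tool registry, with essential tools always kept. A simplified sketch of that behavior (illustrative only, not PAL's actual code):

```python
ALL_TOOLS = {
    "chat", "thinkdeep", "planner", "consensus", "codereview", "precommit",
    "debug", "apilookup", "challenge", "analyze", "refactor", "testgen",
    "secaudit", "docgen", "tracer",
}
ESSENTIAL = {"version", "listmodels"}  # cannot be disabled

def active_tools(disabled_env: str) -> set[str]:
    """Compute the enabled tool set from a DISABLED_TOOLS-style string."""
    disabled = {t.strip() for t in disabled_env.split(",") if t.strip()}
    return (ALL_TOOLS - disabled) | ESSENTIAL

# Default .env value: six tools off
enabled = active_tools("analyze,refactor,testgen,secaudit,docgen,tracer")
assert "chat" in enabled and "analyze" not in enabled

# Empty value enables everything
assert ALL_TOOLS <= active_tools("")
```

Note how listing an essential tool in `DISABLED_TOOLS` has no effect, matching the rule that `version` and `listmodels` cannot be disabled.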
**Option 2: Configure in MCP settings**

```json
// In ~/.claude/settings.json or .mcp.json
{
  "mcpServers": {
    "pal": {
      "env": {
        // Tool configuration
        "DISABLED_TOOLS": "refactor,testgen,secaudit,docgen,tracer",
        "DEFAULT_MODEL": "pro",
        "DEFAULT_THINKING_MODE_THINKDEEP": "high",
        // API configuration
        "GEMINI_API_KEY": "your-gemini-key",
        "OPENAI_API_KEY": "your-openai-key",
        "OPENROUTER_API_KEY": "your-openrouter-key",
        // Logging and performance
        "LOG_LEVEL": "INFO",
        "CONVERSATION_TIMEOUT_HOURS": "6",
        "MAX_CONVERSATION_TURNS": "50"
      }
    }
  }
}
```
**Option 3: Enable all tools**

```json
// Remove or empty DISABLED_TOOLS to enable everything
{
  "mcpServers": {
    "pal": {
      "env": {
        "DISABLED_TOOLS": ""
      }
    }
  }
}
```
**Note:**
- Essential tools (`version`, `listmodels`) cannot be disabled
- After changing tool configuration, restart your Claude session for changes to take effect
- Each tool adds to context window usage, so only enable what you need
## 📺 Watch Tools In Action
### Chat Tool - Collaborative decision making and multi-turn conversations

**Picking Redis vs Memcached:** [Chat Redis or Memcached_web.webm](https://github.com/user-attachments/assets/41076cfe-dd49-4dfc-82f5-d7461b34705d)

**Multi-turn conversation with continuation:** [Chat With Gemini_web.webm](https://github.com/user-attachments/assets/37bd57ca-e8a6-42f7-b5fb-11de271e95db)

### Consensus Tool - Multi-model debate and decision making

**Multi-model consensus debate:** [PAL Consensus Debate](https://github.com/user-attachments/assets/76a23dd5-887a-4382-9cf0-642f5cf6219e)

### PreCommit Tool - Comprehensive change validation

**Pre-commit validation workflow:**

### API Lookup Tool - Current vs outdated API documentation

**Without PAL - outdated APIs:** [API without PAL](https://github.com/user-attachments/assets/01a79dc9-ad16-4264-9ce1-76a56c3580ee)

**With PAL - current APIs:** [API with PAL](https://github.com/user-attachments/assets/5c847326-4b66-41f7-8f30-f380453dce22)

### Challenge Tool - Critical thinking vs reflexive agreement

**Without PAL:**

**With PAL:**

## Key Features
### AI Orchestration

- **Auto model selection** - Claude picks the right AI for each task
- **Multi-model workflows** - Chain different models in single conversations
- **Conversation continuity** - Context preserved across tools and models
- **Context revival** - Continue conversations even after context resets

### Model Support

- **Multiple providers** - Gemini, OpenAI, Azure, X.AI, OpenRouter, DIAL, Ollama
- **Latest models** - GPT-5, Gemini 3.0 Pro, O3, Grok-4, local Llama
- **Thinking modes** - Control reasoning depth vs cost
- **Vision support** - Analyze images, diagrams, screenshots

### Developer Experience

- **Guided workflows** - Systematic investigation prevents rushed analysis
- **Smart file handling** - Auto-expand directories, manage token limits
- **Web search integration** - Access current documentation and best practices
- **Large prompt support** - Bypass MCP's 25K token limit
## Example Workflows
**Multi-model Code Review:**

> "Perform a codereview using gemini pro and o3, then use planner to create a fix strategy"

→ Claude reviews code systematically → Consults Gemini Pro → Gets O3's perspective → Creates unified action plan

**Collaborative Debugging:**

> "Debug this race condition with max thinking mode, then validate the fix with precommit"

→ Deep investigation → Expert analysis → Solution implementation → Pre-commit validation

**Architecture Planning:**

> "Plan our microservices migration, get consensus from pro and o3 on the approach"

→ Structured planning → Multiple expert opinions → Consensus building → Implementation roadmap
👉 Advanced Usage Guide for complex workflows, model configuration, and power-user features
## Quick Links
### 📖 Documentation

- Docs Overview - High-level map of major guides
- Getting Started - Complete setup guide
- Tools Reference - All tools with examples
- Advanced Usage - Power user features
- Configuration - Environment variables, restrictions
- Adding Providers - Provider-specific setup (OpenAI, Azure, custom gateways)
- Model Ranking Guide - How intelligence scores drive auto-mode suggestions

### 🔧 Setup & Support

- WSL Setup - Windows users
- Troubleshooting - Common issues
- Contributing - Code standards, PR process
## License
Apache 2.0 License - see LICENSE file for details.
## Acknowledgments
Built with the power of Multi-Model AI collaboration 🤝

- Actual Intelligence by real Humans
- MCP (Model Context Protocol)
- Codex CLI
- Claude Code
- Gemini
- OpenAI
- Azure OpenAI
## Star History
### Parameters

- **`prompt`** (required) - Your question or idea for collaborative thinking to be sent to the external model. Provide detailed context, including your goal, what you've tried, and any specific challenges. WARNING: Large inline code must NOT be shared in the prompt. Provide full paths to files on disk as a separate parameter.
- **`absolute_file_paths`** - Full, absolute file paths to relevant code to share with the external model
- **`images`** - Image paths (absolute) or base64 strings for optional visual context.
- **`working_directory_absolute_path`** (required) - Absolute path to an existing directory where generated code artifacts can be saved.
- **`model`** (required) - Currently in auto model selection mode. CRITICAL: When the user names a model, you MUST use that exact name unless the server rejects it. If no model is provided, you may use the `listmodels` tool to review options and select an appropriate match. Top models: gpt-5.2 (score 100, 400K ctx, thinking, code-gen); gpt-5.1-codex (score 100, 400K ctx, thinking, code-gen); gemini-2.5-pro (score 100, 1.0M ctx, thinking, code-gen); gemini-3-pro-preview (score 100, 1.0M ctx, thinking, code-gen); gpt-5.2-pro (score 100, 400K ctx, thinking, code-gen); +26 more via `listmodels`.
- **`temperature`** - 0 = deterministic · 1 = creative.
- **`thinking_mode`** - Reasoning depth: minimal, low, medium, high, or max.
- **`continuation_id`** - Unique thread continuation ID for multi-turn conversations. Works across different tools. ALWAYS reuse the last continuation_id you were given—this preserves full conversation context, files, and findings so the agent can resume seamlessly.
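As a sketch, a call with these parameters is just a JSON object whose required keys must be present. The helper below is illustrative (not part of PAL's API); the parameter names come from the table above, and the prompt and paths are made-up examples.

```python
REQUIRED = {"prompt", "working_directory_absolute_path", "model"}

def missing_required(args: dict) -> list[str]:
    """Return the required parameters absent from a tool call, sorted."""
    return sorted(REQUIRED - args.keys())

request = {
    "prompt": "Compare Redis and Memcached for session storage",
    "working_directory_absolute_path": "/tmp/pal-artifacts",  # hypothetical path
    "model": "auto",
    "temperature": 0.2,
    "thinking_mode": "medium",
}
assert missing_required(request) == []
assert missing_required({"prompt": "hi"}) == ["model", "working_directory_absolute_path"]
```

Optional keys like `continuation_id` simply pass through when present; only the three required ones gate the call.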
### Parameters

- **`prompt`** (required) - User request forwarded to the CLI (conversation context is pre-applied).
- **`cli_name`** (required) - Configured CLI client name (from conf/cli_clients). Available: claude, codex, gemini
- **`role`** - Optional role preset defined for the selected CLI (defaults to 'default'). Roles per CLI: claude: codereviewer, default, planner; codex: codereviewer, default, planner; gemini: codereviewer, default, planner
- **`absolute_file_paths`** - Full paths to relevant code
- **`images`** - Optional absolute image paths or base64 blobs for visual context.
- **`continuation_id`** - Unique thread continuation ID for multi-turn conversations. Works across different tools. ALWAYS reuse the last continuation_id you were given—this preserves full conversation context, files, and findings so the agent can resume seamlessly.
### Parameters

- **`step`** (required) - Current work step content and findings from your overall work
- **`step_number`** (required) - Current step number in work sequence (starts at 1)
- **`total_steps`** (required) - Estimated total steps needed to complete work
- **`next_step_required`** (required) - Whether another work step is needed. When false, aim to reduce total_steps to match step_number to avoid mismatch.
- **`findings`** (required) - Important findings, evidence and insights discovered in this step
- **`files_checked`** - List of files examined during this work step
- **`relevant_files`** - Files identified as relevant to issue/goal (FULL absolute paths to real files/folders - DO NOT SHORTEN)
- **`relevant_context`** - Methods/functions identified as involved in the issue
- **`issues_found`** - Issues identified with severity levels during work
- **`confidence`** - Confidence level: exploring (just starting), low (early investigation), medium (some evidence), high (strong evidence), very_high (comprehensive understanding), almost_certain (near complete confidence), certain (100% confidence locally - no external validation needed)
- **`hypothesis`** - Current theory about issue/goal based on work
- **`use_assistant_model`** - Use assistant model for expert analysis after workflow steps. False skips expert analysis, relies solely on your personal investigation. Defaults to True for comprehensive validation.
- **`temperature`** - 0 = deterministic · 1 = creative.
- **`thinking_mode`** - Reasoning depth: minimal, low, medium, high, or max.
- **`continuation_id`** - Unique thread continuation ID for multi-turn conversations. Works across different tools. ALWAYS reuse the last continuation_id you were given—this preserves full conversation context, files, and findings so the agent can resume seamlessly.
- **`images`** - Optional absolute image paths or base64 blobs for visual context.
- **`model`** (required) - Currently in auto model selection mode. CRITICAL: When the user names a model, you MUST use that exact name unless the server rejects it. If no model is provided, you may use the `listmodels` tool to review options and select an appropriate match. Top models: gpt-5.2 (score 100, 400K ctx, thinking, code-gen); gpt-5.1-codex (score 100, 400K ctx, thinking, code-gen); gemini-2.5-pro (score 100, 1.0M ctx, thinking, code-gen); gemini-3-pro-preview (score 100, 1.0M ctx, thinking, code-gen); gpt-5.2-pro (score 100, 400K ctx, thinking, code-gen); +26 more via `listmodels`.
- **`problem_context`** - Additional context about problem/goal. Be expressive.
- **`focus_areas`** - Focus aspects (architecture, performance, security, etc.)
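The step bookkeeping above (`step_number`, `total_steps`, `next_step_required`) carries a simple invariant: when `next_step_required` goes false, `total_steps` should be brought down to match `step_number`. A minimal sketch of that rule (illustrative, not the server's code):

```python
def finalize_step(step_number: int, total_steps: int,
                  next_step_required: bool) -> tuple[int, int]:
    """Clamp total_steps to the current step when the workflow ends,
    as the parameter docs above recommend."""
    if not next_step_required and total_steps != step_number:
        total_steps = step_number  # avoid a step-count mismatch
    return step_number, total_steps

# Mid-investigation: the estimate stands
assert finalize_step(2, 5, next_step_required=True) == (2, 5)
# Final step reached earlier than estimated: clamp 5 -> 3
assert finalize_step(3, 5, next_step_required=False) == (3, 3)
```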
### Parameters

- **`step`** (required) - Planning content for this step. Step 1: describe the task, problem and scope. Later steps: capture updates, revisions, branches, or open questions that shape the plan.
- **`step_number`** (required) - Current step number in work sequence (starts at 1)
- **`total_steps`** (required) - Estimated total steps needed to complete work
- **`next_step_required`** (required) - Whether another work step is needed. When false, aim to reduce total_steps to match step_number to avoid mismatch.
- **`use_assistant_model`** - Use assistant model for expert analysis after workflow steps. False skips expert analysis, relies solely on your personal investigation. Defaults to True for comprehensive validation.
- **`continuation_id`** - Unique thread continuation ID for multi-turn conversations. Works across different tools. ALWAYS reuse the last continuation_id you were given—this preserves full conversation context, files, and findings so the agent can resume seamlessly.
- **`model`** (required) - Currently in auto model selection mode. CRITICAL: When the user names a model, you MUST use that exact name unless the server rejects it. If no model is provided, you may use the `listmodels` tool to review options and select an appropriate match. Top models: gpt-5.2 (score 100, 400K ctx, thinking, code-gen); gpt-5.1-codex (score 100, 400K ctx, thinking, code-gen); gemini-2.5-pro (score 100, 1.0M ctx, thinking, code-gen); gemini-3-pro-preview (score 100, 1.0M ctx, thinking, code-gen); gpt-5.2-pro (score 100, 400K ctx, thinking, code-gen); +26 more via `listmodels`.
- **`is_step_revision`** - Set true when you are replacing a previously recorded step.
- **`revises_step_number`** - Step number being replaced when revising.
- **`is_branch_point`** - True when this step creates a new branch to explore an alternative path.
- **`branch_from_step`** - If branching, the step number that this branch starts from.
- **`branch_id`** - Name for this branch (e.g. 'approach-A', 'migration-path').
- **`more_steps_needed`** - True when you now expect to add additional steps beyond the prior estimate.
### Parameters

- **`step`** (required) - Consensus prompt. Step 1: write the exact proposal/question every model will see (use 'Evaluate…', not meta commentary). Steps 2+: capture internal notes about the latest model response—these notes are NOT sent to other models.
- **`step_number`** (required) - Current step index (starts at 1). Step 1 is your analysis; steps 2+ handle each model response.
- **`total_steps`** (required) - Total steps = number of models consulted plus the final synthesis step.
- **`next_step_required`** (required) - True if more model consultations remain; set false when ready to synthesize.
- **`findings`** (required) - Step 1: your independent analysis for later synthesis (not shared with other models). Steps 2+: summarize the newest model response.
- **`relevant_files`** - Optional supporting files that help the consensus analysis. Must be absolute, full, non-abbreviated paths.
- **`use_assistant_model`** - Use assistant model for expert analysis after workflow steps. False skips expert analysis, relies solely on your personal investigation. Defaults to True for comprehensive validation.
- **`continuation_id`** - Unique thread continuation ID for multi-turn conversations. Works across different tools. ALWAYS reuse the last continuation_id you were given—this preserves full conversation context, files, and findings so the agent can resume seamlessly.
- **`images`** - Optional absolute image paths or base64 references that add helpful visual context.
- **`models`** - User-specified roster of models to consult (provide at least two entries). Each entry may include model, stance (for/against/neutral), and stance_prompt. Each (model, stance) pair must be unique, e.g. [{'model':'gpt5','stance':'for'}, {'model':'pro','stance':'against'}]. When the user names a model, you MUST use that exact value or report the provider error—never swap in another option. Use the `listmodels` tool for the full roster. Top models: gpt-5.2 (score 100, 400K ctx, thinking, code-gen); gpt-5.1-codex (score 100, 400K ctx, thinking, code-gen); gemini-2.5-pro (score 100, 1.0M ctx, thinking, code-gen); gemini-3-pro-preview (score 100, 1.0M ctx, thinking, code-gen); gpt-5.2-pro (score 100, 400K ctx, thinking, code-gen); +26 more via `listmodels`.
- **`current_model_index`** - 0-based index of the next model to consult (managed internally).
- **`model_responses`** - Internal log of responses gathered so far.
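The roster rules for `models` (at least two entries, each (model, stance) pair unique, stance defaulting to neutral) can be sketched as a small validator. This is an illustrative check, not PAL's actual validation code:

```python
def validate_roster(models: list[dict]) -> None:
    """Enforce the consensus roster rules described above; raise on violation."""
    if len(models) < 2:
        raise ValueError("provide at least two entries")
    seen = set()
    for entry in models:
        pair = (entry["model"], entry.get("stance", "neutral"))
        if pair in seen:
            raise ValueError(f"duplicate (model, stance) pair: {pair}")
        seen.add(pair)

# Valid: two distinct (model, stance) pairs
validate_roster([{"model": "gpt5", "stance": "for"},
                 {"model": "pro", "stance": "against"}])

# Invalid: both entries default to the same ('gpt5', 'neutral') pair
try:
    validate_roster([{"model": "gpt5"}, {"model": "gpt5"}])
except ValueError as e:
    print("rejected:", e)
```

Note that the same model may appear twice with different stances; only identical (model, stance) pairs are rejected.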
### Parameters

- **`step`** (required) - Review narrative. Step 1: outline the review strategy. Later steps: report findings. MUST cover quality, security, performance, and architecture. Reference code via `relevant_files`; avoid dumping large snippets.
- **`step_number`** (required) - Current review step (starts at 1) – each step should build on the last.
- **`total_steps`** (required) - Number of review steps planned. External validation: two steps (analysis + summary). Internal validation: one step. Use the same limits when continuing an existing review via continuation_id.
- **`next_step_required`** (required) - True when another review step follows. External validation: step 1 → True, step 2 → False. Internal validation: set False immediately. Apply the same rule on continuation flows.
- **`findings`** (required) - Capture findings (positive and negative) across quality, security, performance, and architecture; update each step.
- **`files_checked`** - Absolute paths of every file reviewed, including those ruled out.
- **`relevant_files`** - Step 1: list all files/dirs under review. Must be absolute, full, non-abbreviated paths. Final step: narrow to files tied to key findings.
- **`relevant_context`** - Methods/functions identified as involved in the issue
- **`issues_found`** - Issues with severity (critical/high/medium/low) and descriptions.
- **`confidence`** - Confidence level: exploring (just starting), low (early investigation), medium (some evidence), high (strong evidence), very_high (comprehensive understanding), almost_certain (near complete confidence), certain (100% confidence locally - no external validation needed)
- **`hypothesis`** - Current theory about issue/goal based on work
- **`use_assistant_model`** - Use assistant model for expert analysis after workflow steps. False skips expert analysis, relies solely on your personal investigation. Defaults to True for comprehensive validation.
- **`temperature`** - 0 = deterministic · 1 = creative.
- **`thinking_mode`** - Reasoning depth: minimal, low, medium, high, or max.
- **`continuation_id`** - Unique thread continuation ID for multi-turn conversations. Works across different tools. ALWAYS reuse the last continuation_id you were given—this preserves full conversation context, files, and findings so the agent can resume seamlessly.
- **`images`** - Optional diagram or screenshot paths that clarify review context.
- **`model`** (required) - Currently in auto model selection mode. CRITICAL: When the user names a model, you MUST use that exact name unless the server rejects it. If no model is provided, you may use the `listmodels` tool to review options and select an appropriate match. Top models: gpt-5.2 (score 100, 400K ctx, thinking, code-gen); gpt-5.1-codex (score 100, 400K ctx, thinking, code-gen); gemini-2.5-pro (score 100, 1.0M ctx, thinking, code-gen); gemini-3-pro-preview (score 100, 1.0M ctx, thinking, code-gen); gpt-5.2-pro (score 100, 400K ctx, thinking, code-gen); +26 more via `listmodels`.
- **`review_validation_type`** - Set 'external' (default) for expert follow-up or 'internal' for local-only review.
- **`review_type`** - Review focus: full, security, performance, or quick.
- **`focus_on`** - Optional note on areas to emphasise (e.g. 'threading', 'auth flow').
- **`standards`** - Coding standards or style guides to enforce.
- **`severity_filter`** - Lowest severity to include when reporting issues (critical/high/medium/low/all).
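`severity_filter` names the lowest severity that still gets reported. Conceptually it is a threshold over the ordered scale low < medium < high < critical, with 'all' disabling the cutoff. An illustrative sketch (not the server's implementation; the sample issues are made up):

```python
SEVERITY_ORDER = ["low", "medium", "high", "critical"]

def filter_issues(issues: list[dict], severity_filter: str) -> list[dict]:
    """Keep issues at or above the requested severity ('all' keeps everything)."""
    if severity_filter == "all":
        return issues
    threshold = SEVERITY_ORDER.index(severity_filter)
    return [i for i in issues
            if SEVERITY_ORDER.index(i["severity"]) >= threshold]

issues = [
    {"severity": "critical", "description": "SQL injection in login"},
    {"severity": "low", "description": "inconsistent naming"},
]
assert [i["severity"] for i in filter_issues(issues, "high")] == ["critical"]
assert len(filter_issues(issues, "all")) == 2
```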
Parameters
step
Step 1: outline how you'll validate the git changes. Later steps: report findings. Review diffs and impacts, use `relevant_files`, and avoid pasting large snippets.
required
step_number
Current pre-commit step number (starts at 1).
required
total_steps
Planned number of validation steps. External validation: use at most three (analysis → follow-ups → summary). Internal validation: a single step. Honour these limits when resuming via continuation_id.
required
next_step_required
True to continue with another step, False when validation is complete. CRITICAL: If total_steps>=3 or when `precommit_type = external`, set to True until the final step. When continuation_id is provided: Follow the same validation rules based on precommit_type.
required
findings
Record git diff insights, risks, missing tests, security concerns, and positives; update previous notes as you go.
required
files_checked
Absolute paths for every file examined, including ruled-out candidates.
relevant_files
Absolute paths of files involved in the change or validation (code, configs, tests, docs). Must be absolute full non-abbreviated paths.
relevant_context
Methods/functions identified as involved in the issue
issues_found
List issues with severity (critical/high/medium/low) plus descriptions (bugs, security, performance, coverage).
confidence
Confidence level: exploring (just starting), low (early investigation), medium (some evidence), high (strong evidence), very_high (comprehensive understanding), almost_certain (near complete confidence), certain (100% confidence locally - no external validation needed)
hypothesis
Current theory about issue/goal based on work
use_assistant_model
Use assistant model for expert analysis after workflow steps. False skips expert analysis, relies solely on your personal investigation. Defaults to True for comprehensive validation.
temperature
0 = deterministic · 1 = creative.
thinking_mode
Reasoning depth: minimal, low, medium, high, or max.
continuation_id
Unique thread continuation ID for multi-turn conversations. Works across different tools. ALWAYS reuse the last continuation_id you were given—this preserves full conversation context, files, and findings so the agent can resume seamlessly.
images
Optional absolute paths to screenshots or diagrams that aid validation.
model
Currently in auto model selection mode. CRITICAL: When the user names a model, you MUST use that exact name unless the server rejects it. If no model is provided, you may use the `listmodels` tool to review options and select an appropriate match. Top models: gpt-5.2 (score 100, 400K ctx, thinking, code-gen); gpt-5.1-codex (score 100, 400K ctx, thinking, code-gen); gemini-2.5-pro (score 100, 1.0M ctx, thinking, code-gen); gemini-3-pro-preview (score 100, 1.0M ctx, thinking, code-gen); gpt-5.2-pro (score 100, 400K ctx, thinking, code-gen); +26 more via `listmodels`.
required
precommit_type
'external' (default, triggers expert model) or 'internal' (local-only validation).
path
Absolute path to the repository root. Required in step 1.
compare_to
Optional git ref (branch/tag/commit) to diff against; falls back to staged/unstaged changes.
include_staged
Whether to inspect staged changes (ignored when `compare_to` is set).
include_unstaged
Whether to inspect unstaged changes (ignored when `compare_to` is set).
focus_on
Optional emphasis areas such as security, performance, or test coverage.
severity_filter
Lowest severity to include when reporting issues.
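Under the parameter definitions above, a step-1 precommit invocation might look like the following sketch. All values (repo path, branch name, step text) are illustrative, not taken from a real session:

```python
# Hypothetical step-1 arguments for the precommit workflow tool.
# Every value here is illustrative.
precommit_step1 = {
    "step": "Validate the pending auth changes before commit.",
    "step_number": 1,
    "total_steps": 2,
    "next_step_required": True,
    "findings": "Starting validation of the pending diff.",
    "path": "/home/dev/myrepo",    # repository root, absolute (required in step 1)
    "compare_to": "main",          # diff against a ref instead of staged/unstaged changes
    "precommit_type": "external",  # default: hand off to the expert model at the end
    "severity_filter": "low",      # report every severity from low upward
}
```

Because `compare_to` is set, `include_staged` and `include_unstaged` would be ignored for this call.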
Parameters
step
Investigation step. Step 1: state the issue and the direction of investigation. Symptoms may be misleading, and 'no bug found' is a valid conclusion. Trace dependencies and verify hypotheses. Use relevant_files for code references; keep this field to text only.

required
step_number
Current step index (starts at 1). Build upon previous steps.
required
total_steps
Estimated total steps needed to complete the investigation. Adjust as new findings emerge. IMPORTANT: When continuation_id is provided (continuing a previous conversation), set this to 1 as we're not starting a new multi-step investigation.
required
next_step_required
True if you plan to continue the investigation with another step. False means root cause is known or investigation is complete. IMPORTANT: When continuation_id is provided (continuing a previous conversation), set this to False to immediately proceed with expert analysis.
required
findings
Discoveries: clues, code/log evidence, disproven theories. Be specific. If no bug found, document clearly as valid.
required
files_checked
All examined files (absolute paths), including ruled-out ones.
relevant_files
Files directly relevant to issue (absolute paths). Cause, trigger, or manifestation locations.
relevant_context
Methods/functions identified as involved in the issue
issues_found
Issues identified with severity levels during work
confidence
Your confidence in the hypothesis: exploring (starting out), low (early idea), medium (some evidence), high (strong evidence), very_high (very strong evidence), almost_certain (nearly confirmed), certain (100% confidence - root cause and fix are both confirmed locally with no need for external validation). WARNING: Do NOT use 'certain' unless the issue can be fully resolved with a fix, use 'very_high' or 'almost_certain' instead when not 100% sure. Using 'certain' means you have ABSOLUTE confidence locally and PREVENTS external model validation.
hypothesis
Concrete root cause theory from evidence. Can revise. Valid: 'No bug found - user misunderstanding' or 'Symptoms unrelated to code' if supported.
use_assistant_model
Use assistant model for expert analysis after workflow steps. False skips expert analysis, relies solely on your personal investigation. Defaults to True for comprehensive validation.
temperature
0 = deterministic · 1 = creative.
thinking_mode
Reasoning depth: minimal, low, medium, high, or max.
continuation_id
Unique thread continuation ID for multi-turn conversations. Works across different tools. ALWAYS reuse the last continuation_id you were given—this preserves full conversation context, files, and findings so the agent can resume seamlessly.
images
Optional screenshots/visuals clarifying issue (absolute paths).
model
Currently in auto model selection mode. CRITICAL: When the user names a model, you MUST use that exact name unless the server rejects it. If no model is provided, you may use the `listmodels` tool to review options and select an appropriate match. Top models: gpt-5.2 (score 100, 400K ctx, thinking, code-gen); gpt-5.1-codex (score 100, 400K ctx, thinking, code-gen); gemini-2.5-pro (score 100, 1.0M ctx, thinking, code-gen); gemini-3-pro-preview (score 100, 1.0M ctx, thinking, code-gen); gpt-5.2-pro (score 100, 400K ctx, thinking, code-gen); +26 more via `listmodels`.
required
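The continuation rules above (total_steps set to 1 and next_step_required set to False when continuation_id is present) can be illustrated with a hypothetical resumed debug call; the ID and findings are placeholders:

```python
# Illustrative arguments for resuming a debug investigation in an existing
# conversation thread. Per the schema above, a continuation sets total_steps
# to 1 and next_step_required to False so expert analysis runs immediately.
debug_continuation = {
    "step": "Confirmed the missing null check in the session handler.",
    "step_number": 1,
    "total_steps": 1,             # continuing, not starting a multi-step investigation
    "next_step_required": False,  # proceed straight to expert analysis
    "findings": "Crash reproduces only when the session cookie is absent.",
    "confidence": "high",         # strong evidence, but leave room for external validation
    "continuation_id": "thread-placeholder-id",  # reuse the last ID you were given
}
```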
Parameters
step
Step 1: outline the audit strategy (OWASP Top 10, auth, validation, etc.). Later steps: report findings. MANDATORY: use `relevant_files` for code references and avoid large snippets.
required
step_number
Current security-audit step number (starts at 1).
required
total_steps
Expected number of audit steps; adjust as new risks surface.
required
next_step_required
True while additional threat analysis remains; set False once you are ready to hand off for validation.
required
findings
Summarize vulnerabilities, auth issues, validation gaps, compliance notes, and positives; update prior findings as needed.
required
files_checked
Absolute paths for every file inspected, including rejected candidates.
relevant_files
Absolute paths for security-relevant files (auth modules, configs, sensitive code).
relevant_context
Methods/functions identified as involved in the issue
issues_found
Security issues with severity (critical/high/medium/low) and descriptions (vulns, auth flaws, injection, crypto, config).
confidence
exploring/low/medium/high/very_high/almost_certain/certain. 'certain' blocks external validation—use only when fully complete.
hypothesis
Current theory about issue/goal based on work
use_assistant_model
Use assistant model for expert analysis after workflow steps. False skips expert analysis, relies solely on your personal investigation. Defaults to True for comprehensive validation.
temperature
0 = deterministic · 1 = creative.
thinking_mode
Reasoning depth: minimal, low, medium, high, or max.
continuation_id
Unique thread continuation ID for multi-turn conversations. Works across different tools. ALWAYS reuse the last continuation_id you were given—this preserves full conversation context, files, and findings so the agent can resume seamlessly.
images
Optional absolute paths to diagrams or threat models that inform the audit.
model
Currently in auto model selection mode. CRITICAL: When the user names a model, you MUST use that exact name unless the server rejects it. If no model is provided, you may use the `listmodels` tool to review options and select an appropriate match. Top models: gpt-5.2 (score 100, 400K ctx, thinking, code-gen); gpt-5.1-codex (score 100, 400K ctx, thinking, code-gen); gemini-2.5-pro (score 100, 1.0M ctx, thinking, code-gen); gemini-3-pro-preview (score 100, 1.0M ctx, thinking, code-gen); gpt-5.2-pro (score 100, 400K ctx, thinking, code-gen); +26 more via `listmodels`.
required
security_scope
Security context (web, mobile, API, cloud, etc.) including stack, user types, data sensitivity, and threat landscape.
threat_level
Assess the threat level: low (internal/low-risk), medium (customer-facing/business data), high (regulated or sensitive), critical (financial/healthcare/PII).
compliance_requirements
Applicable compliance frameworks or standards (SOC2, PCI DSS, HIPAA, GDPR, ISO 27001, NIST, etc.).
audit_focus
Primary focus area: owasp, compliance, infrastructure, dependencies, or comprehensive.
severity_filter
Minimum severity to include when reporting security issues.
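A step-1 secaudit call under these definitions might be sketched as follows; the scope, frameworks, and list shape of compliance_requirements are illustrative assumptions:

```python
# Hypothetical step-1 arguments for the secaudit workflow (values illustrative).
secaudit_step1 = {
    "step": "Audit plan: map auth flows, then check OWASP A01-A10 systematically.",
    "step_number": 1,
    "total_steps": 3,
    "next_step_required": True,
    "findings": "Scoping the audit; no findings yet.",
    "security_scope": "Public REST API, Python backend, handles customer PII.",
    "threat_level": "high",                     # regulated or sensitive data
    "compliance_requirements": ["GDPR", "SOC2"],  # assumed list-of-strings shape
    "audit_focus": "owasp",
    "severity_filter": "medium",                # omit low-severity noise
}
```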
Parameters
step
Current work step content and findings from your overall work
required
step_number
Current step number in work sequence (starts at 1)
required
total_steps
Estimated total steps needed to complete work
required
next_step_required
Whether another work step is needed. When false, set total_steps to match step_number to avoid a mismatch.
required
findings
Important findings, evidence and insights discovered in this step
required
relevant_files
Files identified as relevant to issue/goal (FULL absolute paths to real files/folders - DO NOT SHORTEN)
relevant_context
Methods/functions identified as involved in the issue
issues_found
Issues identified with severity levels during work
use_assistant_model
Use assistant model for expert analysis after workflow steps. False skips expert analysis, relies solely on your personal investigation. Defaults to True for comprehensive validation.
continuation_id
Unique thread continuation ID for multi-turn conversations. Works across different tools. ALWAYS reuse the last continuation_id you were given—this preserves full conversation context, files, and findings so the agent can resume seamlessly.
document_complexity
Include algorithmic complexity (Big O) analysis when True (default).
required
document_flow
Include call flow/dependency notes when True (default).
required
update_existing
True (default) to polish inaccurate or outdated docs instead of leaving them untouched.
required
comments_on_complex_logic
True (default) to add inline comments around non-obvious logic.
required
num_files_documented
Count of files finished so far. Increment only when a file is fully documented.
required
total_files_to_document
Total files identified in discovery; completion requires matching this count.
required
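The two counters above define the docgen completion condition: the tool is done only when the documented-file count reaches the discovery total. A minimal sketch, with illustrative numbers:

```python
# Illustrative docgen bookkeeping: completion requires the documented-file
# counter to reach the total identified during discovery.
docgen_state = {
    "num_files_documented": 3,      # increment only when a file is fully documented
    "total_files_to_document": 5,   # fixed by discovery
    "document_complexity": True,    # include Big O notes (default)
    "document_flow": True,          # include call-flow notes (default)
}
docgen_done = (
    docgen_state["num_files_documented"] == docgen_state["total_files_to_document"]
)
```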
Parameters
step
The analysis plan. Step 1: State your strategy, including how you will map the codebase structure, understand business logic, and assess code quality, performance implications, and architectural patterns. Later steps: Report findings and adapt the approach as new insights emerge.
required
step_number
The index of the current step in the analysis sequence, beginning at 1. Each step should build upon or revise the previous one.
required
total_steps
Your current estimate for how many steps will be needed to complete the analysis. Adjust as new findings emerge.
required
next_step_required
Set to true if you plan to continue the investigation with another step. False means you believe the analysis is complete and ready for expert validation.
required
findings
Summary of discoveries from this step, including architectural patterns, tech stack assessment, scalability characteristics, performance implications, maintainability factors, and strategic improvement opportunities. IMPORTANT: Document both strengths (good patterns, solid architecture) and concerns (tech debt, overengineering, unnecessary complexity). In later steps, confirm or update past findings with additional evidence.
required
files_checked
List all files examined (absolute paths). Include even ruled-out files to track exploration path.
relevant_files
Subset of files_checked directly relevant to analysis findings (absolute paths). Include files with significant patterns, architectural decisions, or strategic improvement opportunities.
relevant_context
Methods/functions identified as involved in the issue
issues_found
Issues or concerns identified during analysis, each with severity level (critical, high, medium, low)
use_assistant_model
Use assistant model for expert analysis after workflow steps. False skips expert analysis, relies solely on your personal investigation. Defaults to True for comprehensive validation.
temperature
0 = deterministic · 1 = creative.
thinking_mode
Reasoning depth: minimal, low, medium, high, or max.
continuation_id
Unique thread continuation ID for multi-turn conversations. Works across different tools. ALWAYS reuse the last continuation_id you were given—this preserves full conversation context, files, and findings so the agent can resume seamlessly.
images
Optional absolute paths to architecture diagrams or visual references that help with analysis context.
model
Currently in auto model selection mode. CRITICAL: When the user names a model, you MUST use that exact name unless the server rejects it. If no model is provided, you may use the `listmodels` tool to review options and select an appropriate match. Top models: gpt-5.2 (score 100, 400K ctx, thinking, code-gen); gpt-5.1-codex (score 100, 400K ctx, thinking, code-gen); gemini-2.5-pro (score 100, 1.0M ctx, thinking, code-gen); gemini-3-pro-preview (score 100, 1.0M ctx, thinking, code-gen); gpt-5.2-pro (score 100, 400K ctx, thinking, code-gen); +26 more via `listmodels`.
required
confidence
Your confidence in the analysis: exploring, low, medium, high, very_high, almost_certain, or certain. 'certain' indicates the analysis is complete and ready for validation.
analysis_type
Type of analysis to perform (architecture, performance, security, quality, general)
output_format
How to format the output (summary, detailed, actionable)
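Combining the parameters above, an architecture-focused step-1 analyze call might look like this sketch (all values are examples, not from a real session):

```python
# Illustrative analyze-workflow arguments: an architecture pass that asks
# for actionable output (values are examples only).
analyze_args = {
    "step": "Map module boundaries, then assess coupling and layering.",
    "step_number": 1,
    "total_steps": 2,
    "next_step_required": True,
    "findings": "Beginning the architecture survey.",
    "analysis_type": "architecture",  # one of: architecture, performance, security, quality, general
    "output_format": "actionable",    # one of: summary, detailed, actionable
}
```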
Parameters
step
The refactoring plan. Step 1: State strategy. Later steps: Report findings. CRITICAL: Examine code for smells and opportunities for decomposition, modernization, and organization. Use 'relevant_files' for code. FORBIDDEN: Large code snippets.
required
step_number
The index of the current step in the refactoring investigation sequence, beginning at 1. Each step should build upon or revise the previous one.
required
total_steps
Your current estimate for how many steps will be needed to complete the refactoring investigation. Adjust as new opportunities emerge.
required
next_step_required
Set to true if you plan to continue the investigation with another step. False means you believe the refactoring analysis is complete and ready for expert validation.
required
findings
Summary of discoveries from this step, including code smells and opportunities for decomposition, modernization, or organization. Document both strengths and weaknesses. In later steps, confirm or update past findings.
required
files_checked
List all files examined (absolute paths). Include even ruled-out files to track exploration path.
relevant_files
Subset of files_checked with code requiring refactoring (absolute paths). Include files with code smells, decomposition needs, or improvement opportunities.
relevant_context
Methods/functions identified as involved in the issue
issues_found
Refactoring opportunities as dictionaries with 'severity' (critical/high/medium/low), 'type' (codesmells/decompose/modernize/organization), and 'description'. Include all improvement opportunities found.
confidence
Your confidence in refactoring analysis: exploring (starting), incomplete (significant work remaining), partial (some opportunities found, more analysis needed), complete (comprehensive analysis finished, all major opportunities identified). WARNING: Use 'complete' ONLY when fully analyzed and can provide recommendations without expert help. 'complete' PREVENTS expert validation. Use 'partial' for large files or uncertain analysis.
hypothesis
Current theory about issue/goal based on work
use_assistant_model
Use assistant model for expert analysis after workflow steps. False skips expert analysis, relies solely on your personal investigation. Defaults to True for comprehensive validation.
temperature
0 = deterministic · 1 = creative.
thinking_mode
Reasoning depth: minimal, low, medium, high, or max.
continuation_id
Unique thread continuation ID for multi-turn conversations. Works across different tools. ALWAYS reuse the last continuation_id you were given—this preserves full conversation context, files, and findings so the agent can resume seamlessly.
images
Optional list of absolute paths to architecture diagrams, UI mockups, design documents, or visual references that help with refactoring context. Only include if they materially assist understanding or assessment.
model
Currently in auto model selection mode. CRITICAL: When the user names a model, you MUST use that exact name unless the server rejects it. If no model is provided, you may use the `listmodels` tool to review options and select an appropriate match. Top models: gpt-5.2 (score 100, 400K ctx, thinking, code-gen); gpt-5.1-codex (score 100, 400K ctx, thinking, code-gen); gemini-2.5-pro (score 100, 1.0M ctx, thinking, code-gen); gemini-3-pro-preview (score 100, 1.0M ctx, thinking, code-gen); gpt-5.2-pro (score 100, 400K ctx, thinking, code-gen); +26 more via `listmodels`.
required
refactor_type
Type of refactoring analysis to perform (codesmells, decompose, modernize, organization)
focus_areas
Specific areas to focus on (e.g., 'performance', 'readability', 'maintainability', 'security')
style_guide_examples
Optional existing code files to use as style/pattern reference (must be FULL absolute paths to real files / folders - DO NOT SHORTEN). These files represent the target coding style and patterns for the project.
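The issues_found shape described above (dictionaries with 'severity', 'type', and 'description') can be sketched with illustrative entries:

```python
# Illustrative issues_found entries for the refactor workflow; each dictionary
# carries severity, type, and description per the schema above.
refactor_issues = [
    {
        "severity": "high",
        "type": "decompose",
        "description": "500-line function mixes parsing, validation, and I/O.",
    },
    {
        "severity": "low",
        "type": "modernize",
        "description": "String formatting uses % interpolation instead of f-strings.",
    },
]
```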
Parameters
step
Current work step content and findings from your overall work
required
step_number
Current step number in work sequence (starts at 1)
required
total_steps
Estimated total steps needed to complete work
required
next_step_required
Whether another work step is needed. When false, set total_steps to match step_number to avoid a mismatch.
required
findings
Important findings, evidence and insights discovered in this step
required
files_checked
List of files examined during this work step
relevant_files
Files identified as relevant to issue/goal (FULL absolute paths to real files/folders - DO NOT SHORTEN)
relevant_context
Methods/functions identified as involved in the issue
confidence
Confidence level: exploring (just starting), low (early investigation), medium (some evidence), high (strong evidence), very_high (comprehensive understanding), almost_certain (near complete confidence), certain (100% confidence locally - no external validation needed)
use_assistant_model
Use assistant model for expert analysis after workflow steps. False skips expert analysis, relies solely on your personal investigation. Defaults to True for comprehensive validation.
continuation_id
Unique thread continuation ID for multi-turn conversations. Works across different tools. ALWAYS reuse the last continuation_id you were given—this preserves full conversation context, files, and findings so the agent can resume seamlessly.
images
Optional paths to architecture diagrams or flow charts that help understand the tracing context.
model
Currently in auto model selection mode. CRITICAL: When the user names a model, you MUST use that exact name unless the server rejects it. If no model is provided, you may use the `listmodels` tool to review options and select an appropriate match. Top models: gpt-5.2 (score 100, 400K ctx, thinking, code-gen); gpt-5.1-codex (score 100, 400K ctx, thinking, code-gen); gemini-2.5-pro (score 100, 1.0M ctx, thinking, code-gen); gemini-3-pro-preview (score 100, 1.0M ctx, thinking, code-gen); gpt-5.2-pro (score 100, 400K ctx, thinking, code-gen); +26 more via `listmodels`.
required
trace_mode
Type of tracing: 'ask' (default - prompts user to choose mode), 'precision' (execution flow) or 'dependencies' (structural relationships)
required
target_description
Description of what to trace and WHY. Include context about what you're trying to understand or analyze.
required
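A tracer call using the 'precision' mode above might be sketched like this; the target function and step text are hypothetical:

```python
# Illustrative tracer arguments: precision mode follows execution flow from a
# named entry point; 'dependencies' would map structural relationships instead.
tracer_args = {
    "step": "Trace how login requests reach the token issuer.",
    "step_number": 1,
    "total_steps": 2,
    "next_step_required": True,
    "findings": "Starting the trace.",
    "trace_mode": "precision",
    "target_description": "Trace authenticate() to understand where tokens are minted.",
}
```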
Parameters
step
Test plan for this step. Step 1: outline how you'll analyse structure, business logic, critical paths, and edge cases. Later steps: record findings and new scenarios as they emerge.
required
step_number
Current test-generation step (starts at 1) — each step should build on prior work.
required
total_steps
Estimated number of steps needed for test planning; adjust as new scenarios appear.
required
next_step_required
True while more investigation or planning remains; set False when test planning is ready for expert validation.
required
findings
Summarise functionality, critical paths, edge cases, boundary conditions, error handling, and existing test patterns. Cover both happy and failure paths.
required
files_checked
Absolute paths of every file examined, including those ruled out.
relevant_files
Absolute paths of code that requires new or updated tests (implementation, dependencies, existing test fixtures).
relevant_context
Methods/functions identified as involved in the issue
issues_found
Issues identified with severity levels during work
confidence
Indicate your current confidence in the test generation assessment. Use: 'exploring' (starting analysis), 'low' (early investigation), 'medium' (some patterns identified), 'high' (strong understanding), 'very_high' (very strong understanding), 'almost_certain' (nearly complete test plan), 'certain' (100% confidence - test plan is thoroughly complete and all test scenarios are identified with no need for external model validation). Do NOT use 'certain' unless the test generation analysis is comprehensively complete, use 'very_high' or 'almost_certain' instead if not 100% sure. Using 'certain' means you have complete confidence locally and prevents external model validation.
hypothesis
Current theory about issue/goal based on work
use_assistant_model
Use assistant model for expert analysis after workflow steps. False skips expert analysis, relies solely on your personal investigation. Defaults to True for comprehensive validation.
temperature
0 = deterministic · 1 = creative.
thinking_mode
Reasoning depth: minimal, low, medium, high, or max.
continuation_id
Unique thread continuation ID for multi-turn conversations. Works across different tools. ALWAYS reuse the last continuation_id you were given—this preserves full conversation context, files, and findings so the agent can resume seamlessly.
images
Optional absolute paths to diagrams or visuals that clarify the system under test.
model
Currently in auto model selection mode. CRITICAL: When the user names a model, you MUST use that exact name unless the server rejects it. If no model is provided, you may use the `listmodels` tool to review options and select an appropriate match. Top models: gpt-5.2 (score 100, 400K ctx, thinking, code-gen); gpt-5.1-codex (score 100, 400K ctx, thinking, code-gen); gemini-2.5-pro (score 100, 1.0M ctx, thinking, code-gen); gemini-3-pro-preview (score 100, 1.0M ctx, thinking, code-gen); gpt-5.2-pro (score 100, 400K ctx, thinking, code-gen); +26 more via `listmodels`.
required
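A testgen step-1 call under these definitions could be sketched as follows; the module, path, and scenarios are illustrative, and the findings cover both happy and failure paths as the schema requires:

```python
# Hypothetical step-1 arguments for the testgen workflow (values illustrative).
testgen_step1 = {
    "step": "Outline tests for the rate limiter: window rollover, bursts, clock skew.",
    "step_number": 1,
    "total_steps": 2,
    "next_step_required": True,
    "findings": (
        "Happy path: steady traffic under the limit. "
        "Failure paths: burst at window edge, clock skew across restarts."
    ),
    "relevant_files": ["/home/dev/myrepo/src/rate_limiter.py"],  # hypothetical path
}
```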
Parameters
prompt
Statement to scrutinize. If you invoke `challenge` manually, strip the word 'challenge' and pass just the statement. Automatic invocations send the full user message as-is; do not modify it.
required
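The stripping rule above (manual invocations drop the leading 'challenge' keyword) can be sketched in a few lines; the user message is invented for illustration:

```python
# Sketch of preparing a manual challenge invocation: strip the leading
# 'challenge' keyword and pass only the bare statement.
user_message = "challenge Surely we can cache this globally?"
prompt = user_message.removeprefix("challenge ").strip()
challenge_args = {"prompt": prompt}
```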
Parameters
prompt
The API, SDK, library, framework, or technology you need current documentation, version info, breaking changes, or migration guidance for.
required
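A minimal invocation, with an illustrative library and version range:

```python
# Illustrative apilookup prompt; the library and versions are examples only.
apilookup_args = {
    "prompt": "httpx: breaking changes between 0.27 and 0.28, migration notes",
}
```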
Security Review
Integration: Zen
Repository: https://github.com/beehiveinnovations/zen-mcp-server
Commit: latest
Scan Date: 2026-03-13 13:03 UTC
Security Score
35 / 100
Tier Classification
Reject
OWASP Alignment
OWASP Rubric
- Standard: OWASP Top 10 (2021) aligned review
- Core methodology: architecture context, trust boundaries, data-flow tracing, threat modeling, control verification, and evidence-backed validation
- Key characteristics considered: exploitability, impact, likelihood, attacker preconditions, and business context
OWASP Security Category Mapping
- A01 Broken Access Control: none
- A02 Cryptographic Failures: 4 finding(s)
- A03 Injection: 1 finding(s)
- A04 Insecure Design: none
- A05 Security Misconfiguration: 21 finding(s)
- A06 Vulnerable and Outdated Components: 1 finding(s)
- A07 Identification and Authentication Failures: none
- A08 Software and Data Integrity Failures: none
- A09 Security Logging and Monitoring Failures: 87 finding(s)
- A10 Server-Side Request Forgery: none
Static Analysis Findings (Bandit)
High Severity
- Use of weak MD5 hash for security. Consider usedforsecurity=False in tests/http_transport_recorder.py:326 (confidence: HIGH)
- Use of weak MD5 hash for security. Consider usedforsecurity=False in tests/http_transport_recorder.py:389 (confidence: HIGH)
- Use of weak MD5 hash for security. Consider usedforsecurity=False in tests/test_cassette_semantic_matching.py:75 (confidence: HIGH)
- Use of weak MD5 hash for security. Consider usedforsecurity=False in tests/test_cassette_semantic_matching.py:76 (confidence: HIGH)
Medium Severity
- Probable insecure usage of temp file/directory. in tests/conftest.py:17 (confidence: MEDIUM)
- Possible binding to all interfaces. in tests/pii_sanitizer.py:98 (confidence: MEDIUM)
- Probable insecure usage of temp file/directory. in tests/test_auto_mode.py:211 (confidence: MEDIUM)
- Probable insecure usage of temp file/directory. in tests/test_chat_simple.py:62 (confidence: MEDIUM)
- Probable insecure usage of temp file/directory. in tests/test_chat_simple.py:71 (confidence: MEDIUM)
- Probable insecure usage of temp file/directory. in tests/test_chat_simple.py:79 (confidence: MEDIUM)
- Probable insecure usage of temp file/directory. in tests/test_chat_simple.py:109 (confidence: MEDIUM)
- Probable insecure usage of temp file/directory. in tests/test_chat_simple.py:127 (confidence: MEDIUM)
- Probable insecure usage of temp file/directory. in tests/test_chat_simple.py:306 (confidence: MEDIUM)
- Probable insecure usage of temp file/directory. in tests/test_chat_simple.py:316 (confidence: MEDIUM)
- Probable insecure usage of temp file/directory. in tests/test_docker_claude_desktop_integration.py:190 (confidence: MEDIUM)
- Probable insecure usage of temp file/directory. in tests/test_path_traversal_security.py:51 (confidence: MEDIUM)
- Probable insecure usage of temp file/directory. in tests/test_path_traversal_security.py:52 (confidence: MEDIUM)
- Probable insecure usage of temp file/directory. in tests/test_path_traversal_security.py:57 (confidence: MEDIUM)
- Probable insecure usage of temp file/directory. in tests/test_path_traversal_security.py:58 (confidence: MEDIUM)
- Possible binding to all interfaces. in tests/test_pii_sanitizer.py:96 (confidence: MEDIUM)
- Possible SQL injection vector through string-based query construction. in tools/docgen.py:348 (confidence: LOW)
- Audit url open for permitted schemes. Allowing use of file:/ or custom schemes is often unexpected. in tools/version.py:97 (confidence: HIGH)
Low Severity
- Consider possible security implications associated with the subprocess module. in communication_simulator_test.py:74 (confidence: HIGH)
- subprocess call - check for execution of untrusted input. in communication_simulator_test.py:448 (confidence: HIGH)
- Consider possible security implications associated with the subprocess module. in docker/scripts/healthcheck.py:7 (confidence: HIGH)
- Starting a process with a partial executable path in docker/scripts/healthcheck.py:22 (confidence: HIGH)
- subprocess call - check for execution of untrusted input. in docker/scripts/healthcheck.py:22 (confidence: HIGH)
- Try, Except, Pass detected. in providers/gemini.py:401 (confidence: HIGH)
- Try, Except, Continue detected. in providers/openai_compatible.py:84 (confidence: HIGH)
- Try, Except, Pass detected. in providers/openai_compatible.py:797 (confidence: HIGH)
- Try, Except, Pass detected. in server.py:580 (confidence: HIGH)
- Try, Except, Pass detected. in server.py:583 (confidence: HIGH)
- Try, Except, Pass detected. in server.py:662 (confidence: HIGH)
- Try, Except, Pass detected. in server.py:756 (confidence: HIGH)
- Try, Except, Pass detected. in server.py:772 (confidence: HIGH)
- Try, Except, Pass detected. in server.py:872 (confidence: HIGH)
- Try, Except, Pass detected. in server.py:1062 (confidence: HIGH)
- Try, Except, Pass detected. in server.py:1285 (confidence: HIGH)
- Consider possible security implications associated with the subprocess module. in simulator_tests/base_test.py:11 (confidence: HIGH)
- subprocess call - check for execution of untrusted input. in simulator_tests/base_test.py:169 (confidence: HIGH)
- subprocess call - check for execution of untrusted input. in simulator_tests/base_test.py:276 (confidence: HIGH)
- Consider possible security implications associated with the subprocess module. in simulator_tests/log_utils.py:10 (confidence: HIGH)
Hardcoded Secrets
3 potential hardcoded secret(s) detected.
Build Status
SKIPPED
Build step was skipped to avoid running untrusted build commands by default.
Tests
Detected (pytest)
Documentation
README: Present
Dependency file: Present
AI Security Review
Security Code Review Report for repository: Zen
1) OWASP Review Methodology Applied
- Orientation: I inspected repository layout, the main server entry (server.py), providers, tools, clink agents, and utilities. I reviewed static analysis notes and prioritized files flagged by the scanner.
- Entry Points: I examined server.py (MCP stdio server & tool registry), tools (SimpleTool / BaseTool), provider implementations (providers/openai_compatible.py, providers/custom.py), clink agent execution (clink/agents/base.py and clink/registry.py), and file access & path validation utilities (utils/file_utils.py and utils/security_config.py).
- Data flows: Traced user-supplied/external inputs (MCP tool arguments including absolute_file_paths, CUSTOM_API_URL, CLI client config files) through validation and into sinks (file I/O, subprocess execution, network calls).
- Trust boundaries & entry points: MCP stdio messages -> server.call_tool -> tool code (SimpleTool/BaseTool) -> provider/client resolvers -> provider network calls (OpenAI/OpenRouter/Custom) and clink -> subprocess exec of configured CLIs.
- Threat modelling: Focused on path traversal, arbitrary command execution, SSRF, secret leakage in logs, insecure configuration loading, and unsafe deserialization in tests.
- Verification: Confirmed behavior by reading critical code implementing validation, path handling, provider base URL validation, CLI command execution, and logging sanitization.
2) OWASP Top 10 2021 Category Mapping
- A01: Broken Access Control: clink registry/agent executing configured local commands (clink/registry.py, clink/agents/base.py)
- A02: Cryptographic Failures: not directly observed.
- A03: Injection: potential command execution / CLI injection based on configured commands (clink).
- A04: Insecure Design: permissive acceptance of absolute paths in various config resolution functions; design choices allow operators to configure execution of arbitrary local commands.
- A05: Security Misconfiguration: .env override (utils/env.py) and logging configuration may leak sensitive information if mishandled; default debug logging enabled.
- A06: Vulnerable and Outdated Components: dependency review was not performed exhaustively here; the code uses httpx and the OpenAI SDK, so pinned versions should be validated in pyproject/requirements.
- A07: Identification and Authentication Failures: not prominent in code reviewed (relying on environment-provided API keys), but operator-configured keys may be accidentally exposed in logs if not sanitized.
- A08: Software and Data Integrity Failures: no runtime plugin signing / integrity verification for custom provider endpoints; ModelProviderRegistry allows custom provider factories.
- A09: Security Logging and Monitoring Failures: some try/except/pass blocks in critical shutdown/cleanup code (server.py cleanup_providers) could hide failures; however, proper mcp_activity logging exists.
- A10: Server-Side Request Forgery (SSRF): provider base_url (_validate_base_url/_is_localhost_url) validation is limited; custom endpoints (CUSTOM_API_URL) can point to internal hosts and will be used (providers/openai_compatible.py, providers/custom.py).
3) Critical Vulnerabilities (RCE, injection, auth bypass, unsafe deserialization)
- No immediate unauthenticated remote RCE was found in code executed on the MCP server directly from untrusted network inputs. The critical risk is configuration-driven local command execution: configured CLI clients can run arbitrary local programs.
- Unsafe deserialization: I found test code using pickle (simulator_tests/test_secaudit_validation.py), but only in tests. No production code unpickling untrusted data was found.
4) High Severity Issues
1. Arbitrary local command execution via clink configuration (potential local RCE / command injection)
- Files: clink/agents/base.py (create_subprocess_exec usage) and clink/registry.py (configuration parsing)
- Evidence: clink/registry.py -> _resolve_executable returns shlex.split(command) (no whitelist or strict validation) and configs are loaded from conf/cli_clients and user config directories (ClinkRegistry._iter_config_files). clink/agents/base.py then resolves the executable via shutil.which and executes the full command via asyncio.create_subprocess_exec (safe from shell=True injection, but will run whatever the configured executable+args are). See clink/agents/base.py at the process.launch call (search result: clink/agents/base.py:111) and clink/registry.py:_resolve_executable (search result reference).
- Severity: High (A01/A03)
- Exploitability: High if attacker can influence config files (e.g., user config dir or environment that points to config) or if operator config contains malicious/untrusted values. An attacker who can write a config JSON can cause arbitrary local execution with operator privileges.
- Remediation:
- Restrict CLIs that can be executed to a configured allow-list in code or config (whitelist of allowed executables/paths), or require executables to be absolute paths under an allowed directory.
- Validate and canonicalize executable paths and arguments during config load; disallow dangerous flags or redirections and disallow arbitrary output flag templates that may write to arbitrary paths without checks.
- Require operator confirmation / secure deployment process for CLI client definitions and treat them as high privilege.
- Consider running CLI agents in a sandboxed process / chroot or under reduced privileges.
- Suggested code changes:
- clink/registry.py::_resolve_executable: validate against a whitelist and force absolute/realpath checks. E.g. replace shlex.split(command) with parsing + validation. Add logging when config overrides occur.
- clink/agents/base.py: before executing, re-validate resolved_executable is under a safe directory and that role.args/config_args are within allowed set. (clink/agents/base.py around create_subprocess_exec call at line ~111)
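The whitelist-plus-realpath validation suggested above could be sketched as follows. This is an illustrative standalone function, not the project's actual `_resolve_executable`; the `ALLOWED_EXECUTABLES` and `SAFE_BIN_DIRS` names and their contents are assumptions an operator would configure.

```python
import shlex
import shutil
from pathlib import Path

# Hypothetical operator-maintained allow-list of CLI names and safe directories.
ALLOWED_EXECUTABLES = {"gemini", "codex", "claude"}
SAFE_BIN_DIRS = (Path("/usr/bin"), Path("/usr/local/bin"))


def resolve_executable(command: str) -> list[str]:
    """Split a configured command and validate the executable against an allow-list."""
    parts = shlex.split(command)
    if not parts:
        raise ValueError("empty CLI command in config")

    name = parts[0]
    if Path(name).name not in ALLOWED_EXECUTABLES:
        raise ValueError(f"executable {name!r} is not on the allow-list")

    located = shutil.which(name)
    if located is None:
        raise ValueError(f"executable {name!r} not found on PATH")

    # Canonicalize and confirm the binary lives under an approved directory.
    real = Path(located).resolve()
    if not any(real.is_relative_to(d) for d in SAFE_BIN_DIRS):
        raise ValueError(f"{real} is outside the allowed binary directories")

    return [str(real), *parts[1:]]
```

Rejecting at config-load time (rather than at execution time) surfaces a misconfigured or malicious entry immediately instead of when a tool call first spawns the subprocess.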
2. Server can be pointed at an arbitrary CUSTOM_API_URL (SSRF-like / internal network access)
- Files: providers/openai_compatible.py (base_url validation uses urlparse but does not perform DNS resolution to ban internal addresses), providers/custom.py (initialization) and server.py (configure_providers accepts CUSTOM_API_URL from env). Specifically, _validate_base_url (providers/openai_compatible.py) checks scheme/hostname/port only; _is_localhost_url detects localhost/private IPs but does not block them.
- Evidence: providers/openai_compatible.py: _validate_base_url only checks scheme, hostname, and port (search match). clients are then created with base_url assigned to OpenAI client (client_kwargs['base_url']). CUSTOM_API_URL is used without network isolation. Search results: providers/openai_compatible.py:_validate_base_url and _is_localhost_url.
- Severity: High (A10 SSRF)
- Exploitability: Medium - requires attacker control of CUSTOM_API_URL environment variable (or for multi-tenant deployments where an attacker can influence it). If that is possible, attacker can route model calls to internal services or exfiltrate data.
- Remediation:
- Harden URL validation: perform DNS resolution and block internal/private IP ranges by default unless explicitly whitelisted. Validate against e.g., ip.is_private, ip.is_loopback, link-local ranges, and also disallow IPv6 internal ranges unless explicitly allowed.
- Add an explicit operator opt-in to allow local/private addresses for CUSTOM_API_URL, and log/alert when such addresses are configured.
- Consider adding an allowlist of safe hostnames or require HTTPS with certificate verification for remote endpoints.
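A minimal sketch of the DNS-resolution check described above, assuming an opt-in flag analogous to the suggested CUSTOM_API_ALLOW_PRIVATE setting; the function name and parameter are illustrative, not the provider's existing API.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def validate_base_url(url: str, allow_private: bool = False) -> None:
    """Resolve the hostname and reject private/loopback/link-local targets
    unless the operator explicitly opts in."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        raise ValueError(f"invalid base URL: {url!r}")

    # Resolve every address the hostname maps to (A and AAAA records).
    infos = socket.getaddrinfo(
        parsed.hostname, parsed.port or 443, proto=socket.IPPROTO_TCP
    )
    for info in infos:
        # Strip any IPv6 zone suffix (e.g. fe80::1%eth0) before parsing.
        ip = ipaddress.ip_address(info[4][0].split("%")[0])
        if (ip.is_private or ip.is_loopback or ip.is_link_local) and not allow_private:
            raise ValueError(
                f"{parsed.hostname} resolves to non-public address {ip}; "
                "enable the private-address opt-in to permit this"
            )
```

Resolving and checking every returned address (rather than only the first) matters because a hostname can map to both public and internal addresses, and DNS-rebinding-style tricks exploit exactly that gap.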
5) Medium Severity Issues
1. Potential log leakage of sensitive data
- Files: server.py (extensive debug/info logging), providers/openai_compatible.py (_sanitize_for_logging mitigates API-key logging, but other fields may leak), utils/env.py (loads .env and allows overriding the system environment). Many call sites log large prompt content or client_info; for example, server.py logs incoming client info to mcp_activity.
- Severity: Medium (A09/A05)
- Remediation:
- Ensure all logs strip or mask potential API keys or sensitive tokens beyond 'api_key' and 'authorization' keys. Consider a centralized sanitizer for any dict logged.
- Default log level to INFO in production (server code uses LOG_LEVEL env that defaults to DEBUG) and document safe settings in README. Ensure log files are created with secure file permissions (600) and rotate/lock files appropriately.
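The centralized sanitizer recommended above could look like the sketch below; the SENSITIVE_KEYS set and redaction marker are illustrative choices, not existing project code.

```python
# Hypothetical key names to redact; extend per deployment.
SENSITIVE_KEYS = {"api_key", "authorization", "token", "secret", "password"}


def sanitize_for_logging(value, max_len: int = 256):
    """Recursively mask likely secrets and truncate long strings before logging."""
    if isinstance(value, dict):
        return {
            k: "***REDACTED***"
            if k.lower() in SENSITIVE_KEYS
            else sanitize_for_logging(v, max_len)
            for k, v in value.items()
        }
    if isinstance(value, (list, tuple)):
        return [sanitize_for_logging(v, max_len) for v in value]
    if isinstance(value, str) and len(value) > max_len:
        return value[:max_len] + "...[truncated]"
    return value
```

Routing every logged dict through one function like this is easier to audit than scattering per-call-site masking, which is the gap the finding describes.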
2. Path resolution trusts absolute paths in config and prompt files
- Files: clink/registry.py:_resolve_prompt_path/_resolve_path allows absolute candidate paths to be returned directly (no dangerous path checks). server uses BaseTool.get_input_schema and tools call read_file_content which enforces absolute paths and checks with resolve_and_validate_path.
- Evidence: clink/registry.py:_resolve_prompt_path -> _resolve_path simply returns absolute Path directly; no cross-check to disallow system prompt_path pointing to sensitive system files. (clink/registry.py:_resolve_prompt_path/_resolve_path documented in file.)
- Severity: Medium (A04/A01)
- Remediation:
- Validate prompt_path is within expected configuration directories or ensure it doesn’t point to system-critical files. When accepting absolute paths from config, do explicit allow-listing or canonicalization checks.
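The allow-listing check described in the remediation could be sketched like this; ALLOWED_PROMPT_DIRS and the function name are hypothetical, standing in for whatever directories the registry actually treats as trusted.

```python
from pathlib import Path

# Hypothetical trusted locations for prompt files.
ALLOWED_PROMPT_DIRS = (
    Path("/etc/pal/prompts"),
    Path.home() / ".config" / "pal" / "prompts",
)


def resolve_prompt_path(candidate: str) -> Path:
    """Canonicalize a config-supplied prompt path and reject anything
    outside the allowed directories."""
    resolved = Path(candidate).expanduser().resolve()
    for base in ALLOWED_PROMPT_DIRS:
        if resolved.is_relative_to(base.resolve()):
            return resolved
    raise ValueError(f"prompt path {resolved} is outside allowed directories")
```

Canonicalizing with resolve() before the containment check closes the classic `../` and symlink escapes that a plain string-prefix comparison would miss.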
3. Greedy JSON extraction from exception strings and parsing heuristics
- Files: providers/openai_compatible.py around line ~770: the code searches exception text via re.search(r"{.*}", str(error)) and then uses ast.literal_eval(json_like_str), falling back to replacing single quotes with double quotes and calling json.loads. The regex is greedy and may capture trailing content; literal_eval is safer than eval, but feeding it untrusted strings from remote model/provider exceptions may still cause unexpected parsing failures.
- Severity: Medium-Low (A03/A09)
- Remediation:
- Use a non-greedy regex and robust JSON extraction (e.g., use a small parser or try to find balanced braces), and prefer json.loads with strict validation. If literal_eval use is retained, ensure the string is strictly validated to be a literal.
- Add exception handling and avoid depending on heuristics that may silently mask the root error.
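The stack-based balanced-brace approach suggested above could be sketched as follows; this is an illustrative replacement, not the provider's current implementation.

```python
import json


def extract_first_json_object(text: str):
    """Find the first balanced {...} span and parse it with json.loads.
    Returns None instead of guessing when nothing parses."""
    start = text.find("{")
    while start != -1:
        depth = 0
        in_string = False
        escaped = False
        for i in range(start, len(text)):
            ch = text[i]
            if in_string:
                # Track string state so braces inside strings are ignored.
                if escaped:
                    escaped = False
                elif ch == "\\":
                    escaped = True
                elif ch == '"':
                    in_string = False
            elif ch == '"':
                in_string = True
            elif ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
                if depth == 0:
                    try:
                        return json.loads(text[start : i + 1])
                    except json.JSONDecodeError:
                        break  # malformed candidate; try the next opening brace
        start = text.find("{", start + 1)
    return None
```

Unlike the greedy regex, this never captures trailing text past the matching brace, and it fails closed (returns None) rather than falling back to quote-swapping heuristics.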
6) Low Severity Issues / Best-practice gaps
1. Swallowed exceptions and “except: pass” in cleanup code
- Files: server.py cleanup_providers (atexit handler) and various try/except: pass patterns reported by static analysis. Swallowing errors at shutdown can hide resource closure issues.
- Severity: Low (A09)
- Remediation: Log exceptions at debug level instead of silently passing.
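A minimal sketch of the logged-instead-of-swallowed pattern; the logger name and provider interface are assumptions, not the server's actual shutdown code.

```python
import logging

logger = logging.getLogger("server.cleanup")


def cleanup_providers(providers):
    """Close each provider, logging failures at DEBUG instead of swallowing them."""
    for provider in providers:
        try:
            provider.close()
        except Exception:
            # exc_info=True preserves the traceback for troubleshooting.
            logger.debug("cleanup failed for %r", provider, exc_info=True)
```

The key difference from `except: pass` is that one failing provider neither aborts cleanup of the rest nor disappears without a trace.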
2. Test-only insecure code flagged by static analysis (subprocess usage, pickle)
- Files: simulator_tests and tests contain subprocess usage and pickle.loads in test code. These are test-only and not part of production; confirm they remain confined to test suites and are not used in production endpoints.
- Severity: Low (test-only)
- Remediation: Keep these in test suites and do not enable in production.
7) Key Risk Characteristics (Exploitability, Impact, Likelihood, Preconditions)
- Arbitrary local CLI execution (clink): Exploitability: High if attacker can modify CLI config or place files in the user config path. Impact: High (local code execution as service user, exfiltrate secrets, modify workspace). Likelihood: Medium in an environment where multiple users can drop files into user config directories; Low in single-operator deployments. Preconditions: Ability to write or modify CLI client config JSON (USER_CONFIG_DIR, conf/cli_clients or environment override path).
- SSRF via CUSTOM_API_URL: Exploitability: Medium (requires ability to set env var or influence env). Impact: Moderate-High (information disclosure from internal services, lateral movement). Likelihood: Low in secure deployments; Higher in ephemeral/containerized CI or misconfigured deployments that pull env from untrusted sources. Preconditions: Ability to set CUSTOM_API_URL in environment or .env file.
- Log leakage: Exploitability: Medium (an attacker who can read logs). Impact: Moderate (exposure of API keys, prompts). Likelihood: Medium (debug logs default). Preconditions: Access to logs or ability to craft data that gets logged.
- Path access from configs: Exploitability: Medium-Low (requires config modification). Impact: Moderate (leakage of system files used as prompts). Preconditions: ability to specify absolute prompt files in CLI config or to place prompt files in registries.
8) Positive Security Practices Observed
- File access hardening: utils/file_utils.resolve_and_validate_path enforces absolute paths, forbids dangerous system roots and home-root scanning, resolves symlinks, and checks against DANGEROUS_PATHS (utils/security_config.py). This is a strong defense in depth for file access from MCP tool requests. (utils/file_utils.py: resolve_and_validate_path, utils/security_config.py: is_dangerous_path)
- Logging sanitization: providers/openai_compatible.py implements _sanitize_for_logging to remove api_key and authorization entries and truncate long text before logging. This reduces risk of credential leakage in many API call logs.
- Timeout and proxy hardening: OpenAI-compatible provider avoids proxy env vars when creating HTTP client and configures reasonable timeouts, reducing some SSRF/proxy abuse risk.
- Prompt size validation: BaseTool._validate_token_limit enforces MCP_PROMPT_SIZE_LIMIT for user content crossing MCP boundary.
9) Recommendations (concrete fixes with file:line references)
NOTE: Line numbers are approximate and come from code locations discovered during review; follow references by file and function names below.
Critical / High priority fixes
- Harden CLINK command execution
- Files: clink/registry.py::_resolve_executable (where shlex.split is used); clink/agents/base.py (process creation at asyncio.create_subprocess_exec near line ~111).
- Fix: Implement a whitelist of allowed executables or require absolute path and validate it against a safe directory. Validate and sanitize arguments in registry load instead of executing them blindly. Example: on registry load, validate resolved_executable = Path(shutil.which(executable_name)).resolve(); ensure it is under /usr/bin or an operator-defined safe list; otherwise reject config with explicit error.
- OWASP mapping: A01 (Broken Access Control), A03 (Injection)
- Harden provider base_url handling (SSRF)
- Files: providers/openai_compatible.py:_validate_base_url and _is_localhost_url; server.py configure_providers (CUSTOM_API_URL handling at server.py:~479).
- Fix: Extend _validate_base_url to perform DNS resolution and reject addresses in private / link-local / loopback ranges by default, unless an explicit opt-in is set (e.g., CUSTOM_API_ALLOW_PRIVATE=true). Example: resolve hostname to IP(s) and for each ip do ipaddress.ip_address(ip).is_private or is_loopback checks; if so, require opt-in setting.
- OWASP mapping: A10 (SSRF)
Medium priority fixes
- Improve logging sanitization and default log level
- Files: server.py (logging setup), providers/openai_compatible.py:_sanitize_for_logging
- Fix: Ensure all logged dictionaries pass through a sanitizer that strips common secrets (API tokens, Authorization headers, environment secrets) and avoid logging full prompts or user-provided files at DEBUG in production. Default LOG_LEVEL to INFO in production or detect CI.
- OWASP mapping: A05 / A09
- Avoid fragile JSON extraction from exception text
- File: providers/openai_compatible.py (regex extraction around line ~770)
- Fix: Replace the greedy re.search(r"{.*}", ...) with a robust parser: try json.loads directly on candidate substrings, use a stack-based brace matching to find the first balanced JSON object, and do not attempt ast.literal_eval fallback unless absolutely necessary. Surround with try/except and log parsing failures, not silently converting malformed text.
- OWASP mapping: A03
Low priority fixes / best-practices
- Replace silent except: pass with logged debug exceptions in server cleanup
- Files: server.py cleanup_providers and other swallowed-exception sites
- Fix: Log exception stacktrace at DEBUG when cleanup fails to aid troubleshooting.
- Document operator responsibilities for CLI config and .env
- Files: README.md, SECURITY.md
- Fix: Add explicit warnings that CLI client configurations are powerful and must be managed as high-privilege config; document secure defaults for LOG_LEVEL and file permissions of logs and .env.
10) Next Tier Upgrade Plan (integration security posture)
- Current likely tier: Silver
- Rationale: The codebase demonstrates many strong security practices (robust file path validation, prompt-size checks, logging sanitization hooks, timeout/proxy hardening). However, executing configured CLIs without whitelisting, the permissive handling of custom provider endpoints, and config-sourced absolute paths are significant configuration-driven risks.
- Target next tier: Gold
- Required prioritized actions to reach Gold (highest priority first):
1. Harden CLINK execution path (whitelist executables, validate args, sandbox execution). (High priority)
2. Harden CUSTOM_API_URL and provider base_url validation (DNS resolution, reject internal ranges by default, opt-in for localhost). (High priority)
3. Centralize logging sanitization and default to INFO in production; ensure logs are created with secure permissions. (Medium)
4. Validate configuration file paths and disallow using absolute system file paths as prompts or CLI role files unless explicitly allowed. (Medium)
5. Add operational documentation and deployment security checks (CI scanning of env and config files). (Low)
Summary of concrete file:line remediation pointers (as discovered during review):
- clink/agents/base.py (around line ~111): validate resolved_executable and sanitize arguments before asyncio.create_subprocess_exec. Implement whitelist and sandboxing.
- clink/registry.py::_resolve_executable (function): do not accept arbitrary commands via shlex.split without validation; enforce absolute paths or whitelisted names.
- providers/openai_compatible.py (around lines ~752-780): replace greedy JSON extraction, avoid ast.literal_eval heuristics; improve _validate_base_url to resolve hostnames and block internal IPs by default.
- utils/file_utils.py: resolve_and_validate_path (start at function def around utils/file_utils.py:282) is a strong control — ensure all code paths that read files call this function (clink registry when resolving prompt paths should call resolve_and_validate_path or similar check).
- server.py cleanup_providers: remove silent suppression of exceptions; log at debug level.
Final notes & actionable next steps for maintainers
- Short-term (1-2 days): Implement quick hardening steps: (a) prevent CLI configs from referencing absolute prompt files outside config directories; (b) default LOG_LEVEL to INFO and ensure API keys are removed from logs.
- Medium-term (1-2 weeks): Implement CLIAgent allow-list or sandboxing; implement DNS-based validation for CUSTOM_API_URL and flag/risk when internal addresses are configured.
- Long-term (1-2 months): Perform dependency CVE scan (pyproject/requirements), add runtime tests for SSRF and clink configuration safety, adopt signed configuration or RBAC for config editing in multi-user contexts.
If you want, I can produce small code patches / diff suggestions for the highest-priority items (clink command validation, provider base_url DNS checks, logging sanitization), referencing exact lines and proposed code.
-- End of review --
Summary
- Security Score: 35/100 (Reject)
- Static analysis found 4 high, 18 medium, and 2851 low severity issues.
- Build step skipped for safety.
- Tests detected.