consult

Structured Peer Review.
Consensus-Based Analysis.

Multiple AI agents analyze your problem independently,
critique each other's solutions, and iterate until
they reach consensus—or an orchestrator resolves the disagreement.

The value isn't individual outputs.
It's the meta-cognition: agents reviewing agents.

The Problem

Architecture decisions fail at integration points. A single perspective—human or LLM—optimizes for one dimension while missing others.

  • Schema normalized for queries, but connection pool exhaustion under load
  • Microservices boundary clean, but distributed transaction hell
  • Auth flow secure, but latency budget blown on token validation
  • Cache invalidation "solved" until eventual consistency bites

Single-agent LLMs echo back your framing. Peer review surfaces the friction points.

How It Works

CONSENSUS WORKFLOW

1. PARALLEL ANALYSIS
   Database, Backend, and Infra agents analyze the problem independently, each with a domain-specific system prompt.

2. PEER REVIEW
   Each agent critiques the others' solutions: "Would I sign off on THIS for production?"

3. META REVIEW
   Cross-cutting analysis: integration issues, gaps, conflicts.

4. REVISION
   Agents incorporate feedback and revise their solutions.

5. APPROVAL VOTE
   Each agent answers: "Would I sign off on THEIR solution?"
   ≥80% approval → consensus reached
   <80% approval → iterate or orchestrator resolves
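
These five steps map directly onto the CLI. A representative run, using the flags documented in the CLI Usage section below (-i caps the revision rounds, -t sets the approval threshold) and expert names from the Agent Types list; the query itself is just an illustration:

$ consult -p "Split billing out of the monolith: 20k invoices/day, shared Postgres, \
  need independent deploys" \
  -e "backend_expert,software_architect" -i 2 -t 0.8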

Why This Works

The skeptical question: "Why not just prompt a single LLM to consider multiple perspectives?"

Separation of concerns

Each agent has a focused system prompt. A database agent isn't trying to also think about security—it reviews the security agent's work instead. Specialization without context pollution.

Explicit disagreement surfaces

When Agent A objects to Agent B's solution, you see the specific critique. Single-LLM "multi-perspective" prompts tend to smooth over conflicts. Peer review makes friction visible.

Iteration with feedback loops

Agents revise based on peer critique, not just their own re-reading. The revision incorporates external signal, not just self-consistency checks.

Quantified confidence

73% consensus is information. "I think this is good" isn't. The approval scores tell you where disagreement lives—and that's often where bugs hide.

The Math — Transparent

Not "how similar are outputs?" but "would each agent approve the others' work?"

# 3 agents = 6 pairwise reviews

Database → Backend:  APPROVE  (1.0)
Database → Infra:    CONCERNS (0.7)
Backend  → Database: APPROVE  (1.0)
Backend  → Infra:    OBJECT   (0.0)
Infra    → Database: CONCERNS (0.7)
Infra    → Backend:  APPROVE  (1.0)

─────────────────────────────────
Aggregate: (1.0 + 0.7 + 1.0 + 0.0 + 0.7 + 1.0) / 6 = 73%

Threshold: 80%
Result: Iterate or orchestrator resolves

APPROVE  (1.0)   Production-ready
CONCERNS (0.7)   Acceptable with noted issues
OBJECT   (0.0)   Fundamental problems

How scores are derived

1. Five weighted dimensions: Requirements 30% · Approach 25% · Trade-offs 20% · Architecture 15% · Implementation 10%
2. Categorical verdict per dimension: APPROVE → 1.0 · CONCERNS → 0.7 · OBJECT → 0.0
3. Weighted sum = final approval. Structured output, deterministic aggregation.
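
You can reproduce the arithmetic by hand. A quick shell sketch (the verdict mix in the first example is hypothetical; the actual scoring happens inside Consult via structured output):

# Weighted score for one peer review (hypothetical verdicts: APPROVE on
# Requirements, Approach, Trade-offs; CONCERNS on Architecture; OBJECT on Implementation)
$ echo "0.30*1.0 + 0.25*1.0 + 0.20*1.0 + 0.15*0.7 + 0.10*0.0" | bc -l
# → 0.855, i.e. 85.5% approval from this reviewer

# Aggregate of the six pairwise reviews above
$ echo "(1.0 + 0.7 + 1.0 + 0.0 + 0.7 + 1.0) / 6" | bc -l
# → 0.7333... ≈ 73%, below the 80% threshold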

Capabilities

Beyond the core workflow—features that make Consult practical for real engineering work.

Team Mode: Cross-Provider Competition

Run the same query across Anthropic, OpenAI, and Google simultaneously. Agents from different providers critique each other's solutions. Claude reviews GPT's architecture. Gemini challenges Claude's assumptions. Disagreement across model families surfaces blind spots that single-provider analysis misses.

consult -p "..." -m team

Smart Clarification

Before burning tokens on full analysis, a lightweight pre-flight detects ambiguous queries. Asks only high-impact questions: scope boundaries, constraints, success criteria. Skips clarification for follow-ups clearly scoped by prior context. Explains why each question matters.

Automatic—triggers when ambiguity detected

Multi-Turn Sessions

Follow-up queries preserve full context. "Now add rate limiting to that design"—without re-explaining your schema, constraints, or prior decisions. Session state persisted to ~/.consult/sessions/. Resume conversations across terminal sessions.

Just keep typing in TUI, or use session flags in CLI

Attachment Intelligence

Drop in your schema.sql, architecture diagrams, error logs. PDFs automatically converted to images for providers lacking native support. Provider-specific size limits enforced gracefully. Conversion cached—no redundant processing across workflow phases.

F key in TUI, or --attach in CLI
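
A possible CLI invocation (the --attach flag is named above; exact argument handling may vary by version):

$ consult -p "Prices go stale for ~15 minutes after the ERP sync; where is the gap?" \
  --attach schema.sql \
  -e "database_expert,backend_expert"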

Memory Compaction

Long conversations get AI-summarized when context window fills. Preserves: original question, final solution, key insights and constraints. Discards: intermediate back-and-forth, superseded ideas. Inspired by Claude Code's context management strategy.

C key in TUI, or automatic when threshold exceeded

Model Flexibility

Cost-optimized defaults: Haiku, GPT-4o-mini, Gemini Flash for expert agents. SOTA model (Opus) reserved for meta-review synthesis only. Override any model via environment variables. Switch providers mid-session in TUI.

ANTHROPIC_MODEL=claude-sonnet-4-20250514 consult ...

CLI Usage

Prefer scripts and pipelines? The CLI does the same workflow without the interface.

# Event sourcing trade-offs (experts: database, backend, architect)
$ consult -p "Order service: event sourcing vs state-based. 100k orders/day, \
  need audit trail, eventual consistency for reads, strong for inventory" \
  -e "database_expert,backend_expert,software_architect"

# Multi-tenant auth architecture (security_focused preset, 3 iterations)
$ consult -p "SaaS auth: OAuth2 + SAML, tenant isolation, session management \
  across subdomains, SOC2 compliance. Pain point: JWT token bloat" \
  -e security_focused -i 3

# Cache invalidation strategy (experts: database, backend, performance)
$ consult -p "Product cache: 50k SKUs, prices from ERP every 15min, inventory \
  real-time. Redis 5min TTL showing stale prices. Avoid cache stampede" \
  -e "database_expert,backend_expert,performance_expert"

# Zero-downtime migration (architecture preset, 90% threshold)
$ consult -p "Migrate user table UUID→ULID: 200M rows, 50+ FK references, \
  zero-downtime. Aurora PostgreSQL. Evaluate dual-write vs shadow vs CDC" \
  -e architecture -t 0.9

Terminal Interface

Not a web dashboard. A proper terminal UI with collapsible workflow visualization, live consensus tracking, and keyboard-driven navigation.

CONFIGURATION (sidebar)
  MODEL     claude-sonnet · Single AI team
  ANALYSIS  2 thinking rounds · 80% agreement needed · 3 agents
  MEMORY    23% used · 3 prior messages
  FILES     schema.sql
  Keys      q=Quit  d=Detail  l=Log  y=Copy

CONSULT — Many Agents, One Answer (main panel)
  PHASE 1: Initial Analysis [COMPLETE]
    database_agent        done   "Recommending event sourcing with CQRS..."
    backend_agent         done   "Async message queue between services..."
    infrastructure_agent  done   "K8s with horizontal pod autoscaling..."
  PHASE 2: Peer Review [IN PROGRESS]
    database → backend:   APPROVE
    database → infra:     CONCERNS "connection pooling..."
    backend  → database:  APPROVE
    backend  → infra:     reviewing...
  PHASE 3: Meta Review [PENDING]
  PHASE 4: Revision [PENDING]
  PHASE 5: Final Consensus [PENDING]

ACTIVITY LOG
  14:23:01  backend_agent reviewing infra_agent...
  14:23:03  Pairwise: 1.0  0.7  1.0
  14:23:05  Aggregate: 73% (need 80%)

Enter query... [Send]

Keyboard shortcuts
  D    Expand any agent card to see full response + peer feedback received
  L    Toggle activity log — real-time consensus math as it computes
  ESC  Collapse all phases — see just the status overview
  F    Attach files — schema.sql, architecture.md, error logs
  Y    Copy final consensus to clipboard
  C    Compact memory — summarize conversation, free context window

What Consult is NOT

Not Cursor, Windsurf, or Claude Code

Those are code editors—they write code in your IDE. Consult is for architecture decisions before you write code. Use Consult to decide what to build, then your editor to build it. Complementary, not competitive.

Not a replacement for domain expertise

You still need to understand your problem. Consult enhances analysis, it doesn't replace your judgment.

Not magic

It's LLMs with structure. Better than raw ChatGPT for architecture decisions, but still LLM-powered with LLM limitations.

Not a black box

Full reasoning chains visible. Every approval/objection shows rationale. You see exactly why agents agreed or disagreed.

Pricing

Not a chat wrapper. A structured peer review workflow that surfaces disagreement between perspectives.
BYOK model — you bring your own API keys, pay providers directly. We don't upcharge.

Free

$0/month

  • 2 agents max
  • 3 queries/day
  • 1 iteration
  • CLI only
  • essentials set only

Pro Annual

$90 USD/year

Save $18 — 2 months free

  • Everything in Pro Monthly
  • Billed annually
  • $7.50/month effective

Security Model

Your API keys and data never leave your machine. Here's exactly how.

API keys stored locally, never transmitted

Keys read from ~/.consult/.env or environment variables. Loaded into memory at runtime, never written to logs, never sent over network except directly to the provider (Anthropic/OpenAI/Google) via HTTPS.

Verify: tcpdump or mitmproxy shows only outbound connections to api.anthropic.com, api.openai.com, generativelanguage.googleapis.com.

No telemetry, no analytics, no phone-home

Zero outbound connections to our servers. License validation is cryptographic signature verification performed locally—no network call required. We don't know who's using Consult, how often, or for what.

Verify: Block all outbound traffic except LLM providers. Consult continues to work.

Prompts and responses stay on disk you control

Session history stored in ~/.consult/sessions/. Logs in ~/.consult/logs/. Both are local filesystem—encrypt at rest with FileVault/LUKS if your policy requires. No cloud sync, no external backup.

Verify: ls -la ~/.consult/ shows all persisted data. Delete anytime.

Log redaction for sensitive content

API keys pattern-matched and redacted in logs (sk-ant-***, sk-proj-***). Full prompt content logged only at DEBUG level (disabled by default). Production logging shows workflow events, not payload content.

Verify: grep -r "sk-" ~/.consult/logs/ returns zero matches.

Your data goes to LLM providers—that's the trade-off

We can't prevent Anthropic/OpenAI/Google from seeing your prompts—that's how LLMs work. If your compliance requires on-prem inference, Consult isn't the right tool. We're transparent about this boundary.

Mitigation: Use providers with data-retention opt-outs. Anthropic, for example, does not train on API data by default and offers zero-retention arrangements for its API.
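
A few ways to spot-check these claims from a shell (standard Unix tools; adjust paths to your setup):

# Everything Consult persists lives under ~/.consult; inspect or delete it anytime
$ ls -la ~/.consult/

# Confirm no API keys leaked into logs (should print nothing)
$ grep -rE 'sk-(ant|proj)-' ~/.consult/logs/

# While a query runs, the only established connections should be to provider endpoints
$ sudo lsof -iTCP -sTCP:ESTABLISHED -P -n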

Agent Types

Each agent has domain-specific prompts that shape analysis. Peer review catches what individual perspectives miss.

database_expert        Data modeling, queries, consistency, migrations
backend_expert         API design, service boundaries, error handling
infrastructure_expert  Deployment, scaling, monitoring, reliability
security_expert        Threat models, auth, input validation, compliance
performance_expert     Bottlenecks, caching, profiling, optimization
software_architect     System design, trade-offs, patterns
cloud_engineer         Cloud services, IaC, containers, DevOps
frontend_expert        UI architecture, state management, rendering
ml_expert              ML systems, training, inference, MLOps
data_expert            Pipelines, ETL, streaming, warehousing
ux_expert              User research, interaction design, accessibility

Predefined Sets

Curated combinations for common scenarios. Use -e set_name in CLI.

essentials (Free)
  Quick sanity check for frontend/backend alignment
  Agents: backend, frontend

default (Pro)
  Core infrastructure triad for most backend decisions
  Agents: database, backend, infrastructure

security_focused (Pro)
  Auth, compliance, threat modeling emphasis
  Agents: security, backend, infrastructure

architecture (Pro)
  System design, trade-offs, cloud-native patterns
  Agents: architect, database, cloud

full_stack (Pro)
  End-to-end coverage for feature development
  Agents: backend, frontend, database, infra
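
Any set name works wherever -e takes a value; for example, the full_stack set:

$ consult -p "Add real-time order tracking to the storefront" -e full_stack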

Installation

# Install
$ pip install getconsult

# Configure API key (at least one required)
$ mkdir -p ~/.consult
$ echo 'ANTHROPIC_API_KEY=sk-ant-...' > ~/.consult/.env
$ chmod 600 ~/.consult/.env

# Or export directly
$ export ANTHROPIC_API_KEY=sk-ant-...

# Verify setup
$ consult --status

# Run your first query
$ consult -p "Design a user authentication system" -e essentials

Supports Anthropic, OpenAI, and Google models.
Defaults to cost-optimized models (Haiku, GPT-4o-mini, Gemini Flash).