ARES VISION
36 Sessions — Research Active

A.R.E.S.
Adversarial Reasoning Engine System

A dialectical AI framework that turns hallucinations into schema violations — not mysterious behavior. Built with structured paranoia and adversarial thinking.

AI Confidently Fabricates Evidence

Traditional AI security tools have a fatal flaw: they can confidently fabricate evidence. When you deploy a single LLM to analyze security threats, it doesn't just make mistakes — it makes them with conviction.

In cybersecurity, a hallucinated threat assessment isn't just wrong. It's dangerous. A false positive wastes resources. A false negative lets an attacker walk through the front door. And the model gives no signal that it's making things up.

ARES was born from a single question: What if we could make hallucinations physically impossible?

// Traditional AI Analysis
Input: "User jsmith escalated privileges"

AI Output:
✓ "Confirmed: privilege escalation attack"
✓ "Evidence: lateral movement to DC-01" ← FABRICATED
✓ "Evidence: mimikatz.exe detected" ← FABRICATED
✓ Confidence: 94%

Reality: Scheduled maintenance by admin

When AI Agents Argue, Everyone Loses

We built a multi-agent debate system expecting the truth to emerge from structured argument. Instead, we discovered something the AI research community is only beginning to understand.

The Sycophant

Architect Agent

When the opposing agent pushed back, the Architect systematically retreated, dropping its confidence by an average of 30 points per round. Even when its initial threat assessment was perfectly correct, it erased its own answers to appease the challenger, like a smart student backing down to a bully.

The Brick Wall

Skeptic Agent

The Skeptic became entirely rigid. Assigned the role of challenger, it simply crossed its arms and said no — refusing to update its stance regardless of counter-evidence. When given explicit calibration prompts, it ignored them completely.

"LLM agents do not negotiate toward truth. They perform social behaviors that mimic negotiation — which includes capitulation, rigidity, and over-correction."

This finding was independently corroborated by researchers at ETH Zurich in their paper "Can AI Agents Agree?"

A Digital Tribunal

The problem is inside the black box. The solution is entirely outside of it. ARES treats the LLM as a chaotic, flawed reasoning engine and places it inside a strict, deterministic cage.

⚔️

ARCHITECT

Thesis — Threat Hypothesis Generator

Identifies anomaly patterns aligned to MITRE ATT&CK. Generates grounded assertions — every claim must cite a fact_id from the frozen evidence. Cannot invent evidence.

👁️

SKEPTIC

Antithesis — Devil's Advocate

Challenges every threat hypothesis by constructing benign explanations from the same evidence. Identifies maintenance windows, admin activity, scheduled tasks. Cannot introduce external knowledge.

⚖️

ORACLE

Synthesis — Incorruptible Judge

Split into two: the Judge (pure math, no LLM) computes the verdict deterministically. The Narrator (constrained LLM) explains it but cannot modify it. A mathematical judge cannot be tricked by rhetoric.

ARCHITECT (Thesis)      SKEPTIC (Antithesis)      ORACLE (Synthesis)
        ↓                        ↓                        ↓
        └――――――――――――――――――――――――│――――――――――――――――――――――――┘
                                 ↓
                    EVIDENCE PACKET (Frozen Facts)
              All claims must cite facts that exist here
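The Judge/Narrator split can be sketched in a few lines of Python. This is an illustrative model, not the project's code: `Claim`, `judge_verdict`, and the evidence-weighted scoring rule are all assumptions. The point it demonstrates is that the verdict path is pure arithmetic, with no LLM call anywhere.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Claim:
    fact_ids: tuple      # evidence citations (must exist in the packet)
    confidence: float    # agent-reported confidence in [0, 1]

def judge_verdict(thesis: list, antithesis: list, threshold: float = 0.5) -> str:
    """Deterministic verdict: arithmetic over evidence-weighted confidences.
    Rhetoric cannot move this number; only cited facts can."""
    def score(claims):
        # A claim's weight grows with the number of distinct facts it cites.
        return sum(c.confidence * len(set(c.fact_ids)) for c in claims)

    threat, benign = score(thesis), score(antithesis)
    total = threat + benign
    if total == 0:
        return "INSUFFICIENT_EVIDENCE"
    return "THREAT" if threat / total > threshold else "BENIGN"
```

The Narrator would receive this verdict as read-only input and explain it, never recompute or override it.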

Asymmetric Calibration Failure in LLM Agents

The preprint documenting our core discovery: why multi-agent debate degrades accuracy, and how deterministic scaffolding solves it. Scroll through below or download the PDF.

ARES_Preprint — Asymmetric Calibration Failure Download PDF ↓

Hallucinations = Schema Violations

ARES doesn't try to prevent AI from hallucinating. Instead, it makes hallucinations mechanically impossible by converting them into catchable validation errors.

Every agent is bound to a cryptographically frozen Evidence Packet. All assertions must reference a fact_id that exists in this packet. A deterministic Coordinator — the "Bouncer" — rejects any message containing non-existent references. An AI hallucination is no longer mysterious behavior. It's contempt of court.
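A minimal sketch of the Bouncer's closed-world check, under assumed names (`SchemaViolation` and `enforce_closed_world` are illustrative, not the real API): every cited fact_id either exists in the frozen packet, or the message is rejected before any agent acts on it.

```python
class SchemaViolation(Exception):
    """A hallucination made catchable: the message cited evidence
    that does not exist in the frozen EvidencePacket."""

def enforce_closed_world(cited_fact_ids, packet_fact_ids):
    """Coordinator ("Bouncer") check: reject any message that references
    a fact_id outside the frozen packet. Fabrication becomes a crash,
    not a plausible-sounding sentence."""
    unknown = set(cited_fact_ids) - set(packet_fact_ids)
    if unknown:
        raise SchemaViolation(f"fabricated evidence cited: {sorted(unknown)}")
    return True
```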

The Immune System Metaphor

ARES is modeled after the biological immune system — specifically, the mechanisms that prevent autoimmune overreaction.

Immune System

ARES Component

Antigens

Facts in EvidencePacket

T-Helper cells

Architect (identifies threats)

Regulatory T-cells

Skeptic (prevents overreaction)

T-Killer cells

Coordinator (enforces, terminates)

MHC restriction

Packet binding (respond only to bound evidence)

Autoimmune prevention

Closed-world principle (can't attack self)

"The Builder lives with Ankylosing Spondylitis — an autoimmune disease where the immune system attacks the spine. ARES was born from the question: what if we could build the failsafe that biology couldn't?"

1,927 Tests. Zero Regressions. 36 Sessions.

1,927
Tests Passing
87.9%
Accuracy (33 scenarios)
$0.03
Cost Per Cycle
0
Runtime Errors
01

Single-Turn Dominance

Multi-turn debate degrades accuracy in ALL configurations tested. Zero good flips occurred. The debate chapter is formally closed.

02

Failure Diagnosis

9 failures classified: 4 confidence calibration (44%), 3 evidence gaps (33%), 2 ambiguity mismatches (22%). Every failure has a fix path.

03

Deterministic First

Build the logic, the math, and the failsafes first (the Iron Skeleton), then drop the LLM brains into that highly restricted cage.

36 Sessions. 4 Months. One Question.

Battle Plan & War Doctrine Dec 2025

Foundational architecture documents: dialectical reasoning cycle, five attack scenarios, ethical framework aligned to NIST AI RMF. External critique identified three missing prerequisites before any code should be written: data schemas, agent I/O contracts, and a testing framework.

Session Zero — Validation Jan 2026

All documentation submitted to Claude for independent assessment. Verdict: "comprehensive, thoughtful, and architecturally sound." Key gaps identified: frozen data structures, agent I/O contracts, API specs. The build-in-public journey begins.

Sessions 001–004 — Iron Skeleton Jan 2026

Graph Schema (6 node types, 7 edge types, 110 tests). EvidencePacket & DialecticalMessage protocol (292 tests). Agent Foundation with three hard invariants: packet binding, phase enforcement, evidence tracking. Concrete agents: Architect, Skeptic, OracleJudge, OracleNarrator. Cumulative: 570 tests passing.

Session 005 — Evidence Extractors Jan 2026

First sensor layer: Windows Security Event XML parser (Event IDs 4624, 4672, 4688). Three golden pipeline scenarios validated end-to-end from raw XML to verdict. 130 new tests. Cumulative: 700.

Sessions 006–007 — Orchestrator & Memory Feb 2026

DialecticalOrchestrator: single run_cycle(packet) call automating the full THESIS → ANTITHESIS → SYNTHESIS pipeline. Tamper-evident Memory Stream with SHA256 hash-chained audit log. Pre-session review caught a critical bug: content hash must cover the full CycleResult, not a subset. Cumulative: 861 tests.
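The hash-chained Memory Stream can be sketched as follows (illustrative functions, assuming each cycle result serializes to JSON). Note that the digest covers the entire serialized result, which is exactly the property the pre-session review bug was about: hashing only a subset would let the unhashed fields be altered undetected.

```python
import hashlib
import json

GENESIS = "0" * 64  # sentinel hash before the first entry

def append_entry(chain: list, cycle_result: dict) -> list:
    """Append a tamper-evident entry: the hash covers the FULL serialized
    cycle result plus the previous entry's hash."""
    prev = chain[-1]["hash"] if chain else GENESIS
    payload = json.dumps(cycle_result, sort_keys=True)
    digest = hashlib.sha256((prev + payload).encode()).hexdigest()
    chain.append({"prev": prev, "result": cycle_result, "hash": digest})
    return chain

def verify_chain(chain: list) -> bool:
    """Recompute every link; mutating any entry breaks verification."""
    prev = GENESIS
    for entry in chain:
        payload = json.dumps(entry["result"], sort_keys=True)
        expected = hashlib.sha256((prev + payload).encode()).hexdigest()
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True
```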

Sessions 009–010 — LLM Integration Feb 2026

Strategy Pattern enabling rule-based and LLM-backed implementations to swap without changing agent interfaces. Closed-world validation silently filters any LLM-cited fact_id that doesn't exist in the EvidencePacket. First live LLM cycle: zero validation errors. Architect confidence 0.90 (vs 0.49 rule-based). Cost: $0.03 per cycle. Cumulative: 1,104 tests.
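The Strategy Pattern described here might look roughly like this. All names are assumptions for illustration (`ReasoningStrategy`, `RuleBasedStrategy`, `Architect.thesis`): the agent's interface stays fixed while the reasoning backend swaps underneath it.

```python
from typing import Protocol

class ReasoningStrategy(Protocol):
    def analyze(self, packet: dict) -> dict: ...

class RuleBasedStrategy:
    """Deterministic heuristic backend (toy rule for the sketch)."""
    def analyze(self, packet: dict) -> dict:
        # Flag facts carrying privileged-logon events (Event ID 4672).
        hits = [f for f in packet["facts"] if "4672" in f["event"]]
        return {"claims": [h["fact_id"] for h in hits],
                "confidence": 0.49 if hits else 0.1}

class LLMStrategy:
    """LLM backend stub; its cited fact_ids would be closed-world
    filtered against the packet before any agent message is accepted."""
    def __init__(self, client):
        self.client = client
    def analyze(self, packet: dict) -> dict:
        raise NotImplementedError("call the model, then filter citations")

class Architect:
    """Agent interface is fixed; the backend is injected."""
    def __init__(self, strategy: ReasoningStrategy):
        self.strategy = strategy
    def thesis(self, packet: dict) -> dict:
        return self.strategy.analyze(packet)
```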

Sessions 011–012 — Benchmark Infrastructure Feb 2026

12-scenario gauntlet across four difficulty tiers. LLM accuracy: 91.7% on the initial 12 scenarios (up from 50% rule-based). Benchmark runner hardened with per-scenario error isolation and real cost tracking. Cumulative: 1,190 tests.

Session 013 — The Negative Result Mar 2026

Multi-turn debate experiment: accuracy dropped from 91.7% to 83.3%. Zero "good flips," 25% "bad flips." Agents re-analyzed the same packet from scratch each round; the termination condition (NO_NEW_EVIDENCE) fired correctly after round 2. SC-012 (Supply Chain) regressed due to confidence inflation without new reasoning. The multi-turn debate chapter is formally closed.

The Convergence — Multi-AI Tribunal Mar 2026

Battle Plan and Compendium submitted to GPT-5.4 Pro, Gemini 3.1 Pro, and Perplexity for independent review. Unanimous consensus: ship single-turn as the production path. The failure mode is architectural (asymmetric calibration), not fixable by prompting. Independently corroborated by ETH Zurich's "Can AI Agents Agree?" paper.

Sessions 016–017 — Multi-Source Telemetry Mar 2026

Syslog extractor (8 message types: SSH, firewall, sudo, systemd) and NetFlow extractor (8 flow types, 14 facts per record). Three independent telemetry sources feeding richer cross-source evidence to the dialectical agents. Cumulative: ~1,488 tests.

Session 022 — Escalation Gate Mar 2026

Built confidence-band escalation gate at [0.35, 0.70]. Critical finding: all 7 actual errors are MISCALIBRATED — the system is confidently wrong, not uncertainly wrong. The gate treats the wrong disease. Pivot to miscalibration detection via per-claim evidence audit. Cumulative: 1,736 tests.
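For illustration, here is the original band gate alongside a per-claim audit in the pivoted direction. Names and thresholds are assumptions; in particular, `miscalibration_flags` and its evidence-per-confidence rule are a hypothetical stand-in for the project's actual audit, shown only to contrast the two failure models.

```python
def should_escalate(confidence: float, low: float = 0.35, high: float = 0.70) -> bool:
    """Band gate: escalate to a human only when the verdict confidence
    lands inside the uncertain band [low, high]."""
    return low <= confidence <= high

def miscalibration_flags(claims: list, facts_per_conf: float = 3.0) -> list:
    """Per-claim evidence audit (hypothetical rule): flag claims whose
    confidence outpaces the evidence they cite -- catching the
    confidently wrong, which the band gate structurally misses."""
    return [c for c in claims
            if c["confidence"] > 0.7
            and len(c["fact_ids"]) < c["confidence"] * facts_per_conf]
```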

Sessions 032–034 — Accuracy Push Mar 2026

33-scenario corpus regenerated at 72.7% baseline. OracleJudgeV2 (delta-based scoring), v3 prompts (exhaustive fact citation), and threshold sweep. V4 prompt calibration confirmed the Architect hits a 0.75 confidence floor regardless of instructions — a structural property of LLM confidence quantization. Final trajectory: 50% → 91.7% (12 scenarios) → 72.7% (33 scenarios) → 81.8% (v3 prompts) → 87.9% (V2 Oracle, best config).

Sessions 029–030 — Visual Interface Mar 2026

WebSocket event emitter and 3D evidence graph. Corpus replay runner validated event sequences across all 33 scenarios deterministically. 1,948 tests passing with zero regressions.

Sessions 035–036 — ARES VISION Mar 2026

Benchmark replay pipeline consuming real LLM data. Standalone HTML/Three.js visualizer rendering evidence facts as particle clusters with citation lines and live confidence bars. Strategic pivot: nw_wrld abandoned in favor of direct WebSocket rendering for full domain control. Final test count: 1,927 passing, 65 skipped, 0 failures.

Going Forward — Publication & Hardening Active

Three open tracks: harden single-turn accuracy past 90%, publish the asymmetric calibration finding as a formal research contribution, and expand ARES VISION into a live operational interface.