Actively finding real vulnerabilities

Find the bugs that actually pay.

Import traffic and Arbiter turns it into a typed state graph. It hunts constraint violations like IDOR, XSS, SQLi, smuggling, and race conditions, then verifies exploitability in Chrome and exports a submission-ready report.

Used to discover & responsibly disclose vulnerabilities in Anthropic & Cloudflare open source projects

100% detection rate on Google's Firing Range (85/85 endpoints)

Built in Rust from scratch — every finding verified in a real browser


Proof, Not Promises

Real vulnerabilities. Responsibly disclosed.

Arbiter has been used to discover security issues in open source projects from Anthropic and Cloudflare. All findings were reported through responsible disclosure. No fabricated testimonials, just results.

Disclosure • 2026

Anthropic Open Source

Vulnerability discovered in Anthropic's open source tooling — the company behind Claude. Responsibly disclosed and acknowledged by their security team.

Responsibly Disclosed

Disclosure • 2026

Cloudflare Open Source

Security issue identified in Cloudflare's open source infrastructure tooling, used by millions of websites globally. Ethically reported via their responsible disclosure program.

Responsibly Disclosed

Benchmark

Google Firing Range

100% detection rate across all 85 test endpoints in Google's XSS Firing Range, Google's open source test bed for automated web security scanners.

85/85 Verified

How It Works

From traffic capture to verified bounty report

Traditional scanners enumerate every possible request and then filter out false positives. Arbiter infers what the application allows, models it as a typed state graph, and searches only for constraint violations, which leaves a far smaller search space.

Step 01

Capture

Import traffic from HAR files, Burp XML exports, or Arbiter's built-in proxy. Handles authenticated sessions and TLS.
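As a rough illustration of the capture step, the sketch below reads HAR 1.2 JSON into a flat list of request/response pairs. The `Exchange` type and `load_exchanges` function are hypothetical names for this example, not Arbiter's API, and the sample HAR is made up.

```python
import json
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class Exchange:
    """One request/response pair from a HAR entry (hypothetical type)."""
    method: str
    path: str
    status: int
    cookies: tuple  # sorted (name, value) pairs sent with the request


def load_exchanges(har_text: str) -> list[Exchange]:
    """Parse HAR 1.2 JSON text into a flat list of exchanges."""
    entries = json.loads(har_text)["log"]["entries"]
    exchanges = []
    for entry in entries:
        request, response = entry["request"], entry["response"]
        cookies = tuple(sorted(
            (c["name"], c["value"]) for c in request.get("cookies", [])
        ))
        exchanges.append(Exchange(
            method=request["method"],
            path=urlsplit(request["url"]).path,
            status=response["status"],
            cookies=cookies,
        ))
    return exchanges


# Made-up single-entry HAR used only to exercise the parser.
SAMPLE_HAR = json.dumps({"log": {"version": "1.2", "entries": [{
    "request": {"method": "GET",
                "url": "https://target.example.com/api/repos/8/secrets",
                "cookies": [{"name": "session", "value": "abc"}]},
    "response": {"status": 200},
}]}})

exchanges = load_exchanges(SAMPLE_HAR)
print(exchanges[0].method, exchanges[0].path, exchanges[0].status)
```

Keeping the cookies alongside each exchange is what lets later stages group traffic by authenticated session.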

Step 02

Analyze

Build a typed state graph from observed traffic. Infer authorization boundaries, session constraints, ordering dependencies, rate limits, and CSRF flows. This is the constraint model that makes targeted detection possible.

Step 03

Detect & Verify

Search across 52 vulnerability classes for constraint violations — not just pattern matches. Every finding is verified in headless Chrome with screenshot evidence.

Step 04

Report

Generate submission-ready reports for HackerOne, Bugcrowd, or Intigriti. Includes PoC code, evidence screenshots, and CVSS scoring.

Inside the Engine

From traffic to verified finding

Arbiter ingests captured exchanges, builds a typed state graph (EndpointSignature, AuthBoundary, CsrfFlow, ParamValueFlow), infers ordering and resource constraints, then verifies every candidate finding in headless Chrome via CDP. You only test what the app could actually allow — and you only file what the browser actually proved.

FIG. 01 Capture → Graph → Verified via Chrome DevTools Protocol
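To make "typed state graph" concrete, here is a minimal sketch of what such a structure could look like. Only the name `EndpointSignature` comes from the text above; its fields, the `StateGraph` class, and the `templatize` helper are assumptions for illustration and say nothing about Arbiter's actual Rust types.

```python
import re
from dataclasses import dataclass, field


@dataclass(frozen=True)
class EndpointSignature:
    """Method plus a path template; the field layout is an assumption."""
    method: str
    path_template: str


def templatize(path: str) -> str:
    """Collapse purely numeric path segments into an {id} placeholder."""
    return re.sub(r"/\d+(?=/|$)", "/{id}", path)


@dataclass
class StateGraph:
    """Hypothetical graph: who called what, and in which order."""
    # endpoint -> set of session ids seen calling it
    callers: dict = field(default_factory=dict)
    # (earlier endpoint, later endpoint) pairs seen inside one session
    transitions: set = field(default_factory=set)

    def observe(self, session_id, method, path, previous=None):
        """Record one exchange and return the endpoint node it maps to."""
        endpoint = EndpointSignature(method, templatize(path))
        self.callers.setdefault(endpoint, set()).add(session_id)
        if previous is not None:
            self.transitions.add((previous, endpoint))
        return endpoint


graph = StateGraph()
login = graph.observe("s1", "POST", "/login")
secrets = graph.observe("s1", "GET", "/api/repos/8/secrets", previous=login)
graph.observe("s2", "GET", "/api/repos/3/secrets")
print(secrets.path_template, len(graph.callers[secrets]))
```

Requests for different repository ids collapse onto one node, which is what allows constraints to be inferred per endpoint rather than per URL.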

Anatomy of a Finding

What an Arbiter finding actually looks like

Concept matters less than what lands in your inbox. Below is the shape of a real disclosure Arbiter produces — redacted target, everything else verbatim. No pattern matching, no "potential" or "likely". A concrete request, a concrete browser response, and a one-line reproduction.

FINDING #f7e3a2c1

VERIFIED · CWE-639 · CVSS 8.8 / High

Cross-tenant repository access via predictable resource id

The /api/repos/{id}/secrets endpoint authenticates the caller's session but does not check whether the requested repo is owned by the caller's tenant. Any authenticated user can read credentials and webhooks from any other tenant's repository simply by guessing the integer id.

Discovery

Arbiter ingested 1,847 captured exchanges from a HAR file and built the state graph. The /api/repos/{id}/secrets endpoint appeared in 4 distinct user sessions. From those sessions Arbiter inferred the constraint:

session.tenant_id == path.repo.owner_tenant_id   // confidence 0.97 across 4 sessions
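One simple way to arrive at a constraint like this is to check how often the caller's tenant matched the resource owner across observed successful reads. The sketch below uses a plain match frequency as the support value; how Arbiter computes its confidence figure is not described here, so the scoring, the `Observation` type, and the sample data are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """A successful read, with the caller's tenant and the owner's tenant."""
    session_tenant: int
    owner_tenant: int


def ownership_support(observations):
    """Fraction of successful reads where caller tenant equals owner tenant."""
    if not observations:
        return 0.0
    matches = sum(o.session_tenant == o.owner_tenant for o in observations)
    return matches / len(observations)


def violates(session_tenant, owner_tenant):
    """A probe violates the inferred constraint when the tenants differ."""
    return session_tenant != owner_tenant


# Made-up observations: every legitimate read stayed inside its own tenant.
observed = [Observation(2, 2), Observation(4, 4),
            Observation(2, 2), Observation(7, 7)]
print(ownership_support(observed), violates(2, 4))
```

High support across independent sessions suggests the application intends the rule, so a request that breaks it becomes a hypothesis worth testing.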

Proof of concept

Arbiter probed the constraint by issuing a request from one tenant's authenticated session for another tenant's repo:

$ curl -sS -b "session=$ATTACKER_SESSION" \
       https://target.example.com/api/repos/8/secrets
{
  "repo_id": 8,
  "owner_tenant": 4,                         // attacker is tenant 2
  "secrets": {
    "AWS_ACCESS_KEY_ID":    "AKIA ··· XQYV",
    "DATABASE_URL":         "postgres://···",
    "SLACK_WEBHOOK":        "https://hooks.slack.com/···"
  }
}

Browser verification

Headless Chrome navigated to the same URL with the attacker's session cookie. Every signal Arbiter records confirms the vulnerability:

HTTP 200 · response body contained credentials in plaintext
DOM snapshot includes a <pre> rendering the secrets
No CSP violations, no auth errors in the console
Screenshot captured: cross-tenant-secrets-render.png

Recommendation

Enforce row-level tenant scoping at every read endpoint that keys off a user-controlled resource id:

where repo.owner_tenant_id = current_session.tenant_id

Pattern matchers will not catch this — the request is syntactically identical to a legitimate one. Only constraint inference catches an authorization gap that has no payload.
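The recommended fix can be sketched with an in-memory SQLite table. The schema, the `read_secrets` helper, and the sample rows are hypothetical; the point is only that the tenant check lives in the query itself rather than in a separate step that can be forgotten.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE repo (id INTEGER PRIMARY KEY, owner_tenant_id INTEGER, secrets TEXT)"
)
conn.executemany(
    "INSERT INTO repo VALUES (?, ?, ?)",
    [(7, 2, "tenant-2-secrets"), (8, 4, "tenant-4-secrets")],
)


def read_secrets(conn, repo_id, session_tenant_id):
    """Return secrets only when the repo belongs to the caller's tenant."""
    row = conn.execute(
        "SELECT secrets FROM repo WHERE id = ? AND owner_tenant_id = ?",
        (repo_id, session_tenant_id),
    ).fetchone()
    # No row means "not found for this tenant"; a handler would answer 404.
    return row[0] if row else None


print(read_secrets(conn, 8, 2))  # tenant 2 asking for tenant 4's repo
print(read_secrets(conn, 8, 4))  # the owner asking for its own repo
```

Returning the same "not found" result for missing and foreign repositories also avoids confirming which ids exist.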

Export

This finding is one click away from any of:

HackerOne submission · markdown with severity, repro, impact
Bugcrowd / Intigriti format · identical content, platform-specific frontmatter
SARIF 2.1.0 · machine-readable for CI/CD pipelines
Raw HTTP · full request & response capture for replay
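For readers unfamiliar with SARIF, the sketch below wraps the finding above in a minimal SARIF 2.1.0 document. The mapping (CWE as rule id, CVSS in a property bag, the target URL as the artifact location) and the `finding_to_sarif` name are assumptions; Arbiter's actual export layout is not shown on this page.

```python
import json


def finding_to_sarif(finding: dict) -> dict:
    """Wrap one finding in a minimal SARIF 2.1.0 log (hypothetical mapping)."""
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {
                "name": "Arbiter",
                "rules": [{
                    "id": finding["cwe"],
                    "shortDescription": {"text": finding["title"]},
                }],
            }},
            "results": [{
                "ruleId": finding["cwe"],
                "level": "error",
                "message": {"text": finding["title"]},
                "locations": [{"physicalLocation": {
                    "artifactLocation": {"uri": finding["url"]},
                }}],
                # SARIF property bags carry tool-specific extras.
                "properties": {"cvss": finding["cvss"], "findingId": finding["id"]},
            }],
        }],
    }


finding = {
    "id": "f7e3a2c1",
    "cwe": "CWE-639",
    "cvss": 8.8,
    "title": "Cross-tenant repository access via predictable resource id",
    "url": "https://target.example.com/api/repos/8/secrets",
}
sarif = finding_to_sarif(finding)
print(json.dumps(sarif["runs"][0]["results"][0]["ruleId"]))
```

Because the output is plain JSON, a CI job can fail a build on any result whose `level` is `error`.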

Honest Comparison

Arbiter vs. the tools you already use

Burp is manual. Nuclei matches signatures. ZAP is noisy. Arbiter infers constraints and reasons about logic.

Side-by-side comparison of Arbiter against Burp Suite Pro, Nuclei, and OWASP ZAP across reasoning, detection, output, and foundation capabilities.
Capability	Arbiter	Burp Suite Pro	Nuclei	OWASP ZAP
Reasoning & analysis
Constraint inference from traffic	Yes	No	No	No
State graph construction	Yes	No	No	No
Static + dynamic correlation	narsil-mcp integration	No	No	No
Detection & verification
Real browser verification	Full Chrome + evidence	No	Headless only	Basic
Race condition exploitation	H/2 single-packet, <100µs	Manual via Turbo Intruder	No	No
WAF detection & bypass	Fingerprint + AI mutation	Extensions	Templates	No
Vulnerability detectors	52 vulnerability classes	Scanner + extensions	6,000+ templates	Active + passive
Output & integration
AI agent integration (MCP)	267 MCP tools + REST API	No	No	No
Bug bounty report generation	HackerOne, Bugcrowd, Intigriti	No	Basic markdown	No
Foundation
Performance	Rust, zero-GC	JVM-based	Go	JVM-based
Price	Free tier + Pro	$449/yr	Free (OSS)	Free (OSS)

Nuclei excels at known CVE scanning with its massive template library. Burp's manual testing workflow is mature. Arbiter's advantage is logic-aware discovery and verification — it's complementary, not a wholesale replacement for every use case.

What It Finds

Real vulnerabilities, verified in production

Not a wrapper around existing tools. Built from scratch in Rust with constraint inference, state graph construction, and browser verification.

267

MCP Tools

Full AI agent integration + REST API

52

Vuln Classes

XSS, IDOR, SQLi, SSRF, race conditions

100%

Firing Range

85/85 Google benchmark endpoints

<100µs

Race Precision

Single-packet H/2 timing attacks

0

False Positives

Every finding verified in headless Chrome

3

Bug Bounty Platforms

HackerOne, Bugcrowd, Intigriti exports

Capabilities

What Arbiter actually does

Each capability is built to find the vulnerability classes that pay the highest bounties.

Most Differentiated Capability

HTTP/2 Single-Packet Race Attacks

Implements James Kettle's "Smashing the State Machine" research from PortSwigger. All HTTP/2 requests are packed into a single TCP segment, achieving sub-100 microsecond timing precision. This eliminates network jitter and exposes race conditions that other tools miss entirely.

<100µs Timing precision

9 Race patterns

3 Execution modes

Detects double-spend, limit bypass, TOCTOU, and coupon abuse vulnerabilities. Automatic shared-resource analysis identifies race-prone endpoints.
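Kettle's published description of the single-packet attack sends most of each request early, withholds a small final frame per stream, and then releases all the final frames together. The sketch below covers only the framing arithmetic of that last step, with no network I/O. It assumes bodyless requests whose withheld frame is an empty DATA frame carrying END_STREAM, and the 1,460-byte segment size is an assumption about the network path. It is not Arbiter's implementation.

```python
import struct

FRAME_TYPE_DATA = 0x0
FLAG_END_STREAM = 0x1
TYPICAL_MSS = 1460  # common Ethernet TCP payload size; an assumption


def empty_end_stream_frame(stream_id: int) -> bytes:
    """9-byte HTTP/2 frame header for a zero-length DATA frame with END_STREAM."""
    length = 0
    header = struct.pack(">I", length)[1:]  # 24-bit payload length
    header += struct.pack(">BB", FRAME_TYPE_DATA, FLAG_END_STREAM)
    header += struct.pack(">I", stream_id & 0x7FFFFFFF)  # reserved bit cleared
    return header


def final_burst(request_count: int) -> bytes:
    """Concatenate the withheld final frames for client streams 1, 3, 5, ..."""
    stream_ids = [2 * i + 1 for i in range(request_count)]
    return b"".join(empty_end_stream_frame(s) for s in stream_ids)


burst = final_burst(20)
print(len(burst), len(burst) <= TYPICAL_MSS)
```

Twenty completing frames take 180 bytes before TLS record overhead, so they fit comfortably in one segment. A real client would also need to write the burst in a single call with Nagle's algorithm disabled so the kernel does not split or delay it.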

AI-Native

267 MCP Tools + REST API

Arbiter is an MCP server and REST API. Connect it to Claude, and the AI can orchestrate full security assessments autonomously — from traffic import to verified report. This isn't a chatbot wrapper. It's 267 tools the AI reasons with.

stdio + SSE transport • REST API • Human-in-loop safety gates • Full audit logging

Browser Verification

Zero False Positives

Every XSS, every injection, every bypass is confirmed in a real headless Chrome instance before it reaches your report. DOM snapshots, console logs, network traces, and screenshots are captured as evidence.

CSP-aware • Full evidence chain • Exploitability confirmed, not guessed

WAF Bypass

Detect, Fingerprint, Bypass

Identifies WAF vendors (Cloudflare, Akamai, AWS WAF, etc.), profiles blocking behaviour, and generates bypass payloads. The Polychrome AI module uses SLM-powered mutation for novel evasion.

10+ WAF vendors • Encoding mutation • Lazarus crash detection

52 Vulnerability Classes

Every Major Vulnerability Class

XSS (reflected, stored, DOM), SQLi (error, blind, time-based across 5 database engines), IDOR, SSRF, CSRF, command injection, path traversal, CORS misconfiguration, JWT algorithm confusion, HTTP smuggling (CL.TE, TE.CL, CL.0, H2 desync), cache poisoning, XS-Leaks, OAuth/OIDC redirect manipulation, SSTI across 12 template engines, prompt injection detection, static analysis across 32 languages, and more.

Constraint-aware • Not pattern matching • Business logic • Protocol-level attacks

Report Generation

Submission-Ready in Minutes

Generates platform-specific reports for HackerOne, Bugcrowd, and Intigriti. Includes reproduction steps, PoC code (curl, Python, JavaScript, raw HTTP), CVSS scoring, and remediation guidance.

Bulk generation • Evidence auto-attached • Markdown + JSON export
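For context on the CVSS figure in a report, the sketch below implements the CVSS v3.1 base score for the scope-unchanged case. The vector passed in at the end is an assumption chosen because it produces 8.8; this page does not state which vector was used for the example finding, and the function names are illustrative.

```python
import math

# CVSS v3.1 metric weights for the scope-unchanged case.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
PR = {"N": 0.85, "L": 0.62, "H": 0.27}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(av, ac, pr, ui, c, i, a) -> float:
    """Base score for a vector whose scope is unchanged."""
    iss = 1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a])
    impact = 6.42 * iss
    if impact <= 0:
        return 0.0
    exploitability = 8.22 * AV[av] * AC[ac] * PR[pr] * UI[ui]
    return roundup(min(impact + exploitability, 10))


# Assumed vector: AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H
score = base_score("N", "L", "L", "N", "H", "H", "H")
print(score)  # 8.8
```

Scores from 7.0 to 8.9 fall in the High band, which matches the "8.8 / High" label on the example finding.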

Pricing

Start free. Scale when you're ready.

No credit card required. Upgrade as your targets grow.

Free

$0/mo

5 scans per month
Core detectors
Browser verification
HackerOne report export
Community support

Join Waitlist

Frequently asked questions

How is Arbiter different from running Burp Suite with an AI wrapper?

Arbiter isn't a wrapper. It's built from scratch in Rust with constraint inference, state graph construction, and browser verification. AI agents get 267 structured MCP tools to reason with — not a chatbot pasted over a GUI.

Can I use Arbiter without AI agents?

Yes. Arbiter has a full CLI and REST API. The MCP integration is an additional interface, not a requirement. You can run scans, generate reports, and use all detectors directly from the command line.

What's the false positive rate?

Every finding is verified in headless Chrome with screenshot evidence. If it's in your report, it's exploitable. This is the core difference from signature-based scanners that flag patterns without confirming exploitability.

Does Arbiter work behind a WAF?

Yes. Arbiter fingerprints 10+ WAF vendors (Cloudflare, Akamai, AWS WAF, etc.) and generates bypass payloads. The Polychrome module adds SLM-powered mutation for novel evasion.

Is my data safe?

Arbiter runs locally. No scan data leaves your machine unless you explicitly export it. Enterprise customers can deploy on-prem with full network isolation.

How do 267 tools scale without overwhelming the AI's context window?

Both Arbiter and Aletheia integrate with Forgemax, our open source MCP gateway. Instead of loading all 267 tool schemas into the LLM's context (~53,000 tokens), Forgemax collapses them into just two tools: search and execute. The agent discovers capabilities dynamically and writes JavaScript to chain tool calls inside a sandboxed V8 isolate. Total context overhead is ~1,100 tokens regardless of how many tools exist, a reduction of roughly 98%. The sandbox isolates agent code with AST validation, opaque credential bindings, and process-level isolation.
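The sketch below illustrates the two-tool idea and checks the arithmetic on the approximate token figures quoted above. The `Gateway` class and the tool names are hypothetical, and the sandboxed JavaScript execution that Forgemax provides is left out entirely; this only shows how a large registry can sit behind `search` and `execute`.

```python
FULL_SCHEMA_TOKENS = 53_000  # approximate figure quoted above
GATEWAY_TOKENS = 1_100       # approximate figure quoted above

reduction = 1 - GATEWAY_TOKENS / FULL_SCHEMA_TOKENS
print(f"{reduction:.1%}")  # 97.9%


class Gateway:
    """Two entry points in front of any number of registered tools."""

    def __init__(self):
        self._tools = {}  # name -> (description, callable)

    def register(self, name, description, fn):
        self._tools[name] = (description, fn)

    def search(self, query):
        """Return names of tools whose name or description mentions the query."""
        q = query.lower()
        return sorted(
            name for name, (description, _) in self._tools.items()
            if q in name.lower() or q in description.lower()
        )

    def execute(self, name, **arguments):
        """Dispatch a call to one registered tool by name."""
        _, fn = self._tools[name]
        return fn(**arguments)


gateway = Gateway()
# Hypothetical tools standing in for a much larger registry.
gateway.register("import_har", "Load captured traffic from a HAR file",
                 lambda path: f"imported {path}")
gateway.register("detect_idor", "Search for object-level authorization gaps",
                 lambda endpoint: f"checked {endpoint}")

print(gateway.search("har"))
print(gateway.execute("import_har", path="capture.har"))
```

Only the two method signatures need to be described to the model, so the context cost stays flat as tools are added.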

How does constraint inference work?

Arbiter observes real application traffic and infers what the application enforces — authentication boundaries, parameter ordering, rate limits, CSRF token flows, and state transitions. It models this as a typed state graph, then searches for requests that violate those constraints. This means it only tests meaningful hypotheses, not random mutations.

What about HTTP smuggling and protocol-level attacks?

Arbiter has dedicated detection for CL.TE, TE.CL, TE.TE, CL.0, and HTTP/2 desync smuggling attacks, plus cache poisoning via unkeyed headers, parameter cloaking, and fat GET. These are protocol-level vulnerabilities that pattern-matching scanners typically miss entirely.

Early Access

Arbiter is currently in closed development.

Join the waitlist to get early access when we open the beta. No spam. One email when it's ready.

Questions? Interested in collaborating?

[email protected]