Actively finding real vulnerabilities

The lab bench for AI security agents.

Arbiter turns captured web traffic into verified exploit evidence and bounty-ready reports. Aletheia gives agents structured binary analysis: lifting, SSA decompilation, taint, concolic proofs, and hybrid fuzzing. Both are Rust MCP servers built for the work agents can actually finish responsibly.


2 Products

407 MCP Tools

85/85 Firing Range

100% Rust

The Platform

Two tools. One mission.

Attack the web and reverse the binary. AI agents get structured, programmatic access to both.

Arbiter

Web security for autonomous agents

For penetration testers and bug bounty hunters

Import traffic from any source. Arbiter builds a state graph, infers authorization and ordering constraints, then searches for violations across 52 vulnerability classes. Every finding is verified in a real browser and exported with evidence. Not a scanner — a reasoning engine.
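To make "inferring an authorization constraint from traffic" concrete, here is a minimal Python sketch. It is not Arbiter's implementation: Arbiter is a Rust MCP server and its internals are not described on this page. The `Request` record, the `infer_owners` and `find_violations` helpers, and the sample traffic are all hypothetical. The sketch assumes each captured request is already labeled with the acting user, infers the rule "only the creator of a resource may act on it", and flags successful requests that break that rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Hypothetical record for one captured HTTP exchange."""
    user: str
    method: str
    resource: str
    status: int


def is_success(status):
    return 200 <= status < 300


def infer_owners(traffic):
    """Assume the first user who successfully POSTs a resource owns it."""
    owners = {}
    for req in traffic:
        if req.method == "POST" and is_success(req.status):
            owners.setdefault(req.resource, req.user)
    return owners


def find_violations(traffic, owners):
    """Return successful requests made by anyone other than the inferred owner."""
    return [
        req
        for req in traffic
        if req.resource in owners
        and req.user != owners[req.resource]
        and is_success(req.status)
    ]


traffic = [
    Request("alice", "POST", "/invoices/17", 201),
    Request("alice", "GET", "/invoices/17", 200),
    Request("bob", "GET", "/invoices/17", 403),     # correctly denied
    Request("bob", "DELETE", "/invoices/17", 204),  # breaks the inferred rule
]

owners = infer_owners(traffic)
violations = find_violations(traffic, owners)
for v in violations:
    print(f"candidate: {v.user} {v.method} {v.resource} -> {v.status}")
```

A hit from a search like this is only a candidate. The workflow described above treats it as a hypothesis that still has to be verified in a real browser before it counts as a finding.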

Race Conditions • Browser Verification • HTTP Smuggling • WAF Bypass • Bug Bounty Reports • Cache Poisoning • AI Security • Static Analysis

267 MCP Tools

52 Vuln Classes

85/85 Firing Range

100% Browser Verified


Aletheia

Binary analysis for agents that need more than strings

For reverse engineers and malware analysts

Load PE, ELF, or Mach-O binaries. Aletheia disassembles four architectures, lifts to a 43-opcode IR, constructs SSA form, and decompiles to typed C. Taint, concolic, and hybrid fuzzing workflows detect 14 CWE classes with concrete witnesses, CVSS scoring, and SARIF output.
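As an illustration of why SSA form helps taint analysis, the following Python sketch propagates taint through a toy straight-line SSA program. It does not use Aletheia's 43-opcode IR, which is not specified on this page. The `Instr` record and the op names `read_input` and `memcpy_len` are hypothetical. Because every SSA name is defined exactly once, a single forward pass is enough for straight-line code; loops with phi nodes would need a fixpoint iteration instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical SSA instruction: dest is the name it defines ("" if none)."""
    dest: str
    op: str
    args: tuple


SOURCES = {"read_input"}  # assumed: ops whose result is attacker-controlled
SINKS = {"memcpy_len"}    # assumed: ops that must not receive tainted arguments


def taint_flows(program):
    """Single forward pass over straight-line SSA code."""
    tainted = set()
    findings = []
    for instr in program:
        tainted_args = [a for a in instr.args if a in tainted]
        if instr.op in SINKS and tainted_args:
            findings.append((instr.op, tainted_args))
        if instr.dest and (instr.op in SOURCES or tainted_args):
            tainted.add(instr.dest)
    return tainted, findings


program = [
    Instr("n_1", "read_input", ()),
    Instr("c_1", "const", ("16",)),
    Instr("n_2", "add", ("n_1", "c_1")),  # derived from input, so tainted
    Instr("k_1", "mul", ("c_1", "c_1")),  # constants only, so clean
    Instr("", "memcpy_len", ("n_2",)),    # tainted length reaches a sink
    Instr("", "memcpy_len", ("k_1",)),    # clean length, no finding
]

tainted, findings = taint_flows(program)
print(sorted(tainted))
print(findings)
```

A taint hit alone says a path may exist. The concrete witnesses mentioned above would come from a later step, such as concolic execution or fuzzing, that produces an input actually reaching the sink.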

SSA Decompilation • Concolic Falsification • Hybrid Fuzzing • Evasion Detection • Taint Analysis • Crypto Signatures • MITRE ATT&CK • Vulnerability Scanning

140 MCP Tools

14 CWE Classes

100% CTF Detection

4 Architectures


For the AI Security Era

Built for the post-Mythos security workflow

The next generation of models won’t be limited by whether they can imagine a vulnerability. They’ll be limited by whether they can inspect the right state, run the right experiment, verify the result, preserve evidence, and stay in scope — without burning context on plumbing. That’s the layer Arbiter Security builds.

Structured Tools

MCP APIs instead of GUI scraping. Typed inputs, JSON outputs, composable workflows. Fewer tool calls per finding, less context spent parsing — agents finish faster, with cleaner audit trails.
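A rough sketch of what "typed inputs, JSON outputs" can look like from the agent side, in Python. The envelope follows the MCP `tools/call` request shape (JSON-RPC 2.0 with `name` and `arguments` under `params`). Everything else is an assumption for illustration: the tool name `import_traffic`, the `ImportTrafficArgs` fields, and the accepted formats are hypothetical and are not a documented Arbiter schema.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ImportTrafficArgs:
    """Hypothetical typed input for a hypothetical tool."""
    source_format: str
    path: str

    def __post_init__(self):
        # Assumed set of formats, chosen only to show validation before the call.
        if self.source_format not in {"har", "pcap"}:
            raise ValueError(f"unsupported source_format: {self.source_format}")


def build_tool_call(request_id, tool_name, args):
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": asdict(args)},
    }


call = build_tool_call(1, "import_traffic", ImportTrafficArgs("har", "session.har"))
print(json.dumps(call, indent=2))
```

Validating arguments before the request is sent means a malformed call fails locally with a clear error instead of costing the agent a round trip and the context needed to parse a failure message.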

Verification First

Browser proof, concolic witnesses, SARIF, screenshots, traces, reproducible reports. No “potential” or “likely” — every finding ships with evidence a human can replay.

Responsible Control

Scope boundaries, audit logs, structured outputs ready for disclosure pipelines. The same instrumentation that helps an agent find a bug also helps you prove it stayed inside the lines.
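One way to picture a scope boundary paired with an audit log is the Python sketch below. It is an assumption-laden illustration, not the product's mechanism: the `ScopeGuard` class, the `replay_request` tool name, and the host-suffix rule are all hypothetical. Every decision, allowed or denied, is appended to the log so the record shows what the agent attempted as well as what it did.

```python
import json
from urllib.parse import urlsplit


class ScopeGuard:
    """Hypothetical gate: allow a tool call only if its target host is in scope."""

    def __init__(self, allowed_hosts):
        self.allowed_hosts = {h.lower() for h in allowed_hosts}
        self.audit_log = []

    def in_scope(self, url):
        host = (urlsplit(url).hostname or "").lower()
        # Match the host itself or a true subdomain, never a lookalike prefix.
        return any(host == h or host.endswith("." + h) for h in self.allowed_hosts)

    def authorize(self, tool, url):
        allowed = self.in_scope(url)
        self.audit_log.append({"tool": tool, "url": url, "allowed": allowed})
        return allowed


guard = ScopeGuard(["example.com"])
results = [
    guard.authorize("replay_request", "https://api.example.com/v1/users"),
    guard.authorize("replay_request", "https://example.com.evil.test/login"),
]
print(results)
print(json.dumps(guard.audit_log, indent=2))
```

The second URL only looks like the in-scope host. Comparing parsed hostnames rather than raw string prefixes is what keeps it out.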

The thesis: the bottleneck is no longer reasoning — it’s instrumentation, verification, and control.

Real Results

Proof, not promises.

Arbiter has been used to discover and responsibly disclose real vulnerabilities in production open source projects.

Disclosure • 2026

Anthropic Open Source

Vulnerability discovered in open source tooling from Anthropic, the company behind Claude. Responsibly disclosed and acknowledged by their security team.

Responsibly Disclosed

Disclosure • 2026

Cloudflare Open Source

Security issue identified in Cloudflare's open source infrastructure tooling, used by millions of websites globally. Ethically reported via their responsible disclosure program.

Responsibly Disclosed

Benchmark

Google Firing Range

100% detection rate across all 85 test endpoints in Google's XSS Firing Range, a widely used test bed for measuring vulnerability detection accuracy.

85/85 Verified

Philosophy

Why we build this way

Existing security tools weren't built for AI agents. They have GUIs, not APIs. Heuristics, not reasoning. Pattern matching, not constraint inference. We started from scratch.

Agent-First

Human GUIs are bottlenecks. Every capability is exposed as a structured MCP tool with JSON input and output. AI agents can orchestrate entire security assessments autonomously.

Rust, From Scratch

No wrappers. No FFI to legacy C code. Both tools are built in pure Rust with memory safety, zero-GC performance, and TDD-first development. Every component is tested before the next begins.

Deterministic Output

Security tools must be reproducible. Same input, same output. Structured JSON responses, explicit error handling, and full audit trails — the kind of reliability agents can depend on.
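"Same input, same output" can be checked mechanically if reports are serialized canonically. The Python sketch below is a minimal illustration under assumed names: the `canonical_report` helper and the finding fields `endpoint` and `class` are hypothetical. It sorts findings, serializes with sorted keys and fixed separators, and hashes the result so two runs can be compared by digest.

```python
import hashlib
import json


def canonical_report(findings):
    """Serialize findings so that discovery order and key order do not matter."""
    ordered = sorted(findings, key=lambda f: (f["endpoint"], f["class"]))
    body = json.dumps({"findings": ordered}, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    return body, digest


# Two runs that found the same issues in a different order, with different key order.
run_a = [
    {"class": "xss", "endpoint": "/search"},
    {"class": "race", "endpoint": "/cart"},
]
run_b = [
    {"endpoint": "/cart", "class": "race"},
    {"endpoint": "/search", "class": "xss"},
]

body_a, digest_a = canonical_report(run_a)
body_b, digest_b = canonical_report(run_b)
print(body_a)
print(digest_a == digest_b)
```

Comparing digests gives an agent, or an auditor, a cheap equality check between a stored report and a replayed one.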

Early Access

Both tools are in closed development.

Join the waitlist to get early access. Tell us which product you're interested in. No spam. One email when the beta opens.

Arbiter (Web Security) • Aletheia (Binary Analysis) • Both

Questions? Interested in collaborating?

[email protected]
