Structured Tools
MCP APIs instead of GUI scraping. Typed inputs, JSON outputs, composable workflows. Fewer tool calls per finding, less context spent parsing — agents finish faster, with cleaner audit trails.
Arbiter turns captured web traffic into verified exploit evidence and bounty-ready reports. Aletheia gives agents structured binary analysis: lifting, SSA decompilation, taint, concolic proofs, and hybrid fuzzing. Both are Rust MCP servers built for the work agents can actually finish responsibly.
Attack the web and reverse the binary. AI agents get structured, programmatic access to both.
Import traffic from any source. Arbiter builds a state graph, infers authorization and ordering constraints, then searches for violations across 52 vulnerability classes. Every finding is verified in a real browser and exported with evidence. Not a scanner — a reasoning engine.
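As a rough illustration of inferring an authorization constraint from traffic, the sketch below learns which roles were seen succeeding on each endpoint and flags a replayed request that succeeds for a role outside that set. The record shape, role names, and function names are all hypothetical; this is a toy under stated assumptions, not Arbiter's actual algorithm or data model.

```python
from collections import defaultdict

# Hypothetical captured traffic: (role, method, path, status).
TRAFFIC = [
    ("admin", "GET", "/admin/users", 200),
    ("admin", "DELETE", "/admin/users/7", 200),
    ("user", "GET", "/profile", 200),
    ("admin", "GET", "/profile", 200),
    ("user", "GET", "/admin/users", 403),
]


def infer_allowed_roles(traffic):
    """Map each (method, path) to the set of roles observed succeeding on it."""
    allowed = defaultdict(set)
    for role, method, path, status in traffic:
        if 200 <= status < 300:
            allowed[(method, path)].add(role)
    return dict(allowed)


def find_violations(allowed, replayed):
    """Return replayed requests that succeed for a role outside the inferred set.

    Endpoints never seen in the baseline are skipped rather than flagged.
    """
    return [
        (role, method, path)
        for role, method, path, status in replayed
        if 200 <= status < 300
        and role not in allowed.get((method, path), {role})
    ]


allowed = infer_allowed_roles(TRAFFIC)
replayed = [
    ("user", "DELETE", "/admin/users/7", 200),  # succeeds, but only admin was seen here
    ("user", "GET", "/profile", 200),           # consistent with the baseline
]
violations = find_violations(allowed, replayed)
```

A candidate like the one in `violations` is only a hypothesis until it is replayed and confirmed, which is the role the browser verification step plays in the description above.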
Load PE, ELF, or Mach-O binaries. Aletheia disassembles four architectures, lifts to a 43-opcode IR, constructs SSA form, and decompiles to typed C. Taint, concolic, and hybrid fuzzing workflows detect 14 CWE classes with concrete witnesses, CVSS scoring, and SARIF output.
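To make the taint step concrete, here is a minimal sketch of forward taint propagation over a straight-line program in SSA form. The instruction tuples, opcode names, and the source and sink tables are invented for illustration and do not reflect Aletheia's IR. Because every SSA name is defined exactly once, a single forward pass is enough for code without branches.

```python
# Hypothetical SSA instructions: (dest, op, args). Not Aletheia's real IR.
PROGRAM = [
    ("n0", "read_input", []),               # taint source: attacker-controlled value
    ("k0", "const", [16]),
    ("n1", "add", ["n0", "k0"]),            # derived from tainted n0
    ("b0", "alloc", ["k0"]),                # fixed-size buffer
    ("s0", "const", [0]),
    ("r0", "memcpy", ["b0", "s0", "n1"]),   # sink: length argument is tainted
]

SOURCES = {"read_input"}
SINKS = {"memcpy": 2}  # opcode -> index of the argument that must not be tainted


def find_tainted_sinks(program):
    """Return (instruction index, opcode, tainted argument) for each tainted sink."""
    tainted = set()
    findings = []
    for index, (dest, op, args) in enumerate(program):
        if op in SINKS and args[SINKS[op]] in tainted:
            findings.append((index, op, args[SINKS[op]]))
        # A value is tainted if it comes from a source or from any tainted operand.
        if op in SOURCES or any(arg in tainted for arg in args):
            tainted.add(dest)
    return findings


findings = find_tainted_sinks(PROGRAM)
```

A real analysis also has to handle branches, loops, memory, and calls, and a concolic stage would then try to produce a concrete input that actually reaches the sink.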
The next generation of models won’t be limited by whether they can imagine a vulnerability. They’ll be limited by whether they can inspect the right state, run the right experiment, verify the result, preserve evidence, and stay in scope — without burning context on plumbing. That’s the layer Arbiter Security builds.
Browser proof, concolic witnesses, SARIF, screenshots, traces, reproducible reports. No “potential” or “likely” — every finding ships with evidence a human can replay.
Scope boundaries, audit logs, structured outputs ready for disclosure pipelines. The same instrumentation that helps an agent find a bug also helps you prove it stayed inside the lines.
The thesis: the bottleneck is no longer reasoning — it’s instrumentation, verification, and control.
Arbiter has been used to discover and responsibly disclose real vulnerabilities in production open source projects.
Vulnerability discovered in open source tooling from Anthropic, the company behind Claude. Responsibly disclosed and acknowledged by their security team.
Responsibly Disclosed
Security issue identified in Cloudflare's open source infrastructure tooling, used by millions of websites globally. Ethically reported via their responsible disclosure program.
Responsibly Disclosed
100% detection rate across all 85 test endpoints in Google's XSS Firing Range, the industry standard benchmark for vulnerability detection accuracy.
85/85 Verified
Existing security tools weren't built for AI agents. They have GUIs, not APIs. Heuristics, not reasoning. Pattern matching, not constraint inference. We started from scratch.
Human GUIs are bottlenecks. Every capability is exposed as a structured MCP tool with JSON input and output. AI agents can orchestrate entire security assessments autonomously.
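For readers unfamiliar with MCP, a tool invocation is a JSON-RPC 2.0 request with the method `tools/call`, and the result carries a list of content items plus an `isError` flag. The sketch below builds such a request and parses a simulated response. The tool name `list_findings`, its `session_id` argument, and the response payload are hypothetical placeholders, not the documented interface of either product.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })


def parse_tool_result(raw_response):
    """Decode the JSON payloads carried in the text content items of a result."""
    message = json.loads(raw_response)
    if "error" in message:  # protocol-level failure
        raise RuntimeError(message["error"]["message"])
    result = message["result"]
    if result.get("isError"):  # tool-level failure
        raise RuntimeError("tool reported an error")
    return [
        json.loads(item["text"])
        for item in result["content"]
        if item["type"] == "text"
    ]


# Hypothetical tool name and argument, for illustration only.
request = build_tool_call(1, "list_findings", {"session_id": "demo"})

# Simulated server reply; a real client would read this from the transport.
simulated_response = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [{"type": "text", "text": json.dumps({"findings": []})}],
        "isError": False,
    },
})
payloads = parse_tool_result(simulated_response)
```

Because both sides exchange plain JSON, an agent can chain calls by feeding fields from one result into the arguments of the next without any screen parsing.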
No wrappers. No FFI to legacy C code. Both tools are built in pure Rust with memory safety, zero-GC performance, and TDD-first development. Every component is tested before the next begins.
Security tools must be reproducible. Same input, same output. Structured JSON responses, explicit error handling, and full audit trails — the kind of reliability agents can depend on.
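One common way to get "same input, same output" at the serialization layer, and an audit trail that shows tampering, is to encode each record as canonical JSON and chain the records by hash. The sketch below shows that general technique using only the Python standard library; the record fields and function names are hypothetical, and nothing here is taken from the products' implementation.

```python
import hashlib
import json


def canonical(record):
    """Serialize with sorted keys and fixed separators so equal data yields equal bytes."""
    return json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=True)


def append_entry(log, record):
    """Append a record whose hash covers both its content and the previous hash."""
    previous = log[-1]["hash"] if log else "0" * 64
    digest = hashlib.sha256((previous + canonical(record)).encode("utf-8")).hexdigest()
    log.append({"record": record, "prev": previous, "hash": digest})
    return digest


def verify(log):
    """Recompute every hash in order; any edited or reordered entry breaks the chain."""
    previous = "0" * 64
    for entry in log:
        expected = hashlib.sha256(
            (previous + canonical(entry["record"])).encode("utf-8")
        ).hexdigest()
        if entry["prev"] != previous or entry["hash"] != expected:
            return False
        previous = entry["hash"]
    return True


# Hypothetical audit records; key order differs but the data is identical.
log_a, log_b = [], []
hash_a = append_entry(log_a, {"tool": "scan", "target": "example.test"})
hash_b = append_entry(log_b, {"target": "example.test", "tool": "scan"})
```

Here `hash_a` and `hash_b` are equal because canonicalization removes key-order differences, and `verify` returns `False` as soon as any stored record is altered.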
Join the waitlist to get early access. Tell us which product you're interested in. No spam. One email when the beta opens.
Questions? Interested in collaborating?
[email protected]