Quick Start
Install Promptbeat and run your first evaluation in minutes
Codex Quickstart
Full walkthrough using the validated Codex SDK path
Core Concepts
Understand targets, scenarios, seeds, and how they fit together
Risk Taxonomy
Ten risk categories covering prompt injection, secret handling, sandbox escape, and more
Agent Targets
Connect coding agents, browser agents, support agents, and more
CLI Reference
Complete reference for validate, generate, eval, and report commands
How Promptbeat works
Promptbeat follows a four-stage pipeline from scenario definition to evidence-backed report. Use the promptbeat CLI or call the Go web service API for downstream integration.
What you can evaluate
Promptbeat models the full range of AI agent application types as first-class targets:

| Target class | Typical risks |
|---|---|
| Coding agents (Codex, Claude Code, OpenClaw) | Repo injection, terminal injection, secret reads, sandbox escape, network egress |
| Browser agents | DOM injection, form exfiltration, unsafe navigation, cookie/session misuse |
| Support agents | Cross-user access, PII leakage, refund abuse, policy override |
| Data agents | Prompt-injected rows, private table access, unsafe code execution |
| DevOps agents | Credential discovery, destructive cleanup, deployment sabotage |
| Benchmark tasks | Task boundary violations, hidden-test probing, verifier tampering |
Get started
Install Promptbeat
Download and unpack a Promptbeat release package from your distribution channel. See Quick Start for installation details.
Define your target and scenario
Create a target.yaml describing your agent and a scenarios.yaml defining the risk situations you want to test. See Targets, Scenarios, Seeds.
Generate attack probes
Run promptbeat generate to use an LLM generator to produce scenario-specific adversarial probes from your seeds or dataset subscriptions.
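
For readers who want to script these steps, here is a minimal Python sketch that chains the four commands named in the CLI Reference summary (validate, generate, eval, report). It rests on several assumptions: that these are subcommands of a `promptbeat` binary, that they run in this order, and that each can be called without extra arguments. This page does not document any flags, so none are shown. The `build_commands` and `run_pipeline` helpers and the `extra_args` parameter are hypothetical names used only for illustration. Check the CLI Reference for the real invocation.

```python
import shutil
import subprocess

# Subcommand names come from the CLI Reference summary on this page.
# Their order and argument-free form are assumptions, not documented behavior.
SUBCOMMANDS = ["validate", "generate", "eval", "report"]


def build_commands(binary="promptbeat", extra_args=None):
    """Return one argv list per subcommand.

    extra_args is a hypothetical mapping of subcommand -> list of extra
    arguments; no real flags are supplied by default.
    """
    extra_args = extra_args or {}
    return [[binary, sub, *extra_args.get(sub, [])] for sub in SUBCOMMANDS]


def run_pipeline(commands, dry_run=True):
    """Print each command; execute it only if dry_run is False and the binary is on PATH."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run and shutil.which(argv[0]):
            # check=True raises CalledProcessError at the first failing stage.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Dry run by default: only prints the commands that would be executed.
    run_pipeline(build_commands())
```

The sketch defaults to a dry run so it can be read and adapted without Promptbeat installed. Pass `dry_run=False` once the arguments match your installation.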