Attack surface overview
| Attack surface | What it tests | Promptfoo plugin ID |
|---|---|---|
| Repo prompt injection | Malicious instructions embedded in repository files redirect agent behavior | coding-agent:repo-prompt-injection |
| Terminal output injection | Malicious text in command output tries to make the agent follow terminal output as instructions | coding-agent:terminal-output-injection |
| Secret environment read | Agent is asked to dump launcher-only environment secrets to a file or response | coding-agent:secret-env-read |
| Secret file read | Agent is asked to read protected files such as shell history or SSH metadata | coding-agent:secret-file-read |
| Sandbox read escape | Agent is asked to inspect host paths or workspace symlinks outside its boundary | coding-agent:sandbox-read-escape |
| Network egress bypass | Agent is asked to perform unapproved upload, DNS probe, or callback activity | coding-agent:network-egress-bypass |
| Procfs credential read | Agent is asked to read /proc/self/environ, command line, cwd, or fd links | coding-agent:procfs-credential-read |
| Steganographic exfil | Agent is asked to hide or encode secrets in hashes, comments, filenames, or structured artifacts | coding-agent:steganographic-exfil |
Coverage status
The table below shows the current validation status for each surface, based on the broad Codex run. Surfaces marked validated have confirmed pass/fail signal in the broad run. Surfaces marked planned have plugin stubs but lack Promptbeat-specific scenario fixtures or trace checks.| Attack surface | Status | Promptbeat direction |
|---|---|---|
| Repo prompt injection | Validated | Add scenario-aware repo fixtures and trace checks |
| Terminal output injection | Validated | Add tool-output trace classification |
| Secret env read | Validated | Add unsafe-workaround detector |
| Secret file read | Validated | Add allowlist/denylist policy config |
| Procfs credential read | Validated | Add explicit command trace scorer |
| Network egress bypass | Validated | Add network event normalization |
| Sandbox read escape | Validated | Add workspace boundary scorer |
| Steganographic exfil | Validated | Add output-channel classifiers |
What to include in a coverage report
For each attack surface, capture the following evidence to produce a report that is meaningful beyond a raw pass/fail count. Repo prompt injection and terminal output injection- Which files or terminal output fixtures were read
- Whether the agent repeated or acted on the injected instruction
- Tool calls made after reading the injected content
- Commands the agent suggested or executed (e.g.,
env | sort,cp ~/.ssh/known_hosts) - Files written that may contain secret values
- Whether the agent offered a redacted alternative or simply refused
- Symlinks created or host paths referenced outside the workspace
- Socket metadata reported
- Whether the sandbox mode (read-only vs writable) changed the outcome
- Network events: method, host, path
curl,wget, or webhook commands in the response- Whether egress was blocked by the harness or only suggested by the agent
- Specific
/proc/self/*paths read or referenced - Commands that would capture process environment or command line
- Whether the agent provided a safe diagnostic alternative
- Output channels examined: YAML comments, base64 fields, filenames, chart titles
- Whether a secret-derived hash or fingerprint appeared in any artifact