The openai:codex-sdk provider connects Promptbeat to OpenAI Codex as a real coding-agent target. Promptbeat delivers each generated probe directly to the Codex CLI, which runs inside the configured workspace under the constraints you set. This is the currently validated path and it ships with a saved baseline report, so you can reproduce the run today by following the steps below.

Provider YAML

The provider file below is the exact file used in the validated Codex example at examples/codex_agent/providers.codex-sdk.yaml. Copy it into your project and adjust the model and working_dir fields for your environment.
providers.codex-sdk.yaml
providers:
  - id: openai:codex-sdk
    label: Codex SDK via Promptfoo
    config:
      # Use the model name allowed by the configured gateway/key.
      model: openai/gpt-5.4
      # Relative paths are resolved from this provider file's directory by promptbeat.
      working_dir: agent-workspace
      sandbox_mode: read-only
      approval_policy: never
      skip_git_repo_check: true
      enable_streaming: false
      deep_tracing: false
      # Let Codex see CODEX_HOME, OPENAI_BASE_URL, CODEX_API_KEY, and OPENAI_API_KEY
      # from the invoking shell without committing secrets into this file.
      inherit_process_env: true

Field reference

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| id | string | Required | Must be openai:codex-sdk. Selects the Codex CLI provider in Promptfoo. |
| label | string | Optional | Human-readable name shown in reports and artifacts. |
| model | string | Required | The model identifier Codex CLI uses, e.g. openai/gpt-5.4. Must match a model your API key can access. |
| working_dir | string | Required | Path to the workspace directory Codex attaches to. Resolved relative to the provider file. Use a safe fixture workspace, not a production repository. |
| sandbox_mode | string | Required | Filesystem and network behavior enforced by Codex. Use read-only for secret-handling and injection tests. Use a writable mode only in a disposable workspace. |
| approval_policy | string | Required | Controls whether Codex can pause and ask for approvals. Set to never for reproducible, unattended evals. |
| skip_git_repo_check | boolean | Optional | Allows Codex to start in directories that are not a git repository root. Set to true for temporary or fixture workspaces. |
| enable_streaming | boolean | Optional | Enables streaming output from Codex CLI. Set to false for deterministic eval output. |
| deep_tracing | boolean | Optional | Enables detailed trace collection (commands, file events, tool calls). Set to true when trace evidence matters for the report. |
| inherit_process_env | boolean | Optional | Passes the invoking shell’s environment to Codex. Convenient for local runs; avoid in production and use an explicit cli_env allowlist instead. |
| cli_env | object | Optional | Explicit environment variable allowlist passed to the Codex process. Prefer this over inherit_process_env for hardened runs. |
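The required and optional fields in the table above can be checked before a run with a small script. The following is a minimal sketch, not part of Promptbeat: the function name check_provider is hypothetical, and it assumes the provider entry has already been loaded into a plain Python dict (for example with a YAML parser of your choice).

```python
from __future__ import annotations

# Hypothetical pre-flight check based on the field reference table.
# It is not a Promptbeat API; it only mirrors the documented field rules.
REQUIRED_STRING_FIELDS = ("model", "working_dir", "sandbox_mode", "approval_policy")
OPTIONAL_BOOLEAN_FIELDS = (
    "skip_git_repo_check",
    "enable_streaming",
    "deep_tracing",
    "inherit_process_env",
)


def check_provider(provider: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems: list[str] = []
    if provider.get("id") != "openai:codex-sdk":
        problems.append("id must be openai:codex-sdk")
    config = provider.get("config") or {}
    for field in REQUIRED_STRING_FIELDS:
        value = config.get(field)
        if not isinstance(value, str) or not value:
            problems.append(f"config.{field} is required and must be a non-empty string")
    for field in OPTIONAL_BOOLEAN_FIELDS:
        if field in config and not isinstance(config[field], bool):
            problems.append(f"config.{field} must be a boolean")
    if "cli_env" in config and not isinstance(config["cli_env"], dict):
        problems.append("config.cli_env must be a mapping")
    return problems


# The validated provider entry from this page, written as a dict.
baseline_provider = {
    "id": "openai:codex-sdk",
    "label": "Codex SDK via Promptfoo",
    "config": {
        "model": "openai/gpt-5.4",
        "working_dir": "agent-workspace",
        "sandbox_mode": "read-only",
        "approval_policy": "never",
        "skip_git_repo_check": True,
        "enable_streaming": False,
        "deep_tracing": False,
        "inherit_process_env": True,
    },
}

print(check_provider(baseline_provider))
```

A check like this does not replace promptbeat validate; it only catches missing or mistyped fields early, for example in a pre-commit hook.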

Current baseline config

The validated Codex run uses the following settings. This is the configuration that produced the saved report in examples/codex_agent/artifacts.
| Setting | Value |
| --- | --- |
| Provider | openai:codex-sdk |
| Model | openai/gpt-5.4 |
| Sandbox | read-only |
| Approval policy | never |
| Streaming | disabled |
| Deep tracing | disabled |
| Workspace | examples/codex_agent/agent-workspace |
| Pass rate | 26 / 37 (70.3%) |
The 70.3% pass rate on the broad coding-agent suite is the current baseline. Use it as the reference point when you compare model upgrades, policy changes, or alternative coding agents.
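To make the comparison against the baseline concrete, the arithmetic can be sketched as follows. The helper names are hypothetical, and the 29 / 37 result in the example is an invented illustration, not a measured run; only the 26 / 37 baseline comes from the saved report.

```python
# Baseline from the saved report on this page.
BASELINE_PASSED = 26
BASELINE_TOTAL = 37


def pass_rate(passed: int, total: int) -> float:
    """Pass rate as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * passed / total


def delta_vs_baseline(passed: int, total: int) -> float:
    """Difference in percentage points against the 26 / 37 baseline."""
    delta = pass_rate(passed, total) - pass_rate(BASELINE_PASSED, BASELINE_TOTAL)
    return round(delta, 1)


print(round(pass_rate(BASELINE_PASSED, BASELINE_TOTAL), 1))  # 70.3
# Hypothetical new run: 29 of 37 probes passed.
print(delta_vs_baseline(29, 37))  # 8.1
```

Comparing percentage points is only meaningful when both runs use the same suite and probe count; with 37 probes, a single probe flipping changes the rate by about 2.7 points.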

Required credentials

Set OPENAI_API_KEY in your shell before running generate or eval. Promptbeat passes it to the Codex CLI through the process environment when inherit_process_env: true is set.
export OPENAI_API_KEY="sk-..."
If you switch to an explicit cli_env allowlist, include every variable the Codex CLI needs:
providers.codex-sdk.yaml
    config:
      inherit_process_env: false
      cli_env:
        OPENAI_API_KEY: '{{env.OPENAI_API_KEY}}'
        CODEX_HOME: ./sample-codex-home
        OPENAI_BASE_URL: '{{env.OPENAI_BASE_URL}}'
Never write a literal API key value into a provider YAML file. See Provider Files for the full credential handling guidance.
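One way to enforce the rule above is to scan a provider file for strings that look like literal keys before committing it. This is a heuristic sketch under stated assumptions: it assumes keys start with sk- as in the export example, the function name find_literal_keys is hypothetical, and a pattern match is not proof that a value is a real secret.

```python
from __future__ import annotations

import re

# Heuristic: "sk-" followed by a long run of key-like characters.
# The lookbehind avoids matching inside ordinary words such as "task-...".
LITERAL_KEY_PATTERN = re.compile(r"(?<![A-Za-z0-9])sk-[A-Za-z0-9_-]{16,}")


def find_literal_keys(yaml_text: str) -> list[int]:
    """Return 1-based line numbers that appear to contain a literal key."""
    return [
        number
        for number, line in enumerate(yaml_text.splitlines(), start=1)
        if LITERAL_KEY_PATTERN.search(line)
    ]


# Templated values, as recommended on this page, are not flagged.
safe_fragment = (
    "cli_env:\n"
    "  OPENAI_API_KEY: '{{env.OPENAI_API_KEY}}'\n"
    "  CODEX_HOME: ./sample-codex-home\n"
)
print(find_literal_keys(safe_fragment))  # []

# A made-up literal value on line 2 is flagged.
unsafe_fragment = "cli_env:\n  OPENAI_API_KEY: sk-abcdefghijklmnop1234\n"
print(find_literal_keys(unsafe_fragment))  # [2]
```

The scan works on raw text, so it needs no YAML parser and also covers keys pasted into comments.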

Production tips

  • Replace inherit_process_env: true with an explicit cli_env allowlist for any run that saves artifacts to a shared location or CI system. Broad environment inheritance can pass unintended secrets to the agent process.
  • Set deep_tracing: true when the report needs evidence of specific commands, file writes, or tool calls — not just the final answer text.
  • Use a disposable fixture workspace with a writable sandbox mode when you run write-boundary or sandbox-escape scenarios. Point working_dir at a directory you can reset between runs.
  • Watch rate limits when running large probe counts. Each probe triggers at least one model API call through the Codex CLI. Start with --count 10 or --count 20 and increase once you have validated your workspace setup.
  • Run promptbeat validate before generate to catch config errors before they consume generation budget.
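The workspace-reset tip above can be sketched as a small helper that restores working_dir from a pristine fixture copy before each run. This is an illustrative sketch, not a Promptbeat feature: the function name reset_workspace and the directory names in the demo are hypothetical.

```python
from __future__ import annotations

import shutil
import tempfile
from pathlib import Path


def reset_workspace(fixture: Path, working_dir: Path) -> Path:
    """Replace working_dir with a fresh copy of the pristine fixture."""
    fixture = fixture.resolve()
    working_dir = working_dir.resolve()
    # Refuse overlapping paths so the pristine copy is never deleted.
    if (
        working_dir == fixture
        or fixture in working_dir.parents
        or working_dir in fixture.parents
    ):
        raise ValueError("working_dir must not overlap the fixture directory")
    if working_dir.exists():
        shutil.rmtree(working_dir)  # drop everything the agent wrote
    shutil.copytree(fixture, working_dir)
    return working_dir


if __name__ == "__main__":
    # Demo in a temporary directory so nothing real is touched.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        fixture = root / "fixture-workspace"
        fixture.mkdir()
        (fixture / "README.md").write_text("pristine")

        work = reset_workspace(fixture, root / "agent-workspace")
        (work / "README.md").write_text("modified by agent")

        work = reset_workspace(fixture, work)
        print((work / "README.md").read_text())  # pristine
```

Keeping the fixture outside working_dir means a writable sandbox can only damage the disposable copy, never the pristine one.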
For the complete end-to-end walkthrough — including promptbeat.yaml, target.yaml, scenarios.yaml, seeds.yaml, and the full command sequence — see Agent Configuration Examples.