Skip to content

fawp_index.benchmarks — Benchmark Suite

Five canonical ground-truth cases that verify the detector behaves correctly across the full detection landscape.

Quick start

from fawp_index import run_benchmarks

suite = run_benchmarks()
print(suite.summary())       # pass/fail table
suite.verify_all()           # raises BenchmarkFailure if any case fails
suite.to_html("bench.html")
suite.to_json("bench.json")

CLI

fawp-index benchmarks
fawp-index benchmarks --verify          # exit 1 if any fail
fawp-index benchmarks --out bench.html

The five cases

Case FAWP expected? Description
clean_control ✅ Yes Textbook FAWP: steering collapses, prediction survives
prediction_only ❌ No Predictable system, no steering channel
control_only ❌ No Active controller, no predictive horizon
noisy_false_positive ❌ No Noisy stable system designed to trap detectors
delayed_collapse ✅ Yes Fast-collapsing unstable system, narrow ODW

All five run in < 1 second (analytic curves, no simulation).

Run individual cases

from fawp_index import clean_control, delayed_collapse

case = clean_control()
case.verify()          # raises if wrong result
case.plot()            # leverage gap plot

case2 = delayed_collapse()
print(case2.result.summary())

Simulate instead of analytic curves

from fawp_index.benchmarks import clean_control
case = clean_control(simulate=True)   # runs FAWPSimulator

BenchmarkResult fields

Field Description
name Case name
passed Boolean
expected_fawp Expected detection outcome
actual_fawp Actual detection outcome
result Full ODWResult
notes Human-readable description