Framework
Benchmark Harness
A Python harness runs each attack against every configured OpenRouter model and records a safety outcome per attack.
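As a rough illustration of that loop, here is a minimal sketch. It is not the project's actual code: `AttackResult`, `run_benchmark`, `send`, and `judge` are hypothetical names, and the model call is passed in as a plain callable so the sketch makes no claim about how the harness talks to OpenRouter.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class AttackResult:
    """One row per (model, attack) pair. Field names are assumptions."""
    model: str
    attack_id: str
    outcome: str  # e.g. "safe" or "unsafe"; the label set is hypothetical


def run_benchmark(
    models: Iterable[str],
    attacks: Dict[str, str],
    send: Callable[[str, str], str],
    judge: Callable[[str], str],
) -> List[AttackResult]:
    """Run every attack prompt against every model and record the outcome.

    `send(model, prompt)` returns the model's reply; `judge(reply)` maps a
    reply to an outcome label. Both are injected so they can be swapped
    for a real client or for test doubles.
    """
    results: List[AttackResult] = []
    for model in models:
        for attack_id, prompt in attacks.items():
            reply = send(model, prompt)
            results.append(AttackResult(model, attack_id, judge(reply)))
    return results
```

Injecting `send` and `judge` keeps the loop testable offline; a real run would supply a function that calls the configured provider.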
Persistent Compromise Modeling
Latent memory-poisoning scenarios are modeled as three phases tracked in campaign state: seed, dormancy, and activation.
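The three phases can be pictured as a small state machine. The sketch below is only one possible reading: the transition rule (a minimum number of dormant turns plus a trigger) and the names `Phase`, `CampaignState`, and `step` are assumptions, not details taken from the framework.

```python
from dataclasses import dataclass
from enum import Enum


class Phase(Enum):
    SEED = "seed"
    DORMANCY = "dormancy"
    ACTIVATION = "activation"


@dataclass(frozen=True)
class CampaignState:
    """Hypothetical campaign state: current phase plus dormant turn count."""
    phase: Phase = Phase.SEED
    dormant_turns: int = 0


def step(state: CampaignState, min_dormant_turns: int, trigger_seen: bool) -> CampaignState:
    """Advance the campaign by one turn.

    Assumed rule: seeding always leads to dormancy; dormancy ends only when
    a trigger is seen after at least `min_dormant_turns` turns; activation
    is terminal.
    """
    if state.phase is Phase.SEED:
        return CampaignState(Phase.DORMANCY, 0)
    if state.phase is Phase.DORMANCY:
        waited = state.dormant_turns + 1
        if trigger_seen and waited >= min_dormant_turns:
            return CampaignState(Phase.ACTIVATION, waited)
        return CampaignState(Phase.DORMANCY, waited)
    return state
```

Keeping the state immutable makes each turn's transition easy to log and replay, which suits a benchmark that needs repeatable campaigns.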
Reproducible Outputs
Results are written to JSON and CSV for publication, dashboards, and external statistical analysis.
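A minimal sketch of writing the same result rows to both formats, using only the standard library. The function name `write_results` and the file names `results.json` and `results.csv` are hypothetical; the framework's real output layout may differ.

```python
import csv
import json
from pathlib import Path
from typing import Dict, List, Tuple


def write_results(rows: List[Dict[str, str]], out_dir: str) -> Tuple[Path, Path]:
    """Write result rows to results.json and results.csv under out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Sorted keys and fixed indentation keep the JSON stable across runs.
    json_path = out / "results.json"
    json_path.write_text(json.dumps(rows, indent=2, sort_keys=True), encoding="utf-8")

    # Use the sorted union of keys as the CSV header so column order is
    # deterministic; rows missing a key get an empty cell.
    fieldnames = sorted({key for row in rows for key in row})
    csv_path = out / "results.csv"
    with csv_path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

    return json_path, csv_path
```

Deterministic key and column ordering means two runs with identical results produce byte-identical files, which makes diffs between runs meaningful.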