Dystopia Bench
Benchmark Results
Aggregate: All models - 4 modules
Petrov: 5 scenarios - charts + heatmap
Orwell: 5 scenarios - charts + heatmap
LaGuardia: 5 scenarios - charts + heatmap
Basaglia: 5 scenarios - charts + heatmap
Per Scenario: 20 scenarios - Model x Scenario grid
Per Prompt: L1-L5 escalation - Deep dive
Per Prompt (No Escalation): L1-L5 isolated prompts - Deep dive