Benchmarks at a glance, with live search, filtering, and sorting.

Search models, filter by outcomes, and sort columns instantly. The table includes a quick pass/refusal signal and a “reason” inspector for compact review.

Click headers to sort. Shortcuts: / to search, Esc to clear.

Benchmark table

Outcome split included. Rates auto-computed.
Model TOTAL Pass Refine Fail Refusal $ mToK Reason STEM Utility Code Censor
FAQ
How do I filter and sort?

Use Search to find models by name. Use the Quality filter to limit which rows are shown. Click any column header to sort by that column. A Metric selector is also available at the top of the table.
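
As a rough illustration of the search-then-sort flow described above, here is a minimal TypeScript sketch. The Row shape and the names filterRows and sortRows are assumptions made for this example; they are not taken from the page's actual code.

```typescript
// Hypothetical row shape; the field names are assumptions for illustration only.
interface Row {
  model: string;
  total: number;
  pass: number;
}

type SortKey = keyof Row;

// Keep rows whose model name contains the query (case-insensitive).
// An empty query keeps every row.
function filterRows(rows: Row[], query: string): Row[] {
  const q = query.trim().toLowerCase();
  return q === "" ? rows.slice() : rows.filter((r) => r.model.toLowerCase().includes(q));
}

// Return a sorted copy: strings compare by locale order, numbers numerically.
function sortRows(rows: Row[], key: SortKey, descending: boolean = true): Row[] {
  const sign = descending ? -1 : 1;
  return rows.slice().sort((a, b) => {
    const x = a[key];
    const y = b[key];
    const cmp =
      typeof x === "string" && typeof y === "string"
        ? x.localeCompare(y)
        : (x as number) - (y as number);
    return sign * cmp;
  });
}
```

Both functions return new arrays, so the original data stays untouched when the search text or the sort column changes.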

Are the Pass/Fail/Refusal values rates or counts?

In this demo table view, the Pass, Refine, Fail, and Refusal columns display rates, each computed as that outcome's count divided by TOTAL. The dataset includes TOTAL to serve as the denominator.
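
The count-to-rate conversion can be sketched as follows. The interface and function names are hypothetical, and returning 0 when TOTAL is zero is an assumption; the page may handle that case differently.

```typescript
// Hypothetical shapes; field names are assumptions for illustration only.
interface OutcomeCounts {
  total: number;
  pass: number;
  refine: number;
  fail: number;
  refusal: number;
}

interface OutcomeRates {
  pass: number;
  refine: number;
  fail: number;
  refusal: number;
}

// Convert raw counts to rates using TOTAL as the denominator.
function toRates(c: OutcomeCounts): OutcomeRates {
  // Assumption: a row with TOTAL of 0 is shown as a 0 rate rather than NaN.
  const rate = (value: number): number => (c.total > 0 ? value / c.total : 0);
  return {
    pass: rate(c.pass),
    refine: rate(c.refine),
    fail: rate(c.fail),
    refusal: rate(c.refusal),
  };
}

// Format a rate in [0, 1] as a percentage with one decimal place.
function formatRate(rate: number): string {
  return (rate * 100).toFixed(1) + "%";
}
```

For example, a row with TOTAL 200 and 150 passes yields a Pass rate of 0.75, which formatRate renders as "75.0%".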

What is the “Signal” on the right?

It’s a simple composite hint derived from the currently visible rows: high Pass with low Refusal and low Fail is considered stronger. It updates automatically when filters change.
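
The exact formula behind the Signal is not stated here, so the following is only one plausible sketch of such a composite. It assumes equal penalties for Fail and Refusal and a result clamped to the range 0 to 1; the names RateRow and signalScore are hypothetical.

```typescript
// Hypothetical row shape holding rates in [0, 1]; names are assumptions.
interface RateRow {
  pass: number;
  fail: number;
  refusal: number;
}

// Compute a composite hint over the currently visible rows.
// Assumed formula (not confirmed by the source):
//   mean(pass) - mean(fail) - mean(refusal), clamped to [0, 1].
function signalScore(visible: RateRow[]): number {
  if (visible.length === 0) return 0; // nothing visible, no signal
  const mean = (pick: (r: RateRow) => number): number =>
    visible.reduce((sum, r) => sum + pick(r), 0) / visible.length;
  const raw = mean((r) => r.pass) - mean((r) => r.fail) - mean((r) => r.refusal);
  return Math.min(1, Math.max(0, raw));
}
```

Calling signalScore again with the filtered rows after each filter change would give the "updates automatically" behavior described above.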

Usage
Keyboard shortcuts?

Press / to focus search, and Esc to clear the search box. Sorting works via mouse clicks on header cells.
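
One way to express the two shortcuts is a small pure function that maps a key press to an action, sketched below. The names ShortcutAction and shortcutFor are hypothetical, and ignoring "/" while the search box already has focus is an assumption, made so the character can still be typed as text.

```typescript
type ShortcutAction = "focus-search" | "clear-search" | "none";

// Map a key press to a table action.
// `key` follows the KeyboardEvent.key convention ("Escape", "/", ...).
function shortcutFor(key: string, searchFocused: boolean): ShortcutAction {
  if (key === "Escape") return "clear-search";
  // Assumption: "/" typed inside the search box is ordinary text, not a shortcut.
  if (key === "/" && !searchFocused) return "focus-search";
  return "none";
}
```

In a browser, a keydown listener could pass event.key to this function and then focus or clear the search input based on the returned action.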

Why does the Reason column truncate?

For readability, the Reason text is clamped to a compact length. Click “more” on a row to expand the text, or “less” to collapse it.

Can I enable dark mode?

Yes. Use the theme button in the header. Your choice is remembered via localStorage.
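
Remembering the theme can be sketched as below. The storage key "theme", the light default, and every function name are assumptions for illustration. The code depends only on a small getItem/setItem interface, which the browser's localStorage satisfies, so it can also run outside a browser with the in-memory stand-in.

```typescript
type Theme = "light" | "dark";

// Minimal subset of the Web Storage API used by this sketch.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical storage key; the page's real key is not known.
const THEME_KEY = "theme";

// Read the saved theme. Assumption: anything other than "dark" means light.
function loadTheme(store: KeyValueStore): Theme {
  return store.getItem(THEME_KEY) === "dark" ? "dark" : "light";
}

// Flip the theme, persist the new value, and return it.
function toggleTheme(store: KeyValueStore): Theme {
  const next: Theme = loadTheme(store) === "dark" ? "light" : "dark";
  store.setItem(THEME_KEY, next);
  return next;
}

// In-memory stand-in for localStorage, for use outside a browser.
function createMemoryStore(): KeyValueStore {
  const data = new Map<string, string>();
  return {
    getItem: (key) => data.get(key) ?? null,
    setItem: (key, value) => {
      data.set(key, value);
    },
  };
}
```

In a browser, passing window.localStorage as the store would keep the choice across page reloads.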

Notice