Model Performance Comparison
| Model | TOTAL | Pass | Refine | Fail | Refusal | $ mToK | Reason | STEM | Utility | Code | Censor |
|---|---|---|---|---|---|---|---|---|---|---|---|
High Performance (≥80%)
Medium Performance (50-79%)
Low Performance (<50%)
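
The three tiers above amount to a simple threshold rule, which can be sketched as follows. This is a minimal illustration, not the leaderboard's own code: the function name `performance_tier` is hypothetical, the legend does not say which column the tiers apply to, and the sketch assumes that fractional scores between 79 and 80 fall in the medium tier.

```python
def performance_tier(score_percent: float) -> str:
    """Map a percentage score (0-100) to the tier names used in the legend."""
    if not 0 <= score_percent <= 100:
        raise ValueError("score_percent must be between 0 and 100")
    if score_percent >= 80:
        return "High Performance"
    # Assumption: anything from 50 up to (but not including) 80 is medium,
    # so a score such as 79.5 is treated as medium rather than left unclassified.
    if score_percent >= 50:
        return "Medium Performance"
    return "Low Performance"


if __name__ == "__main__":
    for score in (92.0, 80.0, 79.5, 50.0, 12.3):
        print(f"{score:5.1f}% -> {performance_tier(score)}")
```

Using `>=` comparisons in descending order keeps the boundaries unambiguous: exactly 80 counts as high and exactly 50 counts as medium, matching the legend's stated ranges.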