Model | TOTAL | Pass | Refine | Fail | Refusal | $ per M tokens | Reason | STEM | Utility | Code | Censor |
---|---|---|---|---|---|---|---|---|---|---|---|
Model A | 100 | 80 | 10 | 5 | 5 | 1000 | Reason A | Yes | High | Yes | No |
Model B | 120 | 90 | 15 | 10 | 5 | 1200 | Reason B | No | Medium | No | Yes |

This website compares the performance of different AI models.
Browse the table to compare models side by side. You can also toggle dark mode for better readability.
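
To compare rows that have different totals, one option is to normalize the Pass count by TOTAL. The sketch below is only an illustration, not the site's own scoring method: the `ModelRow` class, the `pass_rate` metric, and the `rank_by_pass_rate` helper are hypothetical names, and it assumes TOTAL is the sum of Pass, Refine, Fail, and Refusal (which holds for both sample rows above).

```python
from dataclasses import dataclass


@dataclass
class ModelRow:
    """One table row, keeping only the count columns (hypothetical structure)."""
    name: str
    total: int
    passed: int
    refine: int
    fail: int
    refusal: int

    def is_consistent(self) -> bool:
        # Assumption: TOTAL equals the sum of the four outcome columns.
        return self.total == self.passed + self.refine + self.fail + self.refusal

    def pass_rate(self) -> float:
        # Assumed metric for illustration: share of tasks that passed outright.
        return self.passed / self.total


# Sample rows copied from the table above.
ROWS = [
    ModelRow("Model A", 100, 80, 10, 5, 5),
    ModelRow("Model B", 120, 90, 15, 10, 5),
]


def rank_by_pass_rate(rows):
    """Return rows sorted from highest to lowest pass rate."""
    return sorted(rows, key=lambda r: r.pass_rate(), reverse=True)


if __name__ == "__main__":
    for row in rank_by_pass_rate(ROWS):
        print(f"{row.name}: {row.pass_rate():.1%} pass rate")
```

Under this assumed metric, Model B has more passes in absolute terms (90 versus 80) but a lower pass rate, which is why normalizing by TOTAL can change the ordering.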