Model | TOTAL | Pass | Refine | Fail | Refusal | $ mToK | Reason | STEM | Utility | Code | Censor |
---|---|---|---|---|---|---|---|---|---|---|---|
Frequently Asked Questions
What is the purpose of this benchmark?
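This benchmark compares AI models on the columns shown in the table above: an overall TOTAL, the outcome columns (Pass, Refine, Fail, Refusal), cost ($ mToK), and the category columns (Reason, STEM, Utility, Code, Censor).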
How is the data updated?
The data is updated regularly through automated scripts.