Model | TOTAL | Pass | Refine | Fail | Refusal | $ mToK | Reason | STEM | Utility | Code | Censor |
---|---|---|---|---|---|---|---|---|---|---|---|
Model A | 100 | 70 | 20 | 5 | 5 | 0.002 | General | 85 | 90 | 80 | Yes |
Model B | 100 | 65 | 25 | 7 | 3 | 0.003 | Specific | 80 | 85 | 75 | No |
The LLM Benchmark Table is a platform that provides a detailed comparison of various Large Language Models (LLMs) based on their performance metrics.
Each model is scored on several criteria: TOTAL, Pass, Refine, Fail, Refusal, $ mToK, Reason, STEM, Utility, Code, and Censor.
$ mToK stands for the cost per million tokens, i.e. the price of processing one million tokens with the model.
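As a quick sketch of how the $ mToK figure translates into a bill, the snippet below computes the cost of a given token count at a per-million-token rate. The function name and the 500,000-token workload are hypothetical, chosen only for illustration; the $0.002 rate is Model A's value from the table above.

```python
def usage_cost(tokens: int, price_per_mtok: float) -> float:
    """Cost in dollars for `tokens` tokens at `price_per_mtok` dollars per million tokens."""
    return tokens / 1_000_000 * price_per_mtok

# Hypothetical workload: 500,000 tokens with Model A at $0.002 per million tokens.
print(usage_cost(500_000, 0.002))  # 0.001
```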
You can suggest a model to be added by contacting us through the contact form on the website.