LLM Benchmark Table

| Model   | TOTAL | Pass | Refine | Fail | Refusal | $/MTok | Reason        | STEM | Utility | Code | Censorship |
|---------|------:|-----:|-------:|-----:|--------:|-------:|---------------|-----:|--------:|-----:|------------|
| Model A |   100 |   50 |     20 |   15 |      15 |  $0.05 | Good accuracy |  85% |     90% |  95% | Low        |
| Model B |   100 |   40 |     30 |   20 |      10 |  $0.06 | High utility  |  80% |     92% |  88% | Medium     |
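
For programmatic comparison, the rows can be loaded as structured records. The following is a minimal Python sketch, assuming the column readings described in the FAQ below; the BenchmarkRow type and its field names are illustrative, not part of any published schema.

```python
from dataclasses import dataclass

@dataclass
class BenchmarkRow:
    model: str
    total: int            # TOTAL: number of benchmark prompts
    passed: int           # Pass: correct on the first attempt
    refined: int          # Refine: correct only after refinement
    failed: int           # Fail: incorrect
    refused: int          # Refusal: declined to answer
    cost_per_mtok: float  # $/MTok: assumed to mean USD per million tokens

ROWS = [
    BenchmarkRow("Model A", 100, 50, 20, 15, 15, 0.05),
    BenchmarkRow("Model B", 100, 40, 30, 20, 10, 0.06),
]

for row in ROWS:
    # The four outcome counts sum to TOTAL, so rates are plain ratios.
    assert row.passed + row.refined + row.failed + row.refused == row.total
    print(f"{row.model}: pass rate {row.passed / row.total:.0%}")
```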

FAQ

What is the LLM Benchmark Table?

The LLM Benchmark Table is a side-by-side comparison of AI language models across a common set of metrics: outcome counts on a fixed benchmark, token cost, and per-category scores.

What do the columns represent?

Each row summarizes one model's results on a fixed set of benchmark prompts. TOTAL is the number of prompts, and Pass, Refine, Fail, and Refusal are outcome counts that sum to TOTAL: answers that were correct outright, correct only after refinement, incorrect, or refused. $/MTok is the cost in US dollars per million tokens, Reason is a short qualitative note, STEM, Utility, and Code are per-category accuracy scores, and Censorship indicates how heavily the model filters its responses.
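
As a worked example, the Model A counts translate into rates as follows; pass_rate and solve_rate are illustrative names for derived quantities, not metrics defined by the table itself.

```python
# Derived rates for Model A, using the counts in the table above.
total, passed, refined, refused = 100, 50, 20, 15

pass_rate = passed / total              # 50%: strict first-try success
solve_rate = (passed + refined) / total  # 70%: success including refinement
refusal_rate = refused / total           # 15%: declined prompts

print(f"pass rate:    {pass_rate:.0%}")
print(f"solve rate:   {solve_rate:.0%}")
print(f"refusal rate: {refusal_rate:.0%}")
```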

How can I use this table?

Sort or filter the rows by the metrics that matter to you, for example Pass count for raw accuracy or $/MTok for cost, and compare models head to head, as in the sketch below.
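
One minimal way to do that in Python is to sort the rows by a chosen metric; the dict fields here are hypothetical mirrors of the table columns, not an API exposed by the table.

```python
# Rank the two example rows by pass count, using cost as a tiebreaker.
rows = [
    {"model": "Model A", "pass": 50, "cost_per_mtok": 0.05},
    {"model": "Model B", "pass": 40, "cost_per_mtok": 0.06},
]

# Highest pass count first; cheaper model wins ties.
ranked = sorted(rows, key=lambda r: (-r["pass"], r["cost_per_mtok"]))
for r in ranked:
    print(f'{r["model"]}: pass={r["pass"]}, ${r["cost_per_mtok"]}/MTok')
```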