LLM Benchmark Table

Model	TOTAL	Pass	Refine	Fail	Refusal	$/M tokens	Reason	STEM	Utility	Code	Censor

Frequently Asked Questions

What is the purpose of this benchmark?

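This benchmark compares AI models on the metrics listed in the table above: a total score; pass, refine, fail, and refusal results; cost; and category scores for reasoning, STEM, utility, code, and censorship.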

How is the data updated?

The data is updated regularly through automated scripts.