MMLU (5-shot) // Knowledge
81 / 100
Verified
Last Verified: Unknown Date
Anthropic News
Massive Multitask Language Understanding covers 57 subjects across STEM, the humanities, social sciences, and more.
Beta version: Information might not be fully accurate. Please report any discrepancies.
Latest Data: Unknown
Context Window: 200k tokens
Input Cost: $1.00 per 1M tokens
Output Cost: $5.00 per 1M tokens
Cache Cost: $0.08 read / $1.00 write per 1M tokens
Parameters: Unknown (model footprint)
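The per-1M-token prices listed above can be turned into a rough per-request estimate. The sketch below is illustrative only: `estimate_cost` is a hypothetical helper, and it assumes that cache reads and cache writes are billed separately from ordinary input tokens and that every price applies linearly per token. Actual billing rules may differ.

```python
# Prices copied from the listing above (USD per 1M tokens).
INPUT_PRICE_PER_M = 1.00
OUTPUT_PRICE_PER_M = 5.00
CACHE_READ_PRICE_PER_M = 0.08
CACHE_WRITE_PRICE_PER_M = 1.00


def estimate_cost(input_tokens, output_tokens,
                  cache_read_tokens=0, cache_write_tokens=0):
    """Hypothetical helper: estimated USD cost of a single request.

    Assumes cached tokens are counted separately from uncached input tokens.
    """
    total_per_million = (
        input_tokens * INPUT_PRICE_PER_M
        + output_tokens * OUTPUT_PRICE_PER_M
        + cache_read_tokens * CACHE_READ_PRICE_PER_M
        + cache_write_tokens * CACHE_WRITE_PRICE_PER_M
    )
    return total_per_million / 1_000_000


# 10,000 input tokens and 2,000 output tokens, no caching.
print(f"${estimate_cost(10_000, 2_000):.4f}")  # prints $0.0200
```

At these rates, output tokens cost five times as much as input tokens, so response length tends to dominate the bill for short prompts.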
Performance Analysis // Verified Benchmarks
Massive Multitask Language Understanding covers 57 subjects across STEM, the humanities, social sciences, and more.
Challenging competition mathematics problems (AIME/IMO level).
Functional correctness of synthesized programs from docstrings.
A more robust and harder version of MMLU, focusing on complex reasoning and STEM subjects.
Graduate-Level Google-Proof Q&A Benchmark.
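As a rough illustration of what "functional correctness of synthesized programs from docstrings" means in the list above, the sketch below runs a generated function against a few test cases and reports pass or fail. `passes_tests`, the sample `add` task, and the test cases are all hypothetical and are not taken from any benchmark's actual harness. Real harnesses execute candidates in a sandbox with timeouts.

```python
def passes_tests(candidate_source, entry_point, test_cases):
    """Hypothetical checker: True if the candidate passes every (args, expected) case."""
    namespace = {}
    try:
        # WARNING: exec runs arbitrary code; only use on trusted input or in a sandbox.
        exec(candidate_source, namespace)
        func = namespace[entry_point]
        return all(func(*args) == expected for args, expected in test_cases)
    except Exception:
        # Syntax errors, missing functions, and runtime errors all count as failures.
        return False


# A made-up task: the model is given the signature and docstring
# and is expected to synthesize the function body.
candidate = '''
def add(a, b):
    """Return the sum of a and b."""
    return a + b
'''
cases = [((1, 2), 3), ((-1, 1), 0)]

print(passes_tests(candidate, "add", cases))  # prints True
```

A candidate counts as correct only if every test case passes, so a program that looks plausible but fails on one input scores zero for that problem.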