MMLU (5-shot)
Knowledge
45.2* / 100
Verified
Last Verified: 2026-02-16
Artificial Analysis (Independent)
Massive Multitask Language Understanding covers 57 subjects across STEM, the humanities, social sciences, and more.
* Beta version: information might not be fully accurate. Please report any discrepancies.
Latest Data: 2026-02-18
Context Window: 16k tokens
Input Cost: Free (per 1M tokens)
Output Cost: Free (per 1M tokens)
Parameters: 15B (model footprint)
Performance Analysis // Verified Benchmarks
Massive Multitask Language Understanding covers 57 subjects across STEM, the humanities, social sciences, and more.
Functional correctness of synthesized programs from docstrings.
Next-generation HumanEval with more diverse library calls and complex tasks.
Chatbot Arena Elo score. Crowd-sourced human preference ranking.
Contamination-free coding benchmark using recent problems.
Graduate-Level Google-Proof Q&A Benchmark.