MMLU-Pro
Science
78 / 100
Verified
Last Verified: 2026-03-16
Mistral Small 4 Announcement
A more robust and harder version of MMLU, focusing on complex reasoning and STEM subjects.
Beta version: Information might not be fully accurate. Please report any discrepancies.
Latest Data: 2026-03-16
Context Window: 256k tokens
Input Cost: $0.15 per 1M tokens
Output Cost: $0.60 per 1M tokens
Parameters (model footprint): 119B total (6B active)
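The listed per-token prices can be turned into a rough per-request cost estimate. The sketch below is illustrative only: the function name and the sample token counts are hypothetical, and it assumes billing is strictly linear in tokens at the listed rates, with no caching discounts, minimum charges, or tiered pricing.

```python
# Listed prices from the card, in USD per 1,000,000 tokens.
INPUT_USD_PER_MILLION = 0.15
OUTPUT_USD_PER_MILLION = 0.60


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD.

    Assumption: cost scales linearly with token counts at the listed rates.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


# Hypothetical request: 200,000 input tokens and 4,000 output tokens.
print(f"{estimate_cost_usd(200_000, 4_000):.4f}")  # prints 0.0324
```

At these rates an output token costs four times as much as an input token, so long prompts with short answers stay inexpensive while long generations dominate the bill.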
Performance Analysis // Verified Benchmarks
A more robust and harder version of MMLU, focusing on complex reasoning and STEM subjects.
Contamination-free coding benchmark using recent problems.
Graduate-Level Google-Proof Q&A Benchmark (GPQA).
Artificial Analysis Long Context Reasoning benchmark. Evaluates reasoning over long contexts.
Artificial Analysis IFBench. Evaluates precise instruction following with constraints.
American Invitational Mathematics Examination 2025 problems.
Professional level MMMU expansion.