Beta version: information might not be fully accurate. Please report any discrepancies.

LLM Registry

Independent source of truth for LLM benchmark scores with provenance tracking and normalized rankings.

© 2026 LLM Registry

Meta

Verified

Open Weights

10 benchmarks

Llama 4 Maverick

Released: 2025-09-30

Architecture: 400B MoE (17B active, 128 experts)

Training: 2024-08

Verified Model Card

Latest Data

Unknown

Context Window

1.0M tokens

Input Cost

$0.15 per 1M tokens

Output Cost

$0.60 per 1M tokens

Parameters (model footprint)

400B MoE (17B active, 128 experts)
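The listed per-token prices translate into a per-request cost by simple proportion. The following is a minimal sketch, assuming the two prices above apply uniformly to every token; the function name `estimate_cost` and the example token counts are hypothetical and not part of the registry.

```python
from decimal import Decimal

# Prices as listed on this page, in USD per 1M tokens.
INPUT_PRICE_PER_M = Decimal("0.15")
OUTPUT_PRICE_PER_M = Decimal("0.60")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request at the listed prices.

    Hypothetical helper: assumes flat pricing with no caching or
    volume discounts, which this page does not describe.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_PRICE_PER_M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    )
    return total / TOKENS_PER_M


# Example: a 200,000-token prompt with a 10,000-token reply.
example_cost = estimate_cost(200_000, 10_000)
print(f"Estimated cost: ${example_cost}")
```

At these prices, output tokens cost four times as much as input tokens, so long replies dominate the bill sooner than long prompts do.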

Benchmark Provenance

Performance Analysis // Verified Benchmarks

MMLU (5-shot)

Knowledge

85.5 / 100

Verified

Last Verified: Unknown Date

Meta AI

Massive Multitask Language Understanding covers 57 subjects across STEM, the humanities, social sciences, and more.

61.2 / 100

Verified

Last Verified: Unknown Date

Meta AI

Challenging competition mathematics problems (AIME/IMO level).

HumanEval

Coding

85 / 100

Verified

Last Verified: Unknown Date

Meta AI

Functional correctness of synthesized programs from docstrings.
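Functional correctness means a generated program is judged by whether it passes unit tests, not by how closely its text matches a reference. The following is a minimal sketch of that idea, assuming the HumanEval convention of a test string that defines `check(candidate)`; the toy problem and the helper `passes_tests` are hypothetical, and real harnesses run candidates in a sandbox with timeouts, which this sketch does not.

```python
def passes_tests(candidate_src: str, test_src: str, entry_point: str) -> bool:
    """Return True if the candidate defines `entry_point` and passes `check`.

    WARNING: exec() runs untrusted code directly. This is only an
    illustration; do not use it on real model output without isolation.
    """
    namespace: dict = {}
    try:
        exec(candidate_src, namespace)   # define the candidate function
        exec(test_src, namespace)        # define check(candidate)
        namespace["check"](namespace[entry_point])
    except Exception:
        # Syntax errors, missing names, and failed assertions all count as failure.
        return False
    return True


# Hypothetical toy problem (not taken from the HumanEval dataset).
TEST_SRC = '''
def check(candidate):
    assert candidate(2, 3) == 5
    assert candidate(-1, 1) == 0
'''

GOOD = "def add(a, b):\n    return a + b\n"
BAD = "def add(a, b):\n    return a - b\n"

results = [passes_tests(src, TEST_SRC, "add") for src in (GOOD, BAD)]
pass_rate = sum(results) / len(results)
print(results, pass_rate)
```

A benchmark score is then the fraction of problems for which a sampled solution passes, aggregated over the whole problem set.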

MMMU (Multimodal)

Multimodal

73.4 / 100

Verified

Last Verified: Unknown Date

Meta AI

Multi-discipline Multimodal Understanding and Reasoning.

LMArena Elo

Real-world

1401 / 1700

Verified

Last Verified: Unknown Date

Meta AI

Chatbot Arena Elo score: a crowd-sourced human preference ranking.
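An Elo-style rating is easiest to read as a predicted head-to-head preference rate. The following is a minimal sketch, assuming the classic Elo logistic formula with a 400-point scale; the arena's exact rating procedure is not stated on this page, and the opponent rating of 1301 is a hypothetical value chosen for illustration.

```python
def expected_win_rate(rating_a: float, rating_b: float) -> float:
    """Probability that A is preferred over B under the classic Elo model.

    E_A = 1 / (1 + 10 ** ((R_B - R_A) / 400))
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


MAVERICK_RATING = 1401          # score listed on this page
HYPOTHETICAL_OPPONENT = 1301    # assumed rating, for illustration only

p = expected_win_rate(MAVERICK_RATING, HYPOTHETICAL_OPPONENT)
print(f"Predicted preference rate vs a 1301-rated model: {p:.3f}")
```

Under this model a 100-point gap corresponds to roughly a 64% preference rate, and equal ratings give exactly 50%.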

MMLU-Pro

Science

80.5 / 100

Verified

Last Verified: Unknown Date

Meta AI

A more robust and harder version of MMLU, focusing on complex reasoning and STEM subjects.

LiveCodeBench v6

Coding

43.4 / 100

Verified

Last Verified: Unknown Date

Meta AI

Contamination-free coding benchmark using recent problems.

GPQA Diamond

STEM

69.8 / 100

Verified

Last Verified: Unknown Date

Meta AI

Graduate-Level Google-Proof Q&A Benchmark.

MathVista

Vision

73.7 / 100

Verified

Last Verified: Unknown Date

Meta AI

Mathematical reasoning in visual contexts.

DocVQA

Vision

94.4 / 100

Verified

Last Verified: Unknown Date

Meta AI

Document visual question answering on scanned and digital documents.
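The scores above are reported against different maxima (100 for most benchmarks, 1700 for the arena rating), so they are not directly comparable as raw numbers. The following is a minimal sketch of putting them on a common 0 to 1 scale, assuming plain division by the listed maximum; the registry's actual normalization method is not described on this page, and the entry whose benchmark name is missing (61.2 / 100) is left out.

```python
# (score, maximum) pairs copied from the benchmark cards on this page.
SCORES = {
    "MMLU (5-shot)": (85.5, 100),
    "HumanEval": (85.0, 100),
    "MMMU (Multimodal)": (73.4, 100),
    "LMArena Elo": (1401, 1700),
    "MMLU-Pro": (80.5, 100),
    "LiveCodeBench v6": (43.4, 100),
    "GPQA Diamond": (69.8, 100),
    "MathVista": (73.7, 100),
    "DocVQA": (94.4, 100),
}


def normalize(scores: dict) -> dict:
    """Divide each score by its listed maximum (an assumed scheme)."""
    return {name: score / maximum for name, (score, maximum) in scores.items()}


normalized = normalize(SCORES)

# Print benchmarks from strongest to weakest on the common scale.
for name, value in sorted(normalized.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<20} {value:.3f}")
```

Dividing by the maximum is the simplest choice; it ignores floors such as random-guess accuracy or a minimum rating, so a different scheme would shift the relative ordering of the arena score in particular.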

Metadata

License

Open Weights

Context Window

1,000,000 tokens

Input Pricing

$0.15 / 1M tokens

Output Pricing

$0.60 / 1M tokens

Modality

text, image, code, vision
