Beta version: *Information might not be fully accurate. Please report any discrepancies.

MiniMax

Verified

14 benchmarks

MiniMax 2.5

Released: 2026-02-12

Architecture: 230B MoE (10B active)

Training: 2025-01

Verified Model Card

Latest Data

2026-02-20

Context Window

200k

tokens

Input Cost

$0.30

per 1M tokens

Output Cost

$2.40

per 1M tokens

Cache Cost

$0.03 / Free

read / write per 1M
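The per-token prices above can be turned into a rough per-request estimate. The sketch below is illustrative only: the function name `estimate_cost` is hypothetical, and it assumes that cached input tokens are billed at the cache-read rate in place of the regular input rate, and that cache writes cost nothing (the card lists them as "Free"). Actual billing rules may differ.

```python
from decimal import Decimal

# Prices in USD per 1M tokens, copied from the card above.
INPUT_PER_M = Decimal("0.30")
OUTPUT_PER_M = Decimal("2.40")
CACHE_READ_PER_M = Decimal("0.03")
MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int,
                  cached_read_tokens: int = 0) -> Decimal:
    """Estimate the USD cost of one request.

    Assumption (not stated on the card): tokens served from cache are
    billed at the cache-read rate instead of the input rate.
    """
    if cached_read_tokens > input_tokens:
        raise ValueError("cached_read_tokens cannot exceed input_tokens")
    uncached = input_tokens - cached_read_tokens
    total = (Decimal(uncached) * INPUT_PER_M
             + Decimal(cached_read_tokens) * CACHE_READ_PER_M
             + Decimal(output_tokens) * OUTPUT_PER_M)
    return total / MILLION


if __name__ == "__main__":
    # 100k input tokens and 10k output tokens, nothing cached.
    print(estimate_cost(100_000, 10_000))
    # Same request with half of the input served from cache.
    print(estimate_cost(100_000, 10_000, cached_read_tokens=50_000))
```

Under these assumptions, a request with 100k input tokens and 10k output tokens costs about $0.054, and serving half of the input from cache lowers that to about $0.0405.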

Parameters

230B MoE (10B active)

model footprint

Benchmark Provenance

Performance Analysis // Verified Benchmarks

MMLU (5-shot)

Knowledge

89* / 100

Verified

Last Verified: Unknown Date

Source: Artificial Analysis (Independent)

Massive Multitask Language Understanding covers 57 subjects across STEM, the humanities, social sciences, and more.

SWE-bench Verified

Coding

80.2 / 100

Verified

Last Verified: Unknown Date

Source: MiniMax M2.7 Announcement

Measures how well a model resolves real-world GitHub issues. The Verified subset contains only issues confirmed to be solvable.

LiveBench

Reasoning

60.14 / 100

Verified

Last Verified: 2026-02-20

Source: LiveBench

Contamination-free, continuously updated reasoning benchmark.

AA Intelligence Index

Real-world

42* / 100

Third-party

Last Verified: 2026-02-16

Source: Artificial Analysis (Independent)

Artificial Analysis aggregate intelligence index.

HLE

Science

19.1* / 100

Third-party

Last Verified: 2026-02-16

Source: Artificial Analysis (Independent)

Humanity's Last Exam: a hard reasoning benchmark, evaluated without tools.

AA Coding Index

Coding

37.4* / 100

Third-party

Last Verified: 2026-02-16

Source: Artificial Analysis (Independent)

Artificial Analysis aggregate coding capability index.

GPQA Diamond

STEM

84.8* / 100

Third-party

Last Verified: 2026-02-16

Source: Artificial Analysis (Independent)

Graduate-Level Google-Proof Q&A Benchmark.

AA-LCR

Long Context

66* / 100

Third-party

Last Verified: 2026-02-16

Source: Artificial Analysis (Independent)

Artificial Analysis Long Context Reasoning benchmark. Evaluates reasoning over long contexts.

IFBench

Instruction Following

71.6* / 100

Third-party

Last Verified: 2026-02-16

Source: Artificial Analysis (Independent)

Artificial Analysis IFBench. Evaluates precise instruction following with constraints.

AIME 2025

Math

86.3 / 100

Verified

Last Verified: Unknown Date

Source: MiniMax M2.7 Announcement

American Invitational Mathematics Examination 2025 problems.

Terminal-Bench Hard

Agentic

34.8* / 100

Third-party

Last Verified: 2026-02-16

Source: Artificial Analysis (Independent)

Hard split of Terminal-Bench focused on tougher terminal workflows.

SWE-bench Pro

Agentic

55.4 / 100

Verified

Last Verified: Unknown Date

Source: MiniMax M2.7 Announcement

Higher-difficulty SWE-bench subset for frontier coding agents.

TAU-Bench Telecom

Agentic

95.3* / 100

Third-party

Last Verified: 2026-02-16

Source: Artificial Analysis (Independent)

Telecom-domain tool-use and workflow benchmark.

SciCode

Advanced Tasks

42.6* / 100

Third-party

Last Verified: 2026-02-16

Source: Artificial Analysis (Independent)

Scientific programming benchmark for code synthesis and correctness.
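For readers who want to work with these results programmatically, the sketch below shows one possible way to hold the 14 entries as records and group them by provenance status. The `Benchmark` class and the `group_by_status` helper are hypothetical names chosen for illustration. Scores are copied from the cards above with the asterisk dropped, since the page does not define what the asterisk means.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Benchmark:
    name: str
    category: str
    score: float   # out of 100; asterisks from the page are dropped
    status: str    # "Verified" or "Third-party", as labeled on the page


# The 14 entries listed on this page, in page order.
BENCHMARKS = [
    Benchmark("MMLU (5-shot)", "Knowledge", 89.0, "Verified"),
    Benchmark("SWE-bench Verified", "Coding", 80.2, "Verified"),
    Benchmark("LiveBench", "Reasoning", 60.14, "Verified"),
    Benchmark("AA Intelligence Index", "Real-world", 42.0, "Third-party"),
    Benchmark("HLE", "Science", 19.1, "Third-party"),
    Benchmark("AA Coding Index", "Coding", 37.4, "Third-party"),
    Benchmark("GPQA Diamond", "STEM", 84.8, "Third-party"),
    Benchmark("AA-LCR", "Long Context", 66.0, "Third-party"),
    Benchmark("IFBench", "Instruction Following", 71.6, "Third-party"),
    Benchmark("AIME 2025", "Math", 86.3, "Verified"),
    Benchmark("Terminal-Bench Hard", "Agentic", 34.8, "Third-party"),
    Benchmark("SWE-bench Pro", "Agentic", 55.4, "Verified"),
    Benchmark("TAU-Bench Telecom", "Agentic", 95.3, "Third-party"),
    Benchmark("SciCode", "Advanced Tasks", 42.6, "Third-party"),
]


def group_by_status(benchmarks):
    """Map each status label to the benchmark names carrying it."""
    groups = defaultdict(list)
    for b in benchmarks:
        groups[b.status].append(b.name)
    return dict(groups)


if __name__ == "__main__":
    for status, names in group_by_status(BENCHMARKS).items():
        print(f"{status} ({len(names)}): {', '.join(names)}")
```

Grouping this way keeps the vendor-announced and independently measured numbers apart, which matters when comparing scores across models.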