Beta version: *Information might not be fully accurate. Please report any discrepancies.

DeepSeek | Verified | Open Weights | 7 benchmarks

DeepSeek-R1-Zero

Released 2025-01-20 | Reinforcement Learning Only Architecture

Training: 2024-12

Verified Official Model Card

Latest Data: 2026-02-18

Context Window: 128k tokens

Input Cost: $0.14 per 1M tokens

Output Cost: $0.28 per 1M tokens

Parameters: 671B total, 37B activated per token (mixture-of-experts)
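
As a rough illustration of how the listed per-token prices translate into the cost of a single request, the sketch below multiplies token counts by the input and output rates shown above. The function name and the example token counts are illustrative assumptions, not part of any DeepSeek API, and actual billing rules may differ.

```python
# Prices copied from the listing above, in US dollars per 1M tokens.
INPUT_USD_PER_M = 0.14
OUTPUT_USD_PER_M = 0.28


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 2,000-token prompt with an 8,000-token reasoning-heavy reply.
    print(f"${estimate_cost_usd(2_000, 8_000):.5f}")
```

Because the output rate is twice the input rate, long reasoning traces dominate the bill for models of this kind.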

Benchmark Provenance

Performance Analysis // Verified Benchmarks

MMLU (5-shot) | Knowledge

79.8 / 100

Verified

Last Verified: 2025-01-20 | DeepSeek News

Massive Multitask Language Understanding covers 57 subjects across STEM, the humanities, social sciences, and more.

MATH (CoT) | Math

88.5 / 100

Verified

Last Verified: 2025-01-20 | DeepSeek News

Challenging competition mathematics problems drawn from contests such as AMC and AIME.

HumanEval | Coding

85* / 100

Verified

Last Verified: 2026-02-16 | Artificial Analysis (Independent)

Functional correctness of programs synthesized from docstrings.

SWE-bench Verified | Coding

45.2* / 100

Verified

Last Verified: 2026-02-16 | Artificial Analysis (Independent)

Resolving real-world GitHub issues. The Verified subset contains only issues confirmed to be solvable.

AIME 2024/25 | Math

71 / 100

Verified

Last Verified: 2025-01-20 | DeepSeek News

American Invitational Mathematics Examination. Competition-level math.

LMArena Elo | Real-world

1380 / 1700

Verified

Last Verified: 2026-02-18 | Chatbot Arena Leaderboard

Chatbot Arena Elo score. Crowd-sourced human preference ranking.

GPQA Diamond | STEM

72* / 100

Verified

Last Verified: 2026-02-16 | Artificial Analysis (Independent)

Graduate-Level Google-Proof Q&A Benchmark.
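
To give a feel for what an arena rating such as the 1380 listed above means, the sketch below uses the classic Elo expected-score formula, in which a 400-point gap corresponds to 10:1 odds. This assumes the textbook Elo scale. The leaderboard's own fitting procedure may differ, and the opponent rating of 1280 is a made-up value for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Classic Elo expected score of A against B (textbook 400-point logistic scale)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


if __name__ == "__main__":
    # 1380 is the rating listed above; 1280 is a hypothetical opponent.
    print(f"{expected_score(1380, 1280):.3f}")
```

Under this assumption, a 100-point lead corresponds to being preferred in roughly 64% of head-to-head comparisons.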