DeepSeek V3
Efficient Mixture-of-Experts (MoE) model with 671B total parameters (37B activated per token), trained with FP8 mixed precision and achieving strong benchmark results
Parameters
671 B
Context
128,000 tokens
Released
Dec 26, 2024
Leaderboards
[Chart: average score across domain-specific Autobench benchmarks (higher is better), shown against the industry average]
Context Window
DeepSeek V3's context window of 128k tokens is smaller than the industry average of 406k tokens.
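As a rough illustration of what a 128k-token context window means in practice, here is a minimal sketch of a prompt-budget check. It uses a crude 4-characters-per-token heuristic rather than DeepSeek's actual tokenizer, so the numbers are an assumption for illustration only:

```python
# Rough token-budget check against DeepSeek V3's 128,000-token context window.
# CHARS_PER_TOKEN is a crude heuristic; real counts depend on the tokenizer.

CONTEXT_WINDOW = 128_000
CHARS_PER_TOKEN = 4  # assumption, not DeepSeek's actual tokenizer


def estimate_tokens(text: str) -> int:
    """Estimate token count from character length (heuristic)."""
    return len(text) // CHARS_PER_TOKEN


def fits_in_context(text: str, reserved_for_output: int = 4_000) -> bool:
    """Check whether the prompt plus a reserved output budget fits the window."""
    return estimate_tokens(text) + reserved_for_output <= CONTEXT_WINDOW


# A ~400k-character document (~100k estimated tokens) still fits;
# a ~600k-character document (~150k estimated tokens) does not.
print(fits_in_context("x" * 400_000))
print(fits_in_context("x" * 600_000))
```

Under this heuristic, 128k tokens corresponds to roughly 500k characters of English text, minus whatever budget is reserved for the model's output.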