Nemotron 3 Super 120B A12B
Nemotron 3 Super is a 120B-parameter hybrid Mamba-Transformer model with 12B active parameters. It uses LatentMoE and Multi-Token Prediction (MTP) to improve compute efficiency for workloads such as complex RAG and IT ticket automation.
Leaderboards
Average score combining domain-specific Autobench scores; higher is better.
Performance vs. Industry Average
Intelligence
Nemotron 3 Super 120B A12B has an intelligence score of 2.8, slightly below the average of 2.9.
Price
Nemotron 3 Super 120B A12B costs $0.06 per 1M tokens, well below the average of $0.75 per 1M tokens.
Latency
Nemotron 3 Super 120B A12B has an average latency of 69.40s, higher than the average of 44.25s across models.
P99 Latency
Nemotron 3 Super 120B A12B has a P99 time to first token (TTFT) of 245.21s, higher than the average of 126.46s.
Context Window
Nemotron 3 Super 120B A12B has a context window of 262k tokens, smaller than the average of 406k tokens.
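To put the comparisons above on a common scale, the short sketch below divides each reported value by its reported average. It is only an illustration: the numbers are copied from this page, while the dictionary keys and the helper names `ratio_to_average` and `summarize` are hypothetical and not part of any leaderboard API.

```python
# Figures copied from the comparison above: (model value, reported average).
# The key names are illustrative labels, not official metric identifiers.
METRICS = {
    "intelligence": (2.8, 2.9),
    "price_usd_per_1m_tokens": (0.06, 0.75),
    "avg_latency_s": (69.40, 44.25),
    "p99_ttft_s": (245.21, 126.46),
    "context_window_k_tokens": (262, 406),
}


def ratio_to_average(model_value, average_value):
    """Return model / average; 1.0 means the model matches the average."""
    return model_value / average_value


def summarize(metrics):
    """Map each metric name to its model-to-average ratio, rounded to 2 places."""
    return {
        name: round(ratio_to_average(model, average), 2)
        for name, (model, average) in metrics.items()
    }


if __name__ == "__main__":
    for name, ratio in summarize(METRICS).items():
        print(f"{name}: {ratio:.2f}x the average")
```

A ratio below 1.0 is favorable for price and latency but unfavorable for intelligence and context window. By this measure the price is 0.08x the average (12.5 times cheaper), while the P99 TTFT is close to twice the average.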