Olmo 3.1 32b Think
A large-scale, 32-billion-parameter model designed for deep reasoning, complex multi-step logic, and advanced instruction following.
Leaderboards
QUALITY
Average score combining domain-specific Autobench scores; higher is better.
[Bar chart: 22 model scores ranging from 4.48 down to 3.47; model labels were not captured.]
PRICE
US cents per average answer; lower is better.
[Bar chart: 23 model prices ranging from 0.07 to 81.88 cents; model labels were not captured.]
LATENCY
Average latency in seconds; lower is better.
[Bar chart: 35 model latencies ranging from 20.42s to 310.39s; the first entry, 122.42s, matches the average latency reported for Olmo 3.1 32b Think; other model labels were not captured.]
Performance vs. Industry Average
Intelligence
Olmo 3.1 32b Think scores below average on intelligence, with a score of 3.9 against an average of 4.1.
Price
Olmo 3.1 32b Think is cheaper than average ($4.58 per 1M tokens), with a listed price of $0.00 per 1M tokens.
Latency
Olmo 3.1 32b Think is slower than average, with an average latency of 122.42s against an average of 116.45s.
P99 Latency
Olmo 3.1 32b Think has a lower P99 latency than average (339.37s): at the 99th percentile, time to first token (TTFT) is 270.44s.
Context Window
Olmo 3.1 32b Think has a smaller context window than average (351k tokens), with a context window of 66k tokens.
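To put the comparisons above on a common scale, the gap between each model figure and its reported average can be expressed as a signed percentage. The following is a minimal sketch, not part of the leaderboard itself: the figures are copied from the text above, while the names `METRICS`, `relative_difference`, and `summarize` are hypothetical and introduced here only for illustration.

```python
# Figures copied from the comparison text above: (model value, reported average).
# The dictionary keys and function names are illustrative assumptions.
METRICS = {
    "intelligence score": (3.9, 4.1),
    "price (USD per 1M tokens)": (0.00, 4.58),
    "average latency (s)": (122.42, 116.45),
    "P99 latency (s)": (270.44, 339.37),
    "context window (k tokens)": (66, 351),
}


def relative_difference(value, average):
    """Return the signed percentage difference of value from average."""
    return (value - average) / average * 100.0


def summarize(metrics):
    """Map each metric name to its percentage difference, rounded to one decimal."""
    return {
        name: round(relative_difference(value, average), 1)
        for name, (value, average) in metrics.items()
    }


if __name__ == "__main__":
    for name, pct in summarize(METRICS).items():
        print(f"{name}: {pct:+.1f}% vs. average")
```

A negative percentage is favorable for price and latency and unfavorable for intelligence score and context window, so the sign has to be read against each metric's "higher is better" or "lower is better" direction.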