Gemma 4 31B IT

Gemma 4 31B IT is an open-weights dense multimodal model from Google DeepMind featuring a 256K context window, native video/audio processing, and advanced configurable reasoning capabilities under an Apache 2.0 license.

Thinking Mode
Parameters
31B
Context
262,144 tokens

Leaderboards

Average score across domain-specific AutoBench scores; higher is better.

Performance vs. Industry Average

Intelligence

Gemma 4 31B IT scores slightly below average on intelligence: 2.8, compared with an average of 2.9.

Price

Gemma 4 31B IT is cheaper than average: $0.02 per 1M tokens, compared with an average of $0.75 per 1M tokens.

Latency

Gemma 4 31B IT has a slightly higher average latency than the average model: 45.28s, compared with 44.25s.

P99 Latency

Gemma 4 31B IT has a higher P99 latency than average: its P99 time to first token (TTFT) is 174.19s, compared with an average of 126.46s.

Context Window

Gemma 4 31B IT has a smaller context window than average: 262k tokens, compared with an average of 406k tokens.
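
To put the comparisons above on a common scale, the following minimal sketch computes each metric's signed percentage difference from the stated average. The figures are copied from this page. The dictionary keys and the `relative_difference` helper are illustrative names, not part of AutoBench, and the 406k average is used as the rounded value shown, so the context-window result is approximate.

```python
# Figures copied from the comparisons on this page: (model value, average value).
# The keys are illustrative labels, not AutoBench field names.
CONTEXT_TOKENS = 256 * 1024  # the "256K" window, i.e. 262,144 tokens

metrics = {
    "intelligence": (2.8, 2.9),
    "price_usd_per_1m_tokens": (0.02, 0.75),
    "avg_latency_s": (45.28, 44.25),
    "p99_ttft_s": (174.19, 126.46),
    # 406k is the rounded average shown on the page, so this entry is approximate.
    "context_window_tokens": (CONTEXT_TOKENS, 406_000),
}


def relative_difference(model_value, average_value):
    """Return the model's deviation from the average as a signed percentage."""
    return (model_value - average_value) / average_value * 100.0


differences = {
    name: relative_difference(model_value, average_value)
    for name, (model_value, average_value) in metrics.items()
}

for name, pct in differences.items():
    print(f"{name}: {pct:+.1f}% vs. average")
```

A negative value means the model is below the average on that metric. Whether that is good depends on the metric: lower is better for price and latency, while higher is better for intelligence and context window.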
