Gemma 4 26B A4B IT
Gemma 4 26B A4B IT is a latency-optimized open-weights mixture-of-experts (MoE) model that activates only 3.8B parameters per token. It delivers quality close to that of a 31B dense model while staying within the hardware constraints of edge and enterprise deployments.
Leaderboards
Average score combining domain-specific Autobench scores; higher is better.
Performance vs. Industry Average
Intelligence
Gemma 4 26B A4B IT scores below average on intelligence, with a score of 2.6 against an average of 2.9.
Price
Gemma 4 26B A4B IT is cheaper than average, at $0.02 per 1M tokens against an average of $0.75 per 1M tokens.
Latency
Gemma 4 26B A4B IT has lower latency than the industry average, with an average latency of 12.38s against 44.25s.
P99 Latency
Gemma 4 26B A4B IT has a lower P99 latency than average: its time to first token (TTFT) at P99 is 41.00s, against an average of 126.46s.
Context Window
Gemma 4 26B A4B IT has a smaller context window than average (406k tokens), with a context window of 262k tokens.
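To put the comparisons above on a common scale, each figure can be expressed as a fraction of the industry average. The sketch below is illustrative only: the numbers are copied from the text above, while the dictionary keys and the `ratio_to_average` helper are hypothetical names introduced here, not part of any leaderboard API.

```python
# Figures copied from the comparison text: (model value, industry average).
# The key names are illustrative labels, not official metric identifiers.
metrics = {
    "intelligence": (2.6, 2.9),
    "price_usd_per_1m_tokens": (0.02, 0.75),
    "avg_latency_s": (12.38, 44.25),
    "p99_ttft_s": (41.00, 126.46),
    "context_window_k_tokens": (262, 406),
}


def ratio_to_average(model_value: float, average_value: float) -> float:
    """Return the model's value as a fraction of the industry average."""
    return model_value / average_value


ratios = {
    name: ratio_to_average(model_value, average_value)
    for name, (model_value, average_value) in metrics.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x of average")
```

Under these figures the model lands at roughly 0.90x of the average intelligence score, 0.03x of the average price, 0.28x of the average latency, 0.32x of the average P99 TTFT, and 0.65x of the average context window. Lower is better for price and latency; higher is better for intelligence and context window.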