Qwen 3 14B
Qwen 3 model with 14B parameters offering excellent performance-to-size efficiency
MoE Qwen 3 model with 30B total parameters, activating 3B for efficient inference
Qwen3 235B Thinking is a MoE model (235B total, 22B active) optimized for complex reasoning. It generates thinking traces for deep problem solving.
Qwen API model with 1M token context support for extensive document processing
Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts language model based on the Qwen3-235B architecture, with 22B active parameters per forward pass
Qwen3 30B Instruct is a MoE model (30.5B total, 3.3B active). It offers strong instruction following and multilingual capabilities.
Qwen3 Next 80B Thinking is a reasoning-first MoE model (80B total). It specializes in hard multi-step problems and agentic planning.
Qwen3.5 35B A3B is an efficient hybrid Gated DeltaNet + MoE transformer activating 3B of its 35B parameters. It delivers massive multimodal capabilities and 201-language support under an Apache 2.0 license.
Qwen3.6 Plus is Alibaba's proprietary flagship featuring a 1M token context. It provides a superior "vibe coding" experience through highly stable hybrid thinking modes and repository-level problem solving.
Qwen3.5 122B A10B balances high performance with efficiency, activating 10B of its 122B parameters. It achieves 72.4% on SWE-bench Verified, making it a premier open-weight model for agentic workflows.
Amazon Nova Lite 1.0 is a low-cost multimodal model with 300k context. It is optimized for speed and processing image/video inputs.
Amazon Nova Pro 1.0 is a balanced multimodal model offering accuracy and speed. It handles extensive context and is suitable for general tasks.
Amazon Nova Premier is the most capable of Amazon’s multimodal models for complex reasoning tasks and for use as the best teacher for distilling custom models.
Amazon Nova 2 Lite is a cost-efficient multimodal engine with a 1M token context. It seamlessly processes text, code, images, and video, natively supporting Python interpreter tools for data analysis workflows.
Claude 3.5 Haiku is Anthropic's fastest and most cost-effective model, featuring 200k context. It excels in coding, data extraction, and real-time tasks, matching Claude 3 Opus in many benchmarks.
Latest generation Sonnet with best-in-class performance for complex agents and coding tasks
Exceptional reasoning model for specialized complex tasks requiring advanced analytical capabilities
Hybrid reasoning model with extended thinking mode for complex problem-solving and quick responses
Updated Haiku model from October 2024 with enhanced accuracy and performance
Claude Haiku 4.5 is Anthropic's fastest and most cost-effective model, featuring a 200K context window. It delivers near-frontier reasoning and coding at speeds suitable for real-time agentic applications.
Claude Sonnet 4.5 is Anthropic's most advanced model for real-world agents and coding. It features a 1M token context, state-of-the-art coding performance, and enhanced agentic capabilities.
Claude Opus 4.5 is Anthropic's frontier reasoning model, optimized for complex software engineering and long-horizon tasks. It supports extended thinking and multimodal capabilities.
Claude Opus 4.6 features a 1M token context and leading scores on Terminal-Bench 2.0. It leverages Context Compaction to sustain infinitely long agentic coding workflows.
Claude Sonnet 4.6 represents a total upgrade in knowledge work and design. It achieves unprecedented computer-use reliability, executing complex UI automation and software engineering across a 1M token context window.
Claude Opus 4.7 is the next generation of Anthropic's Opus family, built for long-running, asynchronous agents. Building on the coding and agentic strengths of Opus 4.6, it delivers stronger performance on complex, multi-step tasks and more reliable agentic execution across extended workflows.
DeepSeek V3 0324 is a cost-effective MoE model with 671B parameters. It excels in coding and problem-solving, offering a budget-friendly alternative with strong performance.
DeepSeek R1 0528 is an open-source model with 671B parameters (37B active). It offers performance on par with proprietary reasoning models, featuring fully open reasoning tokens.
Advanced DeepSeek reasoning model with RL training, comparable to OpenAI o1 in performance
Efficient MoE model with 671B parameters trained with FP8, achieving strong benchmark results
DeepSeek V3.2 Exp is an experimental model featuring DeepSeek Sparse Attention (DSA) for high efficiency. It delivers long-context handling up to 128k tokens with reduced inference costs.
DeepSeek-V3.1 is a hybrid reasoning model (671B params, 37B active) supporting thinking and non-thinking modes. It improves on V3 with better tool use, code generation, and reasoning efficiency.
DeepSeek V3.2 is a 685B parameter MoE model leveraging DeepSeek Sparse Attention (DSA). It excels in complex mathematical reasoning and programming competitions, featuring integrated tool-use thinking modes.
DeepSeek-V3.2-Speciale is a high-compute variant of DeepSeek-V3.2 optimized for maximum reasoning and agentic performance
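A rough way to see why MoE entries like these (671B total, 37B active) are cheap to serve: per-token inference compute scales with the active parameter count, not the total. A minimal sketch, assuming the common ~2 FLOPs-per-active-parameter rule of thumb (an approximation that ignores attention and routing overhead):

```python
def flops_per_token(active_params: float) -> float:
    """Approximate forward-pass FLOPs per generated token.

    Uses the ~2 FLOPs-per-active-parameter rule of thumb;
    attention cost and expert-routing overhead are ignored.
    """
    return 2 * active_params

# Figures from the DeepSeek descriptions above: 671B total, 37B active.
total, active = 671e9, 37e9

dense_cost = flops_per_token(total)   # if every parameter fired per token
moe_cost = flops_per_token(active)    # what the MoE actually pays

print(f"MoE cost per token: {moe_cost:.2e} FLOPs")
print(f"Compute saving vs. a dense 671B model: {dense_cost / moe_cost:.1f}x")
```

The same back-of-the-envelope ratio applies to the other total/active pairs quoted throughout this list.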
Gemini 2.5 Flash is Google's workhorse model for high-frequency tasks. It features a 1M context window, optimized for speed and efficiency in reasoning and multimodal processing.
Gemini 2.5 Flash-Lite is a lightweight reasoning model optimized for ultra-low latency. It offers a 1M context window and is designed for cost-effective, high-throughput applications.
Gemini 2.5 Pro is Google's best reasoning model, featuring a 1M token context window. It uses a sparse MoE architecture to excel in complex reasoning, coding, and multimodal tasks.
Gemma 3 27B is an open-source multimodal model by Google. It supports vision-language inputs, 128k context, and offers improved math and reasoning capabilities.
Specific Gemini 2.0 Flash version with stable performance and consistent behavior
Preview version of Gemini 2.5 Pro with advanced reasoning capabilities released in March 2025
Gemini 3 Pro Preview is Google's flagship frontier model. It offers high-precision multimodal reasoning across text, audio, video, and code, with a 1M token context.
Gemini 3 Flash Preview is a highly efficient model delivering Gemini 3 Pro-level reasoning and near real-time agentic tool orchestration with significantly lower latency and cost.
Gemini 3.1 Flash Lite Preview is an ultra-efficient, high-volume workhorse model featuring a 1M context window and 2.5x faster time-to-first-token than previous generations.
Gemma 4 31B IT is an open-weights dense multimodal model from Google DeepMind featuring a 256K context window, native video/audio processing, and advanced configurable reasoning capabilities under an Apache 2.0 license.
Gemma 4 26B A4B IT is a latency-optimized open-weights MoE model activating only 3.8B parameters per token. It delivers near-31B dense quality while preserving hardware constraints for edge and enterprise deployments.
Gemini 3.1 Pro Preview is Google's flagship reasoning model featuring a 1M token context, three-tier adjustable reasoning depth controls, and unparalleled complex problem-solving capabilities across multimodal inputs.
Llama 4 Scout variant with 17B parameters and mixture-of-experts architecture for efficiency
Llama 3.3 model with 70B parameters offering improved performance over 3.1 version
FP8-quantized 17B Llama 4 Maverick model optimized for deployment efficiency and speed
Llama 4 Scout is a 109B parameter (17B active) MoE model. It is designed for efficiency and visual reasoning with a 328k context.
Phi-4 is a 14B parameter model by Microsoft. It excels in complex reasoning and limited memory environments, trained on high-quality synthetic data.
Phi-3 Mini is a 3.8B lightweight model with 128k context. It offers state-of-the-art performance for its size, suitable for edge devices.
MiniMax M2 is a 230B (10B active) MoE model. It is highly efficient, designed for coding and agentic workflows with low latency.
MiniMax M2.7 is a 230B parameter MoE model (10B active) utilizing RoPE and QK RMSNorm. It features recursive self-optimization, updating its own memory to execute highly complex software engineering tasks.
MiniMax M2.5 is a hyper-efficient 230B MoE model (10B active) trained via large-scale RL in 200,000+ environments. It excels in office productivity, outputting at 100 tokens/sec at unprecedented cost efficiency.
Updated Mistral Large from November 2024 with improved performance and capabilities
Magistral Small 2506 is a 24B parameter model by Mistral AI. It is optimized for multilingual reasoning and instruction following.
Compact 24B parameter Mistral model optimized for cost-effective instruction following
Mistral Large 3 is a massive open-weight granular MoE model featuring 675B total parameters (41B active). It offers top-tier reliability for production-grade assistants and long-context code comprehension.
A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, efficient tiny language model with vision capabilities.
Mistral Medium 3.1 is an updated version of Mistral Medium 3, a high-performance enterprise-grade language model designed to deliver frontier-level capabilities at significantly reduced operational cost.
Mistral Small 4 unifies Instruct, Magistral, and Devstral capabilities into a single 119B MoE architecture activating just 6.5B parameters. It offers configurable reasoning effort and native multimodality.
Kimi K2 Instruct is a large open-weight model (1T params, 32B active) by Moonshot AI. It offers strong performance in instruction following and general tasks.
Kimi K2 Thinking is a reasoning variant capable of autonomous long-horizon tasks. It can execute hundreds of sequential tool calls.
Kimi K2.5 is a 1-trillion parameter open-weight MoE (32B active). It features a native MoonViT encoder and self-directed Agent Swarm technology capable of orchestrating 100 sub-agents in parallel.
NVIDIA-tuned 253B Llama 3.1 model optimized for enterprise applications and instruction following
NVIDIA optimized 49B Llama 3.3 model providing excellent performance-to-size ratio
Llama 4 Maverick is Meta's natively multimodal 400B MoE model (17B active). It utilizes early fusion of text and vision tokens and was codistilled using online RL to master complex visual-reasoning tasks.
NVIDIA tuned 70B Llama 3.1 model with enhanced instruction following and helpfulness
Llama 3.3 Nemotron Super 49B is a reasoning model derived from Llama 3.3 70B. It is post-trained for agentic workflows, RAG, and tool calling.
Nemotron Nano 9B v2 is a compact 9B model by NVIDIA. It is a unified model for reasoning and non-reasoning tasks, trained from scratch.
Llama 3.1 Nemotron Ultra 253B is a derivative of Llama 3.1 405B, optimized for reasoning and chat. It offers a balance of accuracy and efficiency.
Nemotron 3 Nano 30B A3B is a highly efficient 31.6B total parameter MoE model activating only 3.2B parameters. It offers a 1M token context window and up to 3.3x higher throughput for agentic systems.
Nemotron 3 Super is a 120B parameter hybrid Mamba-Transformer model (12B active). It utilizes LatentMoE and Multi-Token Prediction (MTP) to maximize compute efficiency for complex RAG and IT ticket automation.
A large-scale, 32-billion-parameter model designed for deep reasoning, complex multi-step logic, and advanced instruction following.
Smaller, faster, and more affordable version of GPT-4o, ideal for high-volume applications requiring good intelligence
Enhanced iteration of GPT-4 with improved reasoning, coding, and multimodal capabilities
Most advanced OpenAI reasoning model with multimodal capabilities and agentic tool use for complex analysis
Lightweight reasoning model balancing speed and intelligence for everyday complex tasks
GPT-5 is OpenAI's latest flagship model, designed as an adaptive system. It features dynamic reasoning depth, 400k context, and improvements in accuracy and multimodal integration.
GPT-5-Nano is the smallest and fastest variant in the GPT-5 system, optimized for developer tools, rapid interactions, and ultra-low latency environments.
GPT-5 Mini is a compact version of GPT-5 for lightweight reasoning. It offers low latency and cost, suitable for high-frequency tasks.
GPT-OSS-120B is an open-weight MoE model from OpenAI containing 116.8B total parameters (5.1B active). Licensed under Apache 2.0, it is post-trained with MXFP4 quantization to run inference efficiently on a single 80GB GPU.
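The claim that a ~117B-parameter model fits on a single 80 GB GPU follows directly from the 4-bit weight format. A back-of-the-envelope sketch, assuming roughly 4.25 bits per parameter for MXFP4 (4-bit values plus shared per-block scales; the exact overhead depends on block size):

```python
def weight_memory_gb(params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return params * bits_per_param / 8 / 1e9

# Total parameter count from the GPT-OSS-120B description above.
params = 116.8e9

fp16 = weight_memory_gb(params, 16)      # unquantized half precision
mxfp4 = weight_memory_gb(params, 4.25)   # assumed MXFP4: 4-bit values + scales

print(f"FP16 weights:  ~{fp16:.0f} GB (well over one 80 GB GPU)")
print(f"MXFP4 weights: ~{mxfp4:.0f} GB (fits, with headroom for KV cache)")
```

The estimate covers weights only; activation and KV-cache memory come on top, which is why the remaining ~18 GB of headroom matters.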
Compact GPT-4.1 variant optimized for efficiency while maintaining strong performance
January 2025 release of o3-mini with enhanced STEM capabilities and developer features
April 2025 o4-mini release with improved reasoning efficiency and balanced performance
GPT-5.1 offers stronger general-purpose reasoning and instruction adherence than GPT-5. It features adaptive computation and a natural conversational style.
GPT-5.2 Pro is OpenAI’s most advanced model, offering major improvements in agentic coding and long context performance over GPT-5 Pro. It is optimized for complex tasks that require step-by-step reasoning, instruction following, and accuracy in high-stakes use cases.
GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronger agentic and long context performance compared to GPT-5.1. It uses adaptive reasoning to allocate computation dynamically.
GPT-OSS-20B is a compact open-weight MoE model from OpenAI containing 21B parameters (3.6B active). It uses grouped multi-query attention for low-latency inference on consumer hardware under an Apache 2.0 license.
GPT-5.4 Mini is OpenAI's highly efficient small model offering 2x faster execution than GPT-5 Mini. It supports a 400K context window and achieves 54.4% on SWE-Bench Pro, ideal for responsive coding assistants.
GPT-5.4 Nano is OpenAI's most cost-effective tier, optimized for massive-scale classification and supporting subagents. It features a 400K context window at just $0.20 per million input tokens.
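Taking the quoted $0.20 per million input tokens at face value, the input-side cost of filling the full 400K context works out as a short calculation (output-token pricing, not given above, is ignored):

```python
def input_cost_usd(tokens: int, price_per_million: float) -> float:
    """Input-token cost in USD at a flat per-million-token rate."""
    return tokens / 1_000_000 * price_per_million

# Figures from the description above: 400K context, $0.20 / 1M input tokens.
full_context = input_cost_usd(400_000, 0.20)
print(f"One full 400K-token prompt: ${full_context:.2f}")
```

At this rate, a full-context prompt costs about eight cents on the input side, which is what makes the tier viable for massive-scale classification.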
GPT-5.4 is OpenAI's flagship frontier model, natively integrating frontier coding (57.7% SWE-bench Pro), state-of-the-art computer-use abilities, and deep agentic workflows over a 1.05M token context window.
Grok 3 Mini is a lightweight, fast reasoning model from xAI. It is designed for logic-based tasks and offers accessible thinking traces.
Grok 4 is xAI's general-purpose reasoning model with 314B parameters (MoE). It features real-time data integration and strong performance in general tasks.
Grok 2 version from December 2024 with incremental improvements and optimizations
Beta version of Grok 3 with extended reasoning for complex problem-solving tasks
Grok 4.1 Fast provides an immense 2M token context window. It is specifically optimized for high-speed document retrieval, customer support automation, and processing massive data pipelines.
Grok 4.1 Fast Thinking is the reasoning-enabled variant of Grok 4.1 Fast. It provides extended thought processes for complex problem-solving within a 2M context.
Grok 4.20 is a revolutionary ~6-trillion parameter MoE model that runs four specialized agents simultaneously on a shared backbone. It utilizes persona adapters to coordinate multi-agent workflows within a 2M token context.
MiMo V2 Pro is Xiaomi's flagship ~1-Trillion parameter MoE (42B active) agentic engine. Achieving an Elo of 1426 on GDPval-AA, it is designed for extreme reliability in long-horizon autonomous task execution.
GLM-4.5 is an open-weight model with 355B parameters (32B active). It uses MoE architecture to deliver state-of-the-art performance in reasoning, coding, and multimodal tasks.
GLM-4.5 Air is an efficient MoE model with 106B parameters (12B active). It is optimized for agentic applications, tool use, and speed.
GLM-4.6 is an open-weight model with 355B parameters (32B active). It uses MoE architecture to deliver state-of-the-art performance in reasoning, coding, and multimodal tasks.
GLM-5.1 is an open-weight 744B MoE model (40B active) released under the MIT license. Integrating DeepSeek Sparse Attention, it matches proprietary frontier models on SWE-Bench Pro (58.4%).
GLM-4.7 is a highly stable 358B parameter model optimized for coding and UI generation. It utilizes Interleaved Thinking and Turn-level Thinking for reliable execution of complex mathematical tasks.
Read more→