MODELS

Explore the capabilities, specifications, and prices of all available models.

Company

Alibaba

Qwen 3 14B

Released
Nov 14, 2024
Parameters
14 B
Context
32,768 tokens

Qwen 3 model with 14B parameters offering excellent performance-to-size efficiency

Read more

Qwen 3 30B-A3B

Released
Jan 20, 2025
Parameters
30 B
Context
32,768 tokens

MoE Qwen 3 model with 30B total parameters, activating 3B for efficient inference

Read more
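Many cards in this catalog quote a "total, active" parameter split for MoE models, as above. A minimal sketch of what that split implies for memory versus per-token compute, under illustrative assumptions stated in the comments (none of these figures come from the catalog itself):

```python
# Back-of-the-envelope MoE sizing. Illustrative assumptions (not from the
# catalog): FP16 weights at 2 bytes per parameter; per-token compute scales
# with *active* parameters, while memory must hold *total* parameters.
def moe_profile(total_b: float, active_b: float) -> dict:
    return {
        "weight_memory_gb": total_b * 1e9 * 2 / 1e9,  # all experts stay resident
        "active_fraction": active_b / total_b,        # share of weights used per token
        "dense_equiv_ratio": total_b / active_b,      # naive per-token compute ratio
    }

# The 30B-total / 3B-active split quoted above:
print(moe_profile(total_b=30, active_b=3))
# {'weight_memory_gb': 60.0, 'active_fraction': 0.1, 'dense_equiv_ratio': 10.0}
```

The trade-off this sketch captures: a 30B/3B MoE needs all ~60 GB of FP16 weights resident, but touches only a tenth of them per token.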

Qwen3 235B A22B Thinking 2507

Thinking Mode
Released
Jul 2025
Parameters
235 B
Context
262,144 tokens

Qwen3 235B Thinking is a MoE model (235B total, 22B active) optimized for complex reasoning. It generates thinking traces for deep problem solving.

Read more

Qwen Plus

Released
Sep 1, 2024
Parameters
N/A
Context
1,000,000 tokens

Qwen API model with 1M token context support for extensive document processing

Read more

Qwen3 235b a22b 2507

Released
Jul 2025
Parameters
235 B
Context
262,144 tokens

Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts language model based on the Qwen3-235B architecture, with 22B active parameters per forward pass

Read more

Qwen3 30b a3b instruct 2507

Released
Jul 2025
Parameters
30.5 B
Context
262,144 tokens

Qwen3 30B Instruct is a MoE model (30.5B total, 3.3B active). It offers strong instruction following and multilingual capabilities.

Read more

Qwen3 next 80b a3b thinking

Thinking Mode
Released
Nov 9, 2025
Parameters
80 B
Context
262,144 tokens

Qwen3 Next 80B Thinking is a reasoning-first MoE model (80B total). It specializes in hard multi-step problems and agentic planning.

Read more

Qwen3.5 35B A3B

Thinking Mode
Released
N/A
Parameters
35 B
Context
262,144 tokens

Qwen3.5 35B A3B is an efficient hybrid Gated DeltaNet + MoE transformer activating 3B of its 35B parameters. It delivers massive multimodal capabilities and 201-language support under an Apache 2.0 license.

Read more

Qwen3.6 Plus

Thinking Mode
Released
N/A
Parameters
N/A
Context
1,000,000 tokens

Qwen3.6 Plus is Alibaba's proprietary flagship featuring a 1M token context. It provides a superior "vibe coding" experience through highly stable hybrid thinking modes and repository-level problem solving.

Read more

Qwen3.5 122B A10B

Thinking Mode
Released
N/A
Parameters
122 B
Context
262,144 tokens

Qwen3.5 122B A10B balances high performance with efficiency, activating 10B of its 122B parameters. It achieves 72.4% on SWE-bench Verified, making it a premier open-weight model for agentic workflows.

Read more

Amazon

Nova lite v1

Released
May 12, 2024
Parameters
N/A
Context
300,000 tokens

Amazon Nova Lite 1.0 is a low-cost multimodal model with a 300K context. It is optimized for speed and for processing image and video inputs.

Read more

Nova pro v1

Released
May 12, 2024
Parameters
N/A
Context
300,000 tokens

Amazon Nova Pro 1.0 is a balanced multimodal model offering accuracy and speed. It handles extensive context and is suitable for general tasks.

Read more

Nova Premier v1

Released
Sep 10, 2024
Parameters
N/A
Context
1,000,000 tokens

Amazon Nova Premier is the most capable of Amazon’s multimodal models for complex reasoning tasks and for use as the best teacher for distilling custom models.

Read more

Nova 2 lite v1

Thinking Mode
Released
Feb 12, 2025
Parameters
N/A
Context
1,000,000 tokens

Amazon Nova 2 Lite is a cost-efficient multimodal engine with a 1M token context. It seamlessly processes text, code, images, and video, natively supporting python interpreter tools for data analysis workflows.

Read more

Anthropic

Claude 3.5 haiku

Released
Oct 22, 2024
Parameters
N/A
Context
200,000 tokens

Claude 3.5 Haiku is Anthropic's fastest and most cost-effective model, featuring 200k context. It excels in coding, data extraction, and real-time tasks, matching Claude 3 Opus in many benchmarks.

Read more

Claude Sonnet 4

Released
May 22, 2025
Parameters
N/A
Context
200,000 tokens

Latest generation Sonnet with best-in-class performance for complex agents and coding tasks

Read more

Claude Opus 4.1

Released
Aug 5, 2025
Parameters
N/A
Context
200,000 tokens

Exceptional reasoning model for specialized complex tasks requiring advanced analytical capabilities

Read more

Claude 3.7 Sonnet

Thinking Mode
Released
Feb 24, 2025
Parameters
N/A
Context
200,000 tokens

Hybrid reasoning model with extended thinking mode for complex problem-solving and quick responses

Read more

Claude 3.5 Haiku (20241022)

Released
Oct 22, 2024
Parameters
N/A
Context
200,000 tokens

Updated Haiku model from October 2024 with enhanced accuracy and performance

Read more

Claude haiku 4.5

Thinking Mode
Released
N/A
Parameters
N/A
Context
200,000 tokens

Claude Haiku 4.5 is Anthropic's fastest and most cost-effective model, featuring a 200K context window. It delivers near-frontier reasoning and coding speeds suitable for real-time agentic applications.

Read more

Claude sonnet 4.5

Thinking Mode
Released
N/A
Parameters
N/A
Context
1,000,000 tokens

Claude Sonnet 4.5 is Anthropic's most advanced model for real-world agents and coding. It features a 1M token context, state-of-the-art coding performance, and enhanced agentic capabilities.

Read more

Claude opus 4.5

Thinking Mode
Released
N/A
Parameters
N/A
Context
200,000 tokens

Claude Opus 4.5 is Anthropic's frontier reasoning model, optimized for complex software engineering and long-horizon tasks. It supports extended thinking and multimodal capabilities.

Read more

Claude Opus 4.6

Thinking Mode
Released
May 2, 2026
Parameters
N/A
Context
1,000,000 tokens

Claude Opus 4.6 features a 1M token context and leading scores on Terminal-Bench 2.0. It leverages Context Compaction to sustain arbitrarily long agentic coding workflows.

Read more

Claude Sonnet 4.6

Thinking Mode
Released
N/A
Parameters
N/A
Context
1,000,000 tokens

Claude Sonnet 4.6 represents a total upgrade in knowledge work and design. It achieves unprecedented computer-use reliability, executing complex UI automation and software engineering across a 1M token context window.

Read more

Claude Opus 4.7

Thinking Mode
Released
N/A
Parameters
N/A
Context
1,000,000 tokens

Claude Opus 4.7 is the next generation of Anthropic's Opus family, built for long-running, asynchronous agents. Building on the coding and agentic strengths of Opus 4.6, it delivers stronger performance on complex, multi-step tasks and more reliable agentic execution across extended workflows.

Read more

DeepSeek

DeepSeek V3 0324

Released
Mar 24, 2025
Parameters
671 B
Context
128,000 tokens

DeepSeek V3 0324 is a cost-effective MoE model with 671B parameters. It excels in coding and problem-solving, offering a budget-friendly alternative with strong performance.

Read more

DeepSeek R1 0528

Thinking Mode
Released
May 28, 2025
Parameters
671 B
Context
163,840 tokens

DeepSeek R1 0528 is an open-source model with 671B parameters (37B active). It offers performance on par with proprietary reasoning models, featuring fully open reasoning tokens.

Read more

DeepSeek R1

Thinking Mode
Released
Jan 20, 2025
Parameters
671 B
Context
128,000 tokens

Advanced DeepSeek reasoning model with RL training, comparable to OpenAI o1 in performance

Read more

DeepSeek V3

Released
Dec 26, 2024
Parameters
671 B
Context
128,000 tokens

Efficient MoE model with 671B parameters trained with FP8, achieving strong benchmark results

Read more

Deepseek v3.2 exp

Released
N/A
Parameters
685 B
Context
163,840 tokens

DeepSeek V3.2 Exp is an experimental model featuring DeepSeek Sparse Attention (DSA) for high efficiency. It delivers long-context handling with reduced inference costs.

Read more
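The efficiency argument behind sparse attention schemes like DSA can be illustrated with a back-of-the-envelope count of query-key pairs. This is schematic only; the top-k value of 2,048 below is a hypothetical parameter for illustration, not DeepSeek's actual configuration:

```python
from typing import Optional

# Schematic attention cost: dense attention touches L*L query-key pairs,
# while a sparse scheme restricting each query to a fixed top-k set touches
# only L*k. The top_k of 2,048 is a hypothetical illustration, not DSA's
# real configuration.
def attention_pairs(seq_len: int, top_k: Optional[int] = None) -> int:
    if top_k is None:
        return seq_len * seq_len          # dense: every token attends to every token
    return seq_len * min(top_k, seq_len)  # sparse: every token attends to top_k tokens

full_ctx = 163_840  # the context length listed above
print(attention_pairs(full_ctx) // attention_pairs(full_ctx, top_k=2048))  # 80
```

The quadratic-versus-linear gap is why the savings grow with context length: at these settings the dense pair count is 80x the sparse one.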

Deepseek v3.1

Released
N/A
Parameters
671 B
Context
163,840 tokens

DeepSeek-V3.1 is a hybrid reasoning model (671B params, 37B active) supporting thinking and non-thinking modes. It improves on V3 with better tool use, code generation, and reasoning efficiency.

Read more

Deepseek v3.2

Thinking Mode
Released
Jan 12, 2025
Parameters
685 B
Context
163,840 tokens

DeepSeek V3.2 is a 685B parameter MoE model leveraging DeepSeek Sparse Attention (DSA). It excels in complex mathematical reasoning and programming competitions, featuring integrated tool-use thinking modes.

Read more

DeepSeek 3.2 Speciale

Thinking Mode
Released
Jan 12, 2025
Parameters
685 B
Context
163,840 tokens

DeepSeek-V3.2-Speciale is a high-compute variant of DeepSeek-V3.2 optimized for maximum reasoning and agentic performance

Read more

Google

Gemini 2.5 flash

Thinking Mode
Released
N/A
Parameters
N/A
Context
1,048,576 tokens

Gemini 2.5 Flash is Google's workhorse model for high-frequency tasks. It features a 1M context window, optimized for speed and efficiency in reasoning and multimodal processing.

Read more

Gemini 2.5 flash lite

Thinking Mode
Released
N/A
Parameters
N/A
Context
1,000,000 tokens

Gemini 2.5 Flash-Lite is a lightweight reasoning model optimized for ultra-low latency. It offers a 1M context window and is designed for cost-effective, high-throughput applications.

Read more

Gemini 2.5 pro

Thinking Mode
Released
Jan 9, 2025
Parameters
N/A
Context
1,000,000 tokens

Gemini 2.5 Pro is Google's best reasoning model, featuring a 1M token context window. It uses a sparse MoE architecture to excel in complex reasoning, coding, and multimodal tasks.

Read more

Gemma 3 27b it

Released
Dec 3, 2025
Parameters
27 B
Context
131,072 tokens

Gemma 3 27B is an open-source multimodal model by Google. It supports vision-language inputs, 128k context, and offers improved math and reasoning capabilities.

Read more

Gemini 2.0 Flash

Released
Dec 11, 2024
Parameters
N/A
Context
1,000,000 tokens

Specific Gemini 2.0 Flash version with stable performance and consistent behavior

Read more

Gemini 2.5 Pro Preview

Thinking Mode
Released
Mar 25, 2025
Parameters
N/A
Context
1,000,000 tokens

Preview version of Gemini 2.5 Pro with advanced reasoning capabilities released in March 2025

Read more

Gemini 3 pro preview

Thinking Mode
Released
N/A
Parameters
N/A
Context
1,048,576 tokens

Gemini 3 Pro Preview is Google's flagship frontier model. It offers high-precision multimodal reasoning across text, audio, video, and code, with a 1M token context.

Read more

Gemini 3 Flash Preview

Thinking Mode
Released
N/A
Parameters
N/A
Context
1,048,576 tokens

Gemini 3 Flash Preview is a highly efficient model delivering Gemini 3 Pro-level reasoning and near real-time agentic tool orchestration with significantly lower latency and cost.

Read more

Gemini 3.1 Flash Lite Preview

Thinking Mode
Released
Mar 3, 2026
Parameters
N/A
Context
1,048,576 tokens

Gemini 3.1 Flash Lite Preview is an ultra-efficient, high-volume workhorse model featuring a 1M context window and 2.5x faster time-to-first-token than previous generations.

Read more

Gemma 4 31B IT

Thinking Mode
Released
N/A
Parameters
31 B
Context
262,144 tokens

Gemma 4 31B IT is an open-weights dense multimodal model from Google DeepMind featuring a 256K context window, native video/audio processing, and advanced configurable reasoning capabilities under an Apache 2.0 license.

Read more

Gemma 4 26B A4B IT

Thinking Mode
Released
N/A
Parameters
26 B
Context
262,144 tokens

Gemma 4 26B A4B IT is a latency-optimized open-weights MoE model activating only 3.8B parameters per token. It delivers near-31B dense quality while fitting the hardware constraints of edge and enterprise deployments.

Read more

Gemini 3.1 Pro Preview

Thinking Mode
Released
N/A
Parameters
N/A
Context
1,048,576 tokens

Gemini 3.1 Pro Preview is Google's flagship reasoning model featuring a 1M token context, three-tier adjustable reasoning depth controls, and unparalleled complex problem-solving capabilities across multimodal inputs.

Read more

Meta

Llama 4 Scout 17B 16E Instruct

Released
Apr 1, 2025
Parameters
17 B
Context
128,000 tokens

Llama 4 Scout variant with 17B parameters and mixture-of-experts architecture for efficiency

Read more

Llama 3.3 70B Instruct

Released
Dec 6, 2024
Parameters
70 B
Context
128,000 tokens

Llama 3.3 model with 70B parameters offering improved performance over 3.1 version

Read more

Llama 4 Maverick 17B Instruct

Released
Apr 1, 2025
Parameters
17 B
Context
128,000 tokens

FP8-quantized 17B Llama 4 Maverick model optimized for deployment efficiency and speed

Read more

Llama 4 scout

Released
May 4, 2025
Parameters
109 B
Context
327,680 tokens

Llama 4 Scout is a 109B parameter (17B active) MoE model. It is designed for efficiency and visual reasoning with a 328k context.

Read more

Microsoft

Phi 4

Thinking Mode
Released
Oct 1, 2025
Parameters
14 B
Context
16,384 tokens

Phi-4 is a 14B parameter model by Microsoft. It excels in complex reasoning and limited memory environments, trained on high-quality synthetic data.

Read more

Phi 3 mini 128k instruct

Released
N/A
Parameters
3.8 B
Context
128,000 tokens

Phi-3 Mini is a 3.8B lightweight model with 128k context. It offers state-of-the-art performance for its size, suitable for edge devices.

Read more

Minimax

Minimax m2

Thinking Mode
Released
N/A
Parameters
230 B
Context
204,800 tokens

MiniMax M2 is a 230B (10B active) MoE model. It is highly efficient, designed for coding and agentic workflows with low latency.

Read more

MiniMax M2.7

Thinking Mode
Released
Nov 4, 2026
Parameters
230 B
Context
204,800 tokens

MiniMax M2.7 is a 230B parameter MoE model (10B active) utilizing RoPE and QK RMSNorm. It features recursive self-optimization, updating its own memory to execute highly complex software engineering tasks.

Read more

MiniMax M2.5

Thinking Mode
Released
N/A
Parameters
230 B
Context
196,608 tokens

MiniMax M2.5 is a hyper-efficient 230B MoE model (10B active) trained via large-scale RL in 200,000+ environments. It excels in office productivity, outputting at 100 tokens/sec at unprecedented cost efficiency.

Read more

Mistral AI

Mistral Large 2411

Released
Nov 1, 2024
Parameters
123 B
Context
128,000 tokens

Updated Mistral Large from November 2024 with improved performance and capabilities

Read more

Magistral small 2506

Released
Jun 10, 2025
Parameters
24 B
Context
40,000 tokens

Magistral Small 2506 is a 24B parameter model by Mistral AI. It is optimized for multilingual reasoning and instruction following.

Read more

Mistral Small 24B Instruct

Released
Jan 1, 2025
Parameters
24 B
Context
32,768 tokens

Compact 24B parameter Mistral model optimized for cost-effective instruction following

Read more

Magistral Medium 2506

Released
N/A
Parameters
N/A
Context
N/A

No description available.

Read more

Mistral Small 3.2 24B Instruct

Released
N/A
Parameters
N/A
Context
N/A

No description available.

Read more

Mistral large 2512

Released
Dec 2025
Parameters
675 B
Context
262,144 tokens

Mistral Large 3 is a massive open-weight granular MoE model featuring 675B total parameters (41B active). It offers top-tier reliability for production-grade assistants and long-context code comprehension.

Read more

Ministral 8b 2512

Released
Dec 2025
Parameters
N/A
Context
262,144 tokens

A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, efficient tiny language model with vision capabilities.

Read more

Mistral medium 3.1

Released
Feb 12, 2024
Parameters
N/A
Context
131,072 tokens

Mistral Medium 3.1 is an updated version of Mistral Medium 3, a high-performance enterprise-grade language model designed to deliver frontier-level capabilities at significantly reduced operational cost.

Read more

Mistral Small 4

Thinking Mode
Released
N/A
Parameters
119 B
Context
262,144 tokens

Mistral Small 4 unifies Instruct, Magistral, and Devstral capabilities into a single 119B MoE architecture activating just 6.5B parameters. It offers configurable reasoning effort and native multimodality.

Read more

Moonshot AI

Kimi K2 Instruct

Released
Jan 7, 2025
Parameters
1 T
Context
256,000 tokens

Kimi K2 Instruct is a large open-weight model (1T params, 32B active) by Moonshot AI. It offers strong performance in instruction following and general tasks.

Read more

Kimi K2 0905

Released
N/A
Parameters
N/A
Context
N/A

No description available.

Read more

Kimi k2 thinking

Thinking Mode
Released
Jan 7, 2025
Parameters
1 T
Context
256,000 tokens

Kimi K2 Thinking is a reasoning variant capable of autonomous long-horizon tasks. It can execute hundreds of sequential tool calls.

Read more

Kimi K2.5

Thinking Mode
Released
N/A
Parameters
1 T
Context
262,144 tokens

Kimi K2.5 is a 1-trillion parameter open-weight MoE (32B active). It features a native MoonViT encoder and self-directed Agent Swarm technology capable of orchestrating 100 sub-agents in parallel.

Read more

Nvidia

Llama 3.1 Nemotron Ultra 253B v1

Released
Nov 1, 2024
Parameters
253 B
Context
128,000 tokens

NVIDIA-tuned 253B Llama 3.1 model optimized for enterprise applications and instruction following

Read more

Llama 3.3 Nemotron Super 49B v1

Released
Nov 22, 2024
Parameters
49 B
Context
128,000 tokens

NVIDIA optimized 49B Llama 3.3 model providing excellent performance-to-size ratio

Read more

Llama 4 Maverick

Released
May 4, 2025
Parameters
400 B
Context
1,000,000 tokens

Llama 4 Maverick is Meta's natively multimodal 400B MoE model (17B active). It utilizes early fusion of text and vision tokens and was codistilled using online RL to master complex visual-reasoning tasks.

Read more

Llama 3.1 Nemotron 70B Instruct

Released
Nov 1, 2024
Parameters
70 B
Context
128,000 tokens

NVIDIA tuned 70B Llama 3.1 model with enhanced instruction following and helpfulness

Read more

Llama 3.3 nemotron super 49b v1.5

Thinking Mode
Released
N/A
Parameters
49 B
Context
131,072 tokens

Llama 3.3 Nemotron Super 49B is a reasoning model derived from Llama 3.3 70B. It is post-trained for agentic workflows, RAG, and tool calling.

Read more

Nemotron nano 9b v2

Released
May 9, 2025
Parameters
9 B
Context
131,072 tokens

Nemotron Nano 9B v2 is a compact 9B model by NVIDIA. It is a unified model for reasoning and non-reasoning tasks, trained from scratch.

Read more

Llama 3.1 nemotron ultra 253b v1

Thinking Mode
Released
Jul 4, 2025
Parameters
253 B
Context
131,072 tokens

Llama 3.1 Nemotron Ultra 253B is a derivative of Llama 3.1 405B, optimized for reasoning and chat. It offers a balance of accuracy and efficiency.

Read more

Nemotron 3 Nano 30B A3B

Thinking Mode
Released
N/A
Parameters
31.6 B
Context
262,000 tokens

Nemotron 3 Nano 30B A3B is a highly efficient 31.6B total parameter MoE model activating only 3.2B parameters. It offers a 262K token context window and up to 3.3x higher throughput for agentic systems.

Read more

Nemotron 3 Super 120B A12B

Thinking Mode
Released
Nov 3, 2026
Parameters
120 B
Context
262,000 tokens

Nemotron 3 Super is a 120B parameter hybrid Mamba-Transformer model (12B active). It utilizes LatentMoE and Multi-Token Prediction (MTP) to maximize compute efficiency for complex RAG and IT ticket automation.

Read more

Olmo

Olmo 3.1 32b Think

Thinking Mode
Released
N/A
Parameters
32 B
Context
65,536 tokens

A large-scale, 32-billion-parameter model designed for deep reasoning, complex multi-step logic, and advanced instruction following.

Read more

OpenAI

GPT-4o Mini

Released
Jul 18, 2024
Parameters
N/A
Context
128,000 tokens

Smaller, faster, and more affordable version of GPT-4o, ideal for high-volume applications requiring good intelligence

Read more

GPT-4.1

Released
Jan 15, 2025
Parameters
N/A
Context
128,000 tokens

Enhanced iteration of GPT-4 with improved reasoning, coding, and multimodal capabilities

Read more

O3

Thinking Mode
Released
Dec 20, 2024
Parameters
N/A
Context
200,000 tokens

Most advanced OpenAI reasoning model with multimodal capabilities and agentic tool use for complex analysis

Read more

O4 Mini

Thinking Mode
Released
Apr 16, 2025
Parameters
N/A
Context
200,000 tokens

Lightweight reasoning model balancing speed and intelligence for everyday complex tasks

Read more

Gpt 5

Thinking Mode
Released
Jul 8, 2025
Parameters
N/A
Context
400,000 tokens

GPT-5 is OpenAI's latest flagship model, designed as an adaptive system. It features dynamic reasoning depth, 400k context, and improvements in accuracy and multimodal integration.

Read more

Gpt 5 mini

Thinking Mode
Released
Jul 8, 2025
Parameters
N/A
Context
400,000 tokens

GPT-5 Mini is a compact version of GPT-5 for lightweight reasoning. It offers low latency and cost, suitable for high-frequency tasks.

Read more

Gpt 5 nano

Thinking Mode
Released
Jul 8, 2025
Parameters
N/A
Context
400,000 tokens

GPT-5 Nano is the smallest and fastest variant in the GPT-5 system, optimized for developer tools, rapid interactions, and ultra-low latency environments.

Read more

Gpt oss 120b

Thinking Mode
Released
May 8, 2025
Parameters
117 B
Context
131,000 tokens

GPT-OSS-120B is an open-weight MoE model from OpenAI containing 116.8B total parameters (5.1B active). Licensed under Apache 2.0, it is post-trained with MXFP4 quantization to run inference efficiently on a single 80GB GPU.

Read more
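The single-80GB-GPU claim above follows from simple arithmetic once the weight precision is fixed. A sketch under a hedged assumption: MXFP4 stores weights as 4-bit values in blocks that share a scale factor, so the effective footprint is roughly 4.25 bits per parameter (the exact overhead depends on block size):

```python
# Why ~117B parameters can fit on a single 80 GB GPU. Assumption (hedged):
# MXFP4 costs roughly 4.25 bits per parameter once shared block scales are
# counted; FP16 costs 16 bits per parameter.
def weights_gb(params_billions: float, bits_per_param: float) -> float:
    return params_billions * 1e9 * bits_per_param / 8 / 1e9

fp4 = weights_gb(116.8, 4.25)   # ~62 GB: fits, with headroom for KV cache
fp16 = weights_gb(116.8, 16.0)  # ~234 GB: would need multiple GPUs
print(f"{fp4:.0f} GB vs {fp16:.0f} GB")
```

Under these assumptions the quantized weights occupy about 62 GB, versus roughly 234 GB at FP16, which is the difference between one 80 GB accelerator and several.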

GPT-4.1 Mini

Released
Jan 15, 2025
Parameters
N/A
Context
128,000 tokens

Compact GPT-4.1 variant optimized for efficiency while maintaining strong performance

Read more

o3-mini

Thinking Mode
Released
Jan 31, 2025
Parameters
N/A
Context
200,000 tokens

January 2025 release of o3-mini with enhanced STEM capabilities and developer features

Read more

o4-mini

Thinking Mode
Released
Apr 16, 2025
Parameters
N/A
Context
200,000 tokens

April 2025 o4-mini release with improved reasoning efficiency and balanced performance

Read more

Gpt 5.1

Thinking Mode
Released
N/A
Parameters
N/A
Context
400,000 tokens

GPT-5.1 offers stronger general-purpose reasoning and instruction adherence than GPT-5. It features adaptive computation and a natural conversational style.

Read more

Gpt 5.2 Pro

Thinking Mode
Released
Oct 12, 2025
Parameters
N/A
Context
400,000 tokens

GPT-5.2 Pro is OpenAI’s most advanced model, offering major improvements in agentic coding and long context performance over GPT-5 Pro. It is optimized for complex tasks that require step-by-step reasoning, instruction following, and accuracy in high-stakes use cases.

Read more

Gpt 5.2

Thinking Mode
Released
Oct 12, 2025
Parameters
N/A
Context
400,000 tokens

GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronger agentic and long-context performance compared to GPT-5.1. It uses adaptive reasoning to allocate computation dynamically.

Read more

Gpt oss 20b

Thinking Mode
Released
May 8, 2025
Parameters
21 B
Context
131,000 tokens

GPT-OSS-20B is a compact open-weight MoE model from OpenAI containing 21B parameters (3.6B active). It uses grouped multi-query attention for low-latency inference on consumer hardware under an Apache 2.0 license.

Read more

GPT-5.4 Mini (xhigh)

Thinking Mode
Released
N/A
Parameters
N/A
Context
400,000 tokens

GPT-5.4 Mini is OpenAI's highly efficient small model offering 2x faster execution than GPT-5 Mini. It supports a 400K context window and achieves 54.4% on SWE-Bench Pro, ideal for responsive coding assistants.

Read more

GPT-5.4 Nano (xhigh)

Thinking Mode
Released
N/A
Parameters
N/A
Context
400,000 tokens

GPT-5.4 Nano is OpenAI's most cost-effective tier, optimized for massive-scale classification and supporting subagents. It features a 400K context window at just $0.20 per million input tokens.

Read more
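Per-million-token rates like the $0.20 quoted above translate to per-request cost by straightforward arithmetic. A sketch covering only the input side, since the card lists no output-token rate:

```python
# Per-request input cost at per-million-token pricing. The $0.20 per million
# input tokens figure comes from the description above; output-token pricing
# is not listed, so only the input side is computed.
def input_cost_usd(tokens: int, usd_per_million_tokens: float) -> float:
    return tokens / 1_000_000 * usd_per_million_tokens

# Filling the full 400K context window in a single request:
print(round(input_cost_usd(400_000, 0.20), 4))  # 0.08
```

That is, even a maximal 400K-token prompt costs about eight cents of input at this tier, which is what makes massive-scale classification workloads plausible.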

GPT-5.4 (xhigh)

Thinking Mode
Released
May 3, 2026
Parameters
N/A
Context
1,048,576 tokens

GPT-5.4 is OpenAI's flagship frontier model, natively integrating frontier coding (57.7% SWE-bench Pro), state-of-the-art computer-use abilities, and deep agentic workflows over a 1.05M token context window.

Read more

xAI

Grok 3 mini

Thinking Mode
Released
Oct 6, 2025
Parameters
N/A
Context
131,072 tokens

Grok 3 Mini is a lightweight, fast reasoning model from xAI. It is designed for logic-based tasks and offers accessible thinking traces.

Read more

Grok 4

Thinking Mode
Released
Sep 7, 2025
Parameters
314 B
Context
256,000 tokens

Grok 4 is xAI's general-purpose reasoning model with 314B parameters (MoE). It features real-time data integration and strong performance in general tasks.

Read more

Grok-2

Released
Dec 12, 2024
Parameters
N/A
Context
131,072 tokens

Grok 2 version from December 2024 with incremental improvements and optimizations

Read more

Grok-3 Beta

Thinking Mode
Released
Mar 1, 2025
Parameters
N/A
Context
131,072 tokens

Beta version of Grok 3 with extended reasoning for complex problem-solving tasks

Read more

Grok 4.1 fast

Thinking Mode
Released
N/A
Parameters
N/A
Context
2,000,000 tokens

Grok 4.1 Fast provides an immense 2M token context window. It is specifically optimized for high-speed document retrieval, customer support automation, and processing massive data pipelines.

Read more

Grok 4.1 fast thinking

Thinking Mode
Released
N/A
Parameters
N/A
Context
2,000,000 tokens

Grok 4.1 Fast Thinking is the reasoning-enabled variant of Grok 4.1 Fast. It provides extended thought processes for complex problem-solving within a 2M context.

Read more

Grok 4.20

Thinking Mode
Released
Mar 3, 2026
Parameters
6 T
Context
2,000,000 tokens

Grok 4.20 is a revolutionary ~6-trillion parameter MoE model that runs four specialized agents simultaneously on a shared backbone. It utilizes persona adapters to coordinate multi-agent workflows within a 2M token context.

Read more

Xiaomi

Mimo V2 Pro

Thinking Mode
Released
N/A
Parameters
1 T
Context
1,000,000 tokens

MiMo V2 Pro is Xiaomi's flagship ~1-Trillion parameter MoE (42B active) agentic engine. Achieving an Elo of 1426 on GDPval-AA, it is designed for extreme reliability in long-horizon autonomous task execution.

Read more

Zhipu AI

GLM 4.5

Thinking Mode
Released
N/A
Parameters
355 B
Context
128,000 tokens

GLM-4.5 is an open-weight model with 355B parameters (32B active). It uses MoE architecture to deliver state-of-the-art performance in reasoning, coding, and multimodal tasks.

Read more

GLM 4.5 Air

Thinking Mode
Released
N/A
Parameters
106 B
Context
128,000 tokens

GLM-4.5 Air is an efficient MoE model with 106B parameters (12B active). It is optimized for agentic applications, tool use, and speed.

Read more

GLM 4.6

Thinking Mode
Released
N/A
Parameters
355 B
Context
202,752 tokens

GLM-4.6 is an open-weight model with 355B parameters (32B active). It uses MoE architecture to deliver state-of-the-art performance in reasoning, coding, and multimodal tasks.

Read more

GLM 5.1

Thinking Mode
Released
N/A
Parameters
744 B
Context
202,752 tokens

GLM-5.1 is an open-weight 744B MoE model (40B active) released under the MIT license. Integrating DeepSeek Sparse Attention, it matches proprietary frontier models on SWE-Bench Pro (58.4%).

Read more

GLM 4.7

Thinking Mode
Released
N/A
Parameters
358 B
Context
202,752 tokens

GLM-4.7 is a highly stable 358B parameter model optimized for coding and UI generation. It utilizes Interleaved Thinking and Turn-level Thinking for reliable execution of complex mathematical tasks.

Read more