Free tool

AI Model Comparison (2026)

Every major 2026 LLM side by side — context window, API price per 1M tokens, modality, and whether the weights are open. Filter by provider or open-weights, sort by price or context. Verified and cross-checked; long-context surcharges disclosed.

| Model | Provider · Release | Notes | Context | In $/1M | Out $/1M | Weights | Modality |
|---|---|---|---|---|---|---|---|
| Qwen 3.7 Max | Alibaba · 2026-05 | Agent-first flagship; strong tool-use + multilingual (esp. Chinese) reasoning | 1M | $2.5* | $7.5 | Proprietary | Multimodal |
| Claude Fable 5 | Anthropic · 2026-06-09 | Anthropic's most capable widely-released model; top-end reasoning + agentic work | 1M | $10* | $50 | Proprietary | Text + image in |
| Claude Haiku 4.5 | Anthropic · 2025-10 | Fastest Claude with near-frontier intelligence at low cost | 200K | $1 | $5 | Proprietary | Text + image in |
| Claude Opus 4.8 | Anthropic · 2026 | Most capable Opus-tier model for complex reasoning + long-horizon agentic coding | 1M | $5 | $25 | Proprietary | Text + image in |
| Claude Sonnet 4.6 | Anthropic · 2026 | Best balance of speed and intelligence; strong agentic + coding workhorse | 1M | $3* | $15 | Proprietary | Text + image in |
| DeepSeek V4 (Pro) | DeepSeek · 2026-04-24 | Open-weights MoE (~32–37B active) frontier value; sparse attention | 1M | $1.74* | $3.48 | Open | Multimodal |
| Gemini 3.1 Pro | Google · 2026-02-19 | Frontier long-context multimodal reasoning; largest practical context in class | 1M | $2* | $12 | Proprietary | Multimodal (text/image/audio/video) |
| Gemini 3.5 Flash | Google · 2026-05-20 | Flagship-class fast model; 280+ tok/s, default in Gemini app + Search AI Mode | 1M | $1.5 | $9 | Proprietary | Multimodal (text/image/audio/video) |
| Llama 4 Maverick | Meta · 2025-04-05 | Open-weights MoE; cheapest frontier-ish option, huge ecosystem + fine-tunability | 1M | $0.22* | $0.85 | Open | Multimodal (text + image) |
| Mistral Large 3 | Mistral AI · 2025-12-01 | Open Apache-2.0 European flagship; self-hostable, strong multilingual + coding | 256K | $2 | $6 | Open | Text + image |
| Kimi K2 Thinking | Moonshot AI · 2025-11 | Open 1T-param MoE (32B active) built for long-horizon agentic reasoning | 256K | $0.6 | $2.5 | Open | Text (agentic) |
| GPT-5.4 mini | OpenAI · 2026 | Cost-efficient mid-tier for high-volume tasks with solid reasoning | 400K | $0.75 | $4.5 | Proprietary | Text + image in |
| GPT-5.4 nano | OpenAI · 2026 | Cheapest OpenAI tier for simple, latency-sensitive, high-volume calls | 400K | $0.2 | $1.25 | Proprietary | Text + image in |
| GPT-5.5 | OpenAI · 2026-04-23 | OpenAI flagship; first OpenAI model with a 1M-token API context window | 1.05M | $5* | $30 | Proprietary | Text + image in |
| Grok 4.3 | xAI · 2026 | xAI's fastest + most intelligent GA model; native real-time X / web search | 1M | $1.25* | $2.5 | Proprietary | Text + image in |
| GLM-5 | Zhipu / Z.ai · 2026-02-11 | Open MIT-licensed frontier model; very strong coding + agentic value | 200K | $0.6 | $1.92 | Open | Multimodal |

Standard pay-as-you-go list prices in USD per 1M tokens, as of 21 Jun 2026, for prompts up to ~200K tokens. * = tiered/long-context or promo pricing applies. Open-weights API prices are representative third-party hosted rates; self-hosting incurs no per-token API fee, though hardware and operating costs still apply. This space moves weekly, so confirm with each provider before relying on a figure.
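
For readers who want to reproduce the filter-and-sort view offline, here is a minimal Python sketch. It copies a handful of rows from the table above; the `Model` class, the `select` helper, and all field names are hypothetical and are not part of any provider API or of this tool's actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    provider: str
    context_tokens: int
    in_price: float    # USD per 1M input tokens (list price from the table)
    out_price: float   # USD per 1M output tokens (list price from the table)
    open_weights: bool


# A subset of the table rows, copied by hand for illustration.
MODELS = [
    Model("Claude Haiku 4.5", "Anthropic", 200_000, 1.00, 5.00, False),
    Model("Gemini 3.5 Flash", "Google", 1_000_000, 1.50, 9.00, False),
    Model("Llama 4 Maverick", "Meta", 1_000_000, 0.22, 0.85, True),
    Model("Kimi K2 Thinking", "Moonshot AI", 256_000, 0.60, 2.50, True),
    Model("GPT-5.4 nano", "OpenAI", 400_000, 0.20, 1.25, False),
    Model("GLM-5", "Zhipu / Z.ai", 200_000, 0.60, 1.92, True),
]


def select(models, open_only=False, provider=None,
           sort_key="in_price", descending=False):
    """Filter by weights/provider, then sort by one numeric field."""
    rows = [
        m for m in models
        if (not open_only or m.open_weights)
        and (provider is None or m.provider == provider)
    ]
    return sorted(rows, key=lambda m: getattr(m, sort_key), reverse=descending)


# Open-weights models, cheapest output price first.
for m in select(MODELS, open_only=True, sort_key="out_price"):
    print(f"{m.name}: ${m.out_price}/1M out")
```

Sorting on a single field keeps the sketch simple; a real comparison would likely weight input and output prices by the expected traffic mix.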

Frequently asked

Which 2026 model has the cheapest API pricing?
Among frontier-class options, third-party-hosted open-weights models are cheapest — Llama 4 Maverick runs about $0.22 input / $0.85 output per 1M tokens. Among major proprietary APIs, OpenAI's GPT-5.4 nano ($0.20 / $1.25) and Google's smaller Flash tiers are the lowest. DeepSeek V4 is unusually cheap for its capability, often discounted well below its $1.74 / $3.48 list rate.
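
Which model is cheapest for a given workload depends on the ratio of input to output tokens, because the two are billed at different rates. The arithmetic can be sketched as follows, using the list prices quoted above; `request_cost` is a hypothetical helper and the token counts are made-up examples.

```python
def request_cost(in_tokens, out_tokens, in_price, out_price):
    """Cost in USD of one request, given prices in USD per 1M tokens."""
    return (in_tokens * in_price + out_tokens * out_price) / 1_000_000


# Output-heavy request: 10K tokens in, 2K tokens out.
maverick = request_cost(10_000, 2_000, 0.22, 0.85)
nano = request_cost(10_000, 2_000, 0.20, 1.25)
print(f"Maverick ${maverick:.4f} vs nano ${nano:.4f}")  # Maverick $0.0039 vs nano $0.0045

# Input-heavy request: 100K tokens in, 1K tokens out.
maverick_long = request_cost(100_000, 1_000, 0.22, 0.85)
nano_long = request_cost(100_000, 1_000, 0.20, 1.25)
print(f"Maverick ${maverick_long:.4f} vs nano ${nano_long:.4f}")
```

At these list prices, GPT-5.4 nano only comes out ahead of Llama 4 Maverick when a request has more than about 20 input tokens per output token.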
Which model has the largest context window?
Google's Gemini 3.1 Pro leads the major APIs with a 1M+ token window (up to ~2M reported), and Llama 4 Scout, Meta's sibling model to Maverick, advertises up to 10M. Among the flagships listed here, Gemini 3.1 Pro, GPT-5.5 (~1.05M), Claude Opus 4.8 / Sonnet 4.6 / Fable 5 (1M), Grok 4.3 (1M), and DeepSeek V4 (1M) all reach roughly 1M tokens.
What are the best open-weights models in mid-2026?
The strongest open-weights options are DeepSeek V4 (Apache-2.0), Meta's Llama 4 Maverick, Moonshot's Kimi K2 Thinking, Zhipu's MIT-licensed GLM-5, and Mistral Large 3 (Apache-2.0). Several of these rival proprietary frontier models on coding and agentic tasks, and all can be self-hosted or run through third-party providers.
Why are some prices shown with an asterisk and a footnote?
Several flagships use context-tiered pricing: the listed figure is the rate for prompts up to ~200K tokens, and the footnote shows the higher long-context rate. For example, Gemini 3.1 Pro jumps from $2 / $12 to $4 / $18 above 200K tokens, and GPT-5.5 charges 2× input / 1.5× output beyond 272K tokens. Claude Opus 4.8, by contrast, has no long-context premium.
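
The effect of context-tiered pricing on a single request can be sketched as below, using the Gemini 3.1 Pro and GPT-5.5 figures quoted above. One assumption is labeled loudly: the sketch applies the higher rate to the whole request once the prompt exceeds the threshold. Providers may bill differently, so check their pricing pages. `tiered_cost` and the two config dicts are hypothetical names.

```python
def tiered_cost(in_tokens, out_tokens, base, long, threshold):
    """Request cost in USD under context-tiered pricing.

    base and long are (input, output) prices in USD per 1M tokens.
    ASSUMPTION: the long-context rate applies to the entire request
    whenever the prompt is larger than `threshold` tokens.
    """
    in_price, out_price = long if in_tokens > threshold else base
    return (in_tokens * in_price + out_tokens * out_price) / 1_000_000


# Figures from the FAQ answer above.
GEMINI_31_PRO = dict(base=(2.0, 12.0), long=(4.0, 18.0), threshold=200_000)
GPT_55 = dict(base=(5.0, 30.0), long=(5.0 * 2, 30.0 * 1.5), threshold=272_000)

print(f"{tiered_cost(150_000, 4_000, **GEMINI_31_PRO):.3f}")  # 0.348
print(f"{tiered_cost(300_000, 4_000, **GEMINI_31_PRO):.3f}")  # 1.272
print(f"{tiered_cost(300_000, 4_000, **GPT_55):.3f}")         # 3.180
```

Under this assumption, doubling the Gemini 3.1 Pro prompt from 150K to 300K tokens more than triples the cost of the request, which is why the asterisked rows deserve a closer look for long-context workloads.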
Is Grok 5 available yet?
Not as a generally available API as of June 2026. Grok 5 has been widely discussed (rumored ~6T params, 1.5M context) but xAI had not published an official release or pricing at the time of writing, so this table uses Grok 4.3, the current GA flagship.