Total Models: 53
Providers: 23
Open Source/Weight: 16
Commercial: 37
API Available: 47
Avg. Benchmark: 85
[Charts: Top 10 Models by Benchmark Score; Models by Category]
Claude 4 Opus
Anthropic
Anthropic's most capable model, excelling at complex reasoning, coding, math, and creative writing. Claude 4 Opus delivers near-perfect performance on graduate-level reasoning benchmarks and is the gold standard for agentic coding tasks.
Context
200K tokens
Params
Unknown
Pricing
$15/1M input, $75/1M output
Benchmark Score
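To show how per-million-token pricing translates into a per-request cost, here is a minimal sketch using the Claude 4 Opus rates listed above. The function name and the token counts are hypothetical, chosen only for illustration.

```python
def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the USD cost of one request, given prices per 1M tokens."""
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


# Listed rates: $15/1M input, $75/1M output.
# The token counts below are made-up example values.
cost = request_cost_usd(10_000, 2_000, 15.0, 75.0)
print(f"${cost:.2f}")  # $0.30
```

The same function works for any card that lists separate input and output rates.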
o3
OpenAI
OpenAI's most advanced reasoning model, using extended thinking to solve complex math, science, and coding problems. o3 achieves state-of-the-art results on competition-level mathematics and PhD-level science benchmarks.
Context
200K tokens
Params
Unknown
Pricing
$10/1M input, $40/1M output
Benchmark Score
Claude Code
Anthropic
Anthropic's agentic coding tool that operates directly in your terminal. Claude Code can read entire codebases, write and edit files, run commands, manage git workflows, and execute multi-step engineering tasks autonomously.
Context
200K tokens
Params
Uses Claude 4 Opus/Sonnet
Pricing
Included with Claude Pro ($2...
Benchmark Score
ElevenLabs
ElevenLabs
The industry-leading text-to-speech platform producing hyper-realistic, emotionally expressive voice synthesis. ElevenLabs supports 32 languages, voice cloning, and real-time streaming, making it the gold standard for AI voice generation.
Context
N/A
Params
Unknown
Pricing
$5-330/month subscription or...
Benchmark Score
Grok 3
xAI
xAI's flagship model trained on one of the largest GPU clusters ever assembled. Grok 3 achieves top-tier performance on math, science, and coding benchmarks, with real-time access to X (Twitter) data for up-to-date information.
Context
128K tokens
Params
Unknown
Pricing
$3/1M input, $15/1M output
Benchmark Score
Claude 4 Opus (Multimodal)
Anthropic
Claude 4 Opus with vision capabilities for understanding images, charts, diagrams, and screenshots. While primarily text-focused, Claude's visual understanding excels at document analysis, code screenshot interpretation, and data visualization comprehension.
Context
200K tokens
Params
Unknown
Pricing
$15/1M input, $75/1M output
Benchmark Score
Claude 4 Sonnet
Anthropic
The ideal balance of speed, cost, and intelligence. Claude 4 Sonnet matches or exceeds many competitors' flagship models while being significantly faster and cheaper than Opus. Excellent for production workloads.
Context
200K tokens
Params
Unknown
Pricing
$3/1M input, $15/1M output
Benchmark Score
Midjourney v6.1
Midjourney
The leading AI image generation model known for producing stunning, artistic, and photorealistic visuals. Midjourney v6.1 delivers exceptional aesthetic quality and is the go-to tool for professional creatives and designers.
Context
N/A
Params
Unknown
Pricing
$10-60/month subscription
Benchmark Score
text-embedding-3-large
OpenAI
OpenAI's most capable text embedding model for converting text into high-dimensional vectors. With 3,072 dimensions and Matryoshka Representation Learning, it supports flexible dimension reduction while maintaining strong semantic capture.
Context
8,191 tokens
Params
Unknown
Pricing
$0.13/1M tokens
Benchmark Score
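The dimension reduction mentioned in the text-embedding-3-large card can be sketched as follows. With Matryoshka-style embeddings, the general idea is to keep the leading components of a vector and rescale the result to unit length. The helper name is hypothetical, and the toy 4-dimensional vector stands in for a real 3,072-dimensional embedding.

```python
import math


def truncate_embedding(vec: list[float], dims: int) -> list[float]:
    """Keep the first `dims` components and rescale to unit length."""
    if not 0 < dims <= len(vec):
        raise ValueError("dims must be between 1 and len(vec)")
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # an all-zero prefix cannot be normalized
    return [x / norm for x in head]


# Toy stand-in for a full embedding (assumption: real vectors are far longer).
full = [3.0, 4.0, 12.0, 0.0]
print(truncate_embedding(full, 2))  # [0.6, 0.8]
```

Renormalizing matters because cosine or dot-product similarity assumes comparable vector lengths after truncation.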
Gemini 2.0 Pro
Google
Google's most capable model in the Gemini 2.0 family, delivering strong performance across coding, math, and multimodal tasks. Features native tool use, grounding with Google Search, and a massive context window for complex workflows.
Context
2M tokens
Params
Unknown
Pricing
$1.25/1M input, $10/1M output
Benchmark Score
DeepSeek R1
DeepSeek
DeepSeek's reasoning-focused model that matches OpenAI o1's performance through innovative reinforcement learning. DeepSeek R1 demonstrates extended chain-of-thought reasoning with full transparency into its thinking process.
Context
128K tokens
Params
671B (37B active)
Pricing
$0.55/1M input, $2.19/1M out...
Benchmark Score
Gemini 2.0 Pro (Multimodal)
Google
Google's natively multimodal model that understands and generates across text, images, audio, and video. Gemini 2.0 Pro's multimodal capabilities are built from the ground up, with a massive 2M token context window for processing long videos and documents.
Context
2M tokens
Params
Unknown
Pricing
$1.25/1M input, $10/1M output
Benchmark Score
Voyage AI
Voyage AI
Specialized embedding models optimized for code and technical content retrieval. Voyage AI offers domain-specific models (voyage-code-3, voyage-3) that consistently rank at the top of code retrieval benchmarks.
Context
16K tokens
Params
Unknown
Pricing
$0.06/1M tokens (voyage-3) -...
Benchmark Score
GPT-4.5
OpenAI
OpenAI's largest and most knowledgeable model, focused on unsupervised learning improvements. GPT-4.5 delivers reduced hallucinations, better world knowledge, and improved emotional intelligence compared to GPT-4o.
Context
128K tokens
Params
Unknown
Pricing
$75/1M input, $150/1M output
Benchmark Score
Veo 2
Google
Google's most advanced video generation model, capable of producing 4K resolution videos with cinematic quality. Veo 2 understands physics, lighting, and human motion, with support for various cinematic styles and camera movements.
Context
N/A
Params
Unknown
Pricing
Available through VideoFX an...
Benchmark Score
Whisper v3
OpenAI
OpenAI's open-source speech recognition model that delivers near-human accuracy across 100+ languages. Whisper v3 handles transcription, translation, and language detection with remarkable robustness to accents, background noise, and technical jargon.
Context
30-second audio chunks
Params
1.5B
Pricing
$0.006/minute via API or Fre...
Benchmark Score
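Because the Whisper v3 card lists its context as 30-second audio chunks, longer recordings have to be split into windows before transcription. The sketch below shows only the window arithmetic. The function name is hypothetical, and real pipelines may overlap windows or pad the final one, which this example does not attempt.

```python
def chunk_bounds(duration_s: float, chunk_s: float = 30.0) -> list[tuple[float, float]]:
    """Split [0, duration_s] into consecutive (start, end) windows of at most chunk_s seconds."""
    if chunk_s <= 0:
        raise ValueError("chunk_s must be positive")
    bounds = []
    start = 0.0
    while start < duration_s:
        end = min(start + chunk_s, duration_s)
        bounds.append((start, end))
        start = end
    return bounds


# A 75-second recording yields two full windows and one 15-second remainder.
print(chunk_bounds(75.0))  # [(0.0, 30.0), (30.0, 60.0), (60.0, 75.0)]
```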
GPT-4o (Multimodal)
OpenAI
The multimodal facet of GPT-4o, processing text, images, audio, and video natively within a single model. GPT-4o's multimodal capabilities enable seamless cross-modal reasoning, image understanding, and real-time voice conversations.
Context
128K tokens
Params
Unknown
Pricing
$2.50/1M input, $10/1M output
Benchmark Score
Cohere Embed v3
Cohere
Cohere's leading embedding model offering state-of-the-art performance for search and retrieval. Embed v3 supports 100+ languages, compression to binary/int8 formats, and separate search/document embedding types for optimal retrieval.
Context
512 tokens
Params
Unknown
Pricing
$0.10/1M tokens
Benchmark Score
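The binary and int8 compression mentioned in the Cohere Embed v3 card can be illustrated with a generic quantization sketch. This is an assumption about the general technique, not a description of how Cohere computes its compressed formats. The function names and the toy vector are hypothetical.

```python
def to_int8(vec: list[float]) -> list[int]:
    """Symmetric linear quantization of floats into the range [-127, 127]."""
    max_abs = max(abs(x) for x in vec) or 1.0  # avoid dividing by zero
    return [round(x / max_abs * 127) for x in vec]


def to_binary(vec: list[float]) -> list[int]:
    """One bit per dimension: 1 for positive components, 0 otherwise."""
    return [1 if x > 0 else 0 for x in vec]


# Toy 4-dimensional embedding (illustrative values only).
embedding = [0.4, -0.25, 0.0, 1.0]
print(to_int8(embedding))    # [51, -32, 0, 127]
print(to_binary(embedding))  # [1, 0, 0, 1]
```

Both forms trade some retrieval accuracy for storage: int8 uses one byte per dimension, and binary uses one bit.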
GPT-4o
OpenAI
OpenAI's flagship multimodal model combining text, vision, and audio capabilities. GPT-4o (omni) processes and generates text, images, and audio natively, delivering fast, intelligent responses across modalities.
Context
128K tokens
Params
Unknown
Pricing
$2.50/1M input, $10/1M output
Benchmark Score
Mistral Large
Mistral
Mistral's flagship commercial model delivering top-tier performance on reasoning, coding, and multilingual tasks. Mistral Large competes directly with GPT-4o and Claude 3.5 Sonnet, with particularly strong performance on European languages.
Context
128K tokens
Params
Unknown
Pricing
$2/1M input, $6/1M output
Benchmark Score
DeepSeek V3
DeepSeek
A groundbreaking open-source model that rivals closed-source giants at a fraction of the training cost. DeepSeek V3 uses a mixture-of-experts architecture with 671B total parameters and achieves remarkable efficiency through innovative training techniques.
Context
128K tokens
Params
671B (37B active)
Pricing
$0.27/1M input, $1.10/1M out...
Benchmark Score
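The "671B (37B active)" figure in the DeepSeek V3 card reflects how a mixture-of-experts model uses only part of its weights for each token. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Fraction of a mixture-of-experts model's parameters used per token."""
    if total_params_b <= 0:
        raise ValueError("total_params_b must be positive")
    return active_params_b / total_params_b


# Figures from the card above: 671B total, 37B active.
print(f"{active_fraction(671, 37):.1%}")  # 5.5%
```

Per-token compute scales roughly with the active count, while memory requirements still depend on the total.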
Flux 1.1 Pro
Black Forest Labs
A state-of-the-art image generation model from the creators of Stable Diffusion. Flux 1.1 Pro produces exceptionally detailed and coherent images with strong prompt adherence and photorealistic quality, quickly becoming a favorite among AI artists.
Context
N/A
Params
12B
Pricing
$0.04/image via API
Benchmark Score
Sora
OpenAI
OpenAI's text-to-video model capable of generating realistic, high-quality videos up to 1 minute long. Sora understands physical world dynamics, object permanence, and cinematic techniques, producing coherent videos from text prompts.
Context
N/A
Params
Unknown
Pricing
Included with ChatGPT Plus (...
Benchmark Score
Suno v4
Suno
A revolutionary AI music generation platform that creates full songs with vocals, instruments, and lyrics from text prompts. Suno v4 produces remarkably human-sounding music across genres, from pop and rock to classical and hip-hop.
Context
N/A
Params
Unknown
Pricing
$10-30/month subscription
Benchmark Score
o3-mini
OpenAI
A smaller, faster version of o3 that still excels at reasoning tasks. o3-mini provides strong mathematical and coding capabilities at a more accessible price point, with adjustable reasoning effort levels.
Context
200K tokens
Params
Unknown
Pricing
$1.10/1M input, $4.40/1M out...
Benchmark Score
Llama 3.3 70B
Meta
Meta's latest open-weight model offering performance comparable to the larger Llama 3.1 405B at a fraction of the compute cost. Llama 3.3 70B excels at instruction following, coding, and reasoning tasks.
Context
128K tokens
Params
70B
Pricing
Free (self-hosted) or ~$0.50...
Benchmark Score
Imagen 3
Google
Google's highest quality image generation model, available through Gemini and the Imagen API. Imagen 3 excels at photorealism, artistic styles, and accurate text rendering, with strong safety measures built in.
Context
N/A
Params
Unknown
Pricing
$0.03/image via Vertex AI
Benchmark Score
Gemini 1.5 Pro
Google
The previous generation flagship from Google, still widely used for its reliable performance and massive context window. Gemini 1.5 Pro handles long-form content, code analysis, and multimodal tasks effectively.
Context
2M tokens
Params
Unknown
Pricing
$1.25/1M input, $5/1M output
Benchmark Score
Codex CLI
OpenAI
OpenAI's open-source command-line coding agent that uses o3 and o4-mini models. Codex CLI can read files, write code, execute commands, and iterate on tasks in your terminal with configurable autonomy levels.
Context
200K tokens
Params
Uses o3/o4-mini
Pricing
Free (open-source), uses Ope...
Benchmark Score
DALL-E 3
OpenAI
OpenAI's latest image generation model, tightly integrated with ChatGPT. DALL-E 3 excels at following complex prompts accurately, rendering text in images, and producing coherent compositions without the need for prompt engineering.
Context
N/A
Params
Unknown
Pricing
$0.040/image (standard) - $0...
Benchmark Score
Runway Gen-3 Alpha
Runway
Runway's latest text-to-video and image-to-video model used by professional filmmakers and studios. Gen-3 Alpha delivers impressive cinematic quality with strong temporal consistency, motion control, and artistic style preservation.
Context
N/A
Params
Unknown
Pricing
$12-76/month subscription
Benchmark Score
Udio
Udio
An AI music generation platform that rivals Suno in creating high-quality, genre-diverse music. Udio excels at producing musically coherent tracks with impressive vocal synthesis and instrumental arrangements from simple text descriptions.
Context
N/A
Params
Unknown
Pricing
$10-30/month subscription
Benchmark Score
Gemini 2.0 Flash
Google
A fast, efficient model optimized for high-throughput applications. Gemini 2.0 Flash delivers impressive quality at very low latency and cost, with multimodal input support and native tool use capabilities.
Context
1M tokens
Params
Unknown
Pricing
$0.10/1M input, $0.40/1M out...
Benchmark Score
Qwen 2.5 72B
Alibaba
Alibaba's flagship open-weight model, delivering performance competitive with commercial models. Qwen 2.5 72B excels at multilingual tasks with strong support for Chinese, English, and many other languages, plus solid coding and math capabilities.
Context
128K tokens
Params
72B
Pricing
Free (self-hosted) or ~$0.30...
Benchmark Score
Kling 1.6
Kuaishou
A video generation model from Chinese tech giant Kuaishou, known for producing high-quality, physics-aware videos. Kling 1.6 can generate videos up to 2 minutes long with impressive motion coherence and visual fidelity.
Context
N/A
Params
Unknown
Pricing
Credit-based pricing, free t...
Benchmark Score
Command R+
Cohere
Cohere's most powerful model, purpose-built for enterprise RAG applications, tool use, and complex agentic workflows. Command R+ excels at grounded generation with citations, making it ideal for business applications requiring verifiable outputs.
Context
128K tokens
Params
Unknown
Pricing
$2.50/1M input, $10/1M output
Benchmark Score
Ideogram 2.0
Ideogram
A specialized image generation model that leads the industry in text rendering within images. Ideogram 2.0 produces clean, accurate text in signs, logos, and typography-heavy images that other models struggle with.
Context
N/A
Params
Unknown
Pricing
$8-60/month subscription or ...
Benchmark Score
Claude 3.5 Haiku
Anthropic
The fastest model in Anthropic's lineup, designed for high-throughput, low-latency applications. Despite its speed, Haiku delivers impressive quality that rivals many larger models, making it ideal for real-time applications.
Context
200K tokens
Params
Unknown
Pricing
$0.80/1M input, $4/1M output
Benchmark Score
Mixtral 8x22B
Mistral
A powerful mixture-of-experts model that activates 39B parameters per forward pass from a total of 141B. Mixtral 8x22B delivers near-flagship performance while being open-weight, making it popular for self-hosted deployments.
Context
64K tokens
Params
141B (39B active)
Pricing
Free (self-hosted) or ~$0.60...
Benchmark Score
DeepSeek Coder V2
DeepSeek
An open-source code-focused model that achieves GPT-4 Turbo-level coding performance. DeepSeek Coder V2 uses a mixture-of-experts architecture for efficient inference and supports 338 programming languages.
Context
128K tokens
Params
236B (21B active)
Pricing
Free (open-source)
Benchmark Score
Nomic Embed
Nomic AI
A fully open-source text embedding model with published training code, data, and weights. Nomic Embed delivers competitive performance while being completely transparent and self-hostable, with a generous 8K token context window.
Context
8,192 tokens
Params
137M
Pricing
Free (open-source) or $0.10/...
Benchmark Score
Mistral Medium
Mistral
A balanced mid-tier model from Mistral offering strong performance at a moderate price point. Suitable for production workloads that need better quality than small models without the cost of flagship models.
Context
128K tokens
Params
Unknown
Pricing
$1/1M input, $3/1M output
Benchmark Score
Qwen 2.5 Coder
Alibaba
A code-specialized variant of Qwen 2.5 optimized for programming tasks. Available in multiple sizes (0.5B to 32B), it excels at code generation, completion, and debugging across many programming languages.
Context
128K tokens
Params
32B
Pricing
Free (self-hosted)
Benchmark Score
Stable Diffusion 3.5
Stability AI
The latest open-weight image generation model from Stability AI. SD 3.5 delivers high-quality images with improved text rendering, better anatomy, and multiple model sizes (Large, Medium, Turbo) for different use cases.
Context
N/A
Params
8B (Large), 2.6B (Medium)
Pricing
Free (self-hosted) or $0.035...
Benchmark Score
GPT-4o mini
OpenAI
A cost-efficient small model from OpenAI that punches above its weight class. GPT-4o mini offers strong performance at a fraction of the cost, making it ideal for high-volume applications and budget-conscious deployments.
Context
128K tokens
Params
Unknown
Pricing
$0.15/1M input, $0.60/1M out...
Benchmark Score
Pika 2.0
Pika
A consumer-friendly video generation platform that excels at short-form creative videos. Pika 2.0 features innovative effects like Pikaffects for adding dramatic transformations, explosions, and cinematic effects to videos.
Context
N/A
Params
Unknown
Pricing
$8-58/month subscription
Benchmark Score
Phi-4
Microsoft
Microsoft's small language model that punches far above its weight. Phi-4 achieves performance comparable to much larger models through high-quality training data and synthetic data techniques, making it ideal for on-device and edge scenarios.
Context
16K tokens
Params
14B
Pricing
Free (self-hosted)
Benchmark Score
Command R
Cohere
A cost-effective model from Cohere optimized for retrieval-augmented generation and conversational search. Command R provides solid RAG performance at a lower price point than Command R+.
Context
128K tokens
Params
Unknown
Pricing
$0.15/1M input, $0.60/1M out...
Benchmark Score
CodeLlama 70B
Meta
Meta's code-specialized large language model based on Llama 2, fine-tuned on 500B+ tokens of code data. CodeLlama 70B excels at code completion, generation, and infilling across many programming languages.
Context
100K tokens
Params
70B
Pricing
Free (self-hosted)
Benchmark Score
MusicGen
Meta
Meta's open-source music generation model that creates instrumental music from text descriptions or melody inputs. MusicGen is self-hostable and supports melody conditioning, making it valuable for developers building music AI applications.
Context
N/A
Params
3.3B
Pricing
Free (open-source)
Benchmark Score
Mistral Small
Mistral
Mistral's most cost-efficient commercial model, optimized for low-latency and high-throughput applications. Ideal for tasks that need decent quality at minimal cost, such as classification, extraction, and simple generation.
Context
128K tokens
Params
Unknown
Pricing
$0.20/1M input, $0.60/1M out...
Benchmark Score
StarCoder2
BigCode
An open-source code LLM trained on The Stack v2, one of the largest permissively licensed code datasets. StarCoder2 supports 600+ programming languages and is designed for transparent, responsible code generation.
Context
16K tokens
Params
15B
Pricing
Free (open-source, BigCode O...
Benchmark Score
Llama 3.2 8B
Meta
A lightweight open-weight model designed for edge deployment and resource-constrained environments. Llama 3.2 8B offers solid performance for its size, supporting on-device inference and privacy-first applications.
Context
128K tokens
Params
8B
Pricing
Free (self-hosted)
Benchmark Score
Model specifications, pricing, and benchmark scores are based on publicly available information as of March 2026. Benchmark scores are normalized across different evaluation frameworks for comparison purposes. Actual performance may vary based on specific use cases and configurations. Pricing is subject to change by providers. All trademarks belong to their respective owners.