AI Model Directory: Compare 53+ Models (LLMs, Image, Video, Code) | CodeLeap

53 models found

Claude 4 Opus

Anthropic

LLM | Commercial

Anthropic's most capable model, excelling at complex reasoning, coding, math, and creative writing. Claude 4 Opus posts strong results on graduate-level reasoning benchmarks and is a leading choice for agentic coding tasks.

Context

200K tokens

Params

Unknown

Pricing

$15/1M input, $75/1M output

Benchmark Score

Best-in-class reasoning and coding; Exceptional instruction following; Strong safety and alignment
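Per-token pricing such as "$15/1M input, $75/1M output" is easier to compare once it is turned into a cost per request. The sketch below shows that arithmetic for the prices listed in this entry; the function name and the token counts are illustrative assumptions, not part of any provider's SDK.

```python
def estimate_cost_usd(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Estimate one request's cost in USD from per-1M-token prices."""
    input_cost = (input_tokens / 1_000_000) * input_price_per_m
    output_cost = (output_tokens / 1_000_000) * output_price_per_m
    return input_cost + output_cost


# Listed prices for this entry: $15 per 1M input tokens, $75 per 1M output tokens.
# The token counts are made-up example values.
cost = estimate_cost_usd(10_000, 2_000, 15.0, 75.0)
print(f"${cost:.2f}")  # prints $0.30
```

The same function works for any entry in the directory that lists separate input and output prices.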

o3

OpenAI

LLM | Commercial

OpenAI's most advanced reasoning model, using extended thinking to solve complex math, science, and coding problems. o3 achieves state-of-the-art results on competition-level mathematics and PhD-level science benchmarks.

Context

200K tokens

Params

Unknown

Pricing

$10/1M input, $40/1M output

Benchmark Score

Best-in-class reasoning; Exceptional math performance; Strong scientific understanding

Claude Code

Anthropic

Code | Commercial

Anthropic's agentic coding tool that operates directly in your terminal. Claude Code can read entire codebases, write and edit files, run commands, manage git workflows, and execute multi-step engineering tasks autonomously.

Context

200K tokens

Params

Uses Claude 4 Opus/Sonnet

Pricing

Included with Claude Pro ($2...

Benchmark Score

Full terminal and filesystem access; Understands entire codebases; Autonomous multi-step execution

ElevenLabs

Audio | Commercial

The industry-leading text-to-speech platform producing hyper-realistic, emotionally expressive voice synthesis. ElevenLabs supports 32 languages, voice cloning, and real-time streaming, making it the gold standard for AI voice generation.

Context

N/A

Params

Unknown

Pricing

$5-330/month subscription or...

Benchmark Score

Most natural-sounding voices; Voice cloning capability; 32 language support

Grok 3

xAI

LLM | Commercial

xAI's flagship model trained on one of the largest GPU clusters ever assembled. Grok 3 achieves top-tier performance on math, science, and coding benchmarks, with real-time access to X (Twitter) data for up-to-date information.

Context

128K tokens

Params

Unknown

Pricing

$3/1M input, $15/1M output

Benchmark Score

Top-tier benchmark scores; Real-time X data access; Strong reasoning and math

Claude 4 Opus (Multimodal)

Anthropic

Multimodal | Commercial

Claude 4 Opus with vision capabilities for understanding images, charts, diagrams, and screenshots. While primarily text-focused, Claude's visual understanding excels at document analysis, code screenshot interpretation, and data visualization comprehension.

Context

200K tokens

Params

Unknown

Pricing

$15/1M input, $75/1M output

Benchmark Score

Excellent document understanding; Strong chart and data interpretation; Code screenshot analysis

Claude 4 Sonnet

Anthropic

LLM | Commercial

The ideal balance of speed, cost, and intelligence. Claude 4 Sonnet matches or exceeds many competitors' flagship models while being significantly faster and cheaper than Opus. Excellent for production workloads.

Context

200K tokens

Params

Unknown

Pricing

$3/1M input, $15/1M output

Benchmark Score

Excellent speed-to-quality ratio; Strong coding capabilities; Cost-effective for production

Midjourney v6.1

Midjourney

Image | Commercial

The leading AI image generation model known for producing stunning, artistic, and photorealistic visuals. Midjourney v6.1 delivers exceptional aesthetic quality and is the go-to tool for professional creatives and designers.

Context

N/A

Params

Unknown

Pricing

$10-60/month subscription

Benchmark Score

Best aesthetic quality; Stunning photorealism; Strong artistic styles

text-embedding-3-large

OpenAI

Embedding | Commercial

OpenAI's most capable text embedding model for converting text into high-dimensional vectors. With 3,072 dimensions and Matryoshka Representation Learning, it supports flexible dimension reduction while maintaining strong semantic capture.

Context

8,191 tokens

Params

Unknown

Pricing

$0.13/1M tokens

Benchmark Score

High-quality semantic representations; Flexible dimension sizes; Strong benchmark performance
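The "flexible dimension reduction" mentioned in this entry is commonly done with Matryoshka-style embeddings by keeping a prefix of the vector and rescaling it to unit length. The sketch below illustrates that step on a toy vector; the function name is hypothetical, and whether this matches what the provider does server-side is an assumption.

```python
import math


def truncate_embedding(vec, dims):
    """Keep the first `dims` components and rescale the result to unit length."""
    if not 0 < dims <= len(vec):
        raise ValueError("dims must be between 1 and len(vec)")
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # nothing to rescale
    return [x / norm for x in head]


# Toy 4-dimensional unit vector; a real vector from this model has 3,072 components.
full = [0.6, 0.0, 0.8, 0.0]
short = truncate_embedding(full, 2)
print(len(short))
```

Renormalizing matters because cosine or dot-product similarity on truncated vectors would otherwise be skewed by the discarded components.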

Gemini 2.0 Pro

Google

LLM | Commercial

Google's most capable model in the Gemini 2.0 family, delivering strong performance across coding, math, and multimodal tasks. Features native tool use, grounding with Google Search, and a massive context window for complex workflows.

Context

2M tokens

Params

Unknown

Pricing

$1.25/1M input, $10/1M output

Benchmark Score

Massive 2M token context window; Strong multimodal capabilities; Native Google Search grounding

DeepSeek R1

DeepSeek

LLM | Open Source

DeepSeek's reasoning-focused model that matches OpenAI o1's performance through innovative reinforcement learning. DeepSeek R1 demonstrates extended chain-of-thought reasoning with full transparency into its thinking process.

Context

128K tokens

Params

671B (37B active)

Pricing

$0.55/1M input, $2.19/1M out...

Benchmark Score

Open-source reasoning model; Transparent chain-of-thought; Matches o1 on key benchmarks

Gemini 2.0 Pro (Multimodal)

Google

Multimodal | Commercial

Google's natively multimodal model that understands and generates across text, images, audio, and video. Gemini 2.0 Pro's multimodal capabilities are built from the ground up, with a massive 2M token context window for processing long videos and documents.

Context

2M tokens

Params

Unknown

Pricing

$1.25/1M input, $10/1M output

Benchmark Score

Massive 2M context for video/documents; Natively multimodal architecture; Strong video understanding

Voyage AI

Embedding | Commercial

Specialized embedding models optimized for code and technical content retrieval. Voyage AI offers domain-specific models (voyage-code-3, voyage-3) that consistently rank at the top of code retrieval benchmarks.

Context

16K tokens

Params

Unknown

Pricing

$0.06/1M tokens (voyage-3) -...

Benchmark Score

Best-in-class code embeddings; Long context support (16K); Domain-specific models

GPT-4.5

OpenAI

LLM | Commercial

OpenAI's largest and most knowledgeable model, focused on unsupervised learning improvements. GPT-4.5 delivers reduced hallucinations, better world knowledge, and improved emotional intelligence compared to GPT-4o.

Context

128K tokens

Params

Unknown

Pricing

$75/1M input, $150/1M output

Benchmark Score

Reduced hallucinations; Broad world knowledge; Strong emotional intelligence

Veo 2

Google

Video | Commercial

Google's most advanced video generation model, capable of producing 4K resolution videos with cinematic quality. Veo 2 understands physics, lighting, and human motion, with support for various cinematic styles and camera movements.

Context

N/A

Params

Unknown

Pricing

Available through VideoFX an...

Benchmark Score

4K resolution output; Excellent cinematic quality; Strong physics understanding

Whisper v3

OpenAI

Audio | Open Source

OpenAI's open-source speech recognition model that delivers near-human accuracy across 100+ languages. Whisper v3 handles transcription, translation, and language detection with remarkable robustness to accents, background noise, and technical jargon.

Context

30 seconds audio chunks

Params

1.5B

Pricing

$0.006/minute via API or Fre...

Benchmark Score

Open-source and self-hostable; 100+ language support; Robust to noise and accents
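Because this entry lists its context as 30-second audio chunks, longer recordings have to be split before transcription. The sketch below computes simple non-overlapping chunk boundaries; the helper name is hypothetical, and real pipelines often use overlapping or timestamp-guided windows instead, so treat this as an illustration of the arithmetic only.

```python
import math

CHUNK_SECONDS = 30  # window length listed in the entry above


def chunk_bounds(duration_seconds, chunk_seconds=CHUNK_SECONDS):
    """Return (start, end) offsets in seconds that cover the whole recording."""
    count = math.ceil(duration_seconds / chunk_seconds)
    return [
        (i * chunk_seconds, min((i + 1) * chunk_seconds, duration_seconds))
        for i in range(count)
    ]


# A 95-second recording needs four windows; the last one is shorter.
bounds = chunk_bounds(95)
print(bounds)  # prints [(0, 30), (30, 60), (60, 90), (90, 95)]
```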

GPT-4o (Multimodal)

OpenAI

Multimodal | Commercial

GPT-4o listed for its multimodal capabilities: it processes text, images, audio, and video natively within a single model. These capabilities enable cross-modal reasoning, image understanding, and real-time voice conversations.

Context

128K tokens

Params

Unknown

Pricing

$2.50/1M input, $10/1M output

Benchmark Score

Native multimodal processing; Real-time voice capabilities; Strong visual understanding

Cohere Embed v3

Cohere

Embedding | Commercial

Cohere's leading embedding model offering state-of-the-art performance for search and retrieval. Embed v3 supports 100+ languages, compression to binary/int8 formats, and separate search/document embedding types for optimal retrieval.

Context

512 tokens

Params

Unknown

Pricing

$0.10/1M tokens

Benchmark Score

100+ language support; Compression for efficient storage; Search vs. document embedding types
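To make the "compression to binary/int8 formats" idea concrete, the sketch below shows one common binary scheme: keep only the sign of each component and compare vectors by Hamming distance. The thresholding rule and the toy vectors are assumptions for illustration; the source does not say which scheme this provider uses.

```python
def to_binary(vec):
    """Map each component to one bit: 1 if it is positive, otherwise 0."""
    return [1 if x > 0 else 0 for x in vec]


def hamming(a, b):
    """Count the positions where two equal-length bit lists differ."""
    if len(a) != len(b):
        raise ValueError("bit lists must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Toy 4-dimensional float embeddings (made-up values).
doc = [0.12, -0.40, 0.33, -0.05]
query = [0.10, -0.22, -0.31, -0.09]

doc_bits = to_binary(doc)
query_bits = to_binary(query)
distance = hamming(doc_bits, query_bits)
print(doc_bits, query_bits, distance)  # prints [1, 0, 1, 0] [1, 0, 0, 0] 1
```

One bit per dimension is 32 times smaller than a 32-bit float per dimension, at the cost of some retrieval accuracy.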

GPT-4o

OpenAI

LLM | Commercial

OpenAI's flagship multimodal model combining text, vision, and audio capabilities. GPT-4o (omni) processes and generates text, images, and audio natively, delivering fast, intelligent responses across modalities.

Context

128K tokens

Params

Unknown

Pricing

$2.50/1M input, $10/1M output

Benchmark Score

Native multimodal capabilities; Fast inference speed; Strong all-around performance

Mistral Large

Mistral

LLM | Commercial

Mistral's flagship commercial model delivering top-tier performance on reasoning, coding, and multilingual tasks. Mistral Large competes directly with GPT-4o and Claude 3.5 Sonnet, with particularly strong performance on European languages.

Context

128K tokens

Params

Unknown

Pricing

$2/1M input, $6/1M output

Benchmark Score

Excellent multilingual support; Strong reasoning; Competitive pricing

DeepSeek V3

DeepSeek

LLM | Open Source

A groundbreaking open-source model that rivals closed-source giants at a fraction of the training cost. DeepSeek V3 uses a mixture-of-experts architecture with 671B total parameters and achieves remarkable efficiency through innovative training techniques.

Context

128K tokens

Params

671B (37B active)

Pricing

$0.27/1M input, $1.10/1M out...

Benchmark Score

Fully open-source; Exceptional cost efficiency; Strong coding and math
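The "671B (37B active)" notation means a mixture-of-experts model runs only a subset of its parameters for each token. The short sketch below works out that share for two mixture-of-experts entries in this directory, using the figures as listed; the function name is illustrative.

```python
def active_fraction(total_billion, active_billion):
    """Share of a mixture-of-experts model's parameters used per token."""
    return active_billion / total_billion


# (total, active) parameter counts in billions, as listed in this directory.
models = {
    "DeepSeek V3": (671, 37),
    "DeepSeek Coder V2": (236, 21),
}

for name, (total, active) in models.items():
    print(f"{name}: {active_fraction(total, active):.1%} of parameters active per token")
```

Per-token compute scales roughly with the active count, while memory requirements for self-hosting still depend on the total count.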

Flux 1.1 Pro

Black Forest Labs

Image | Commercial

A state-of-the-art image generation model from the creators of Stable Diffusion. Flux 1.1 Pro produces exceptionally detailed and coherent images with strong prompt adherence and photorealistic quality, quickly becoming a favorite among AI artists.

Context

N/A

Params

12B

Pricing

$0.04/image via API

Benchmark Score

Exceptional image quality; Strong prompt following; Fast generation speed

Sora

OpenAI

Video | Commercial

OpenAI's text-to-video model capable of generating realistic, high-quality videos up to 1 minute long. Sora understands physical world dynamics, object permanence, and cinematic techniques, producing coherent videos from text prompts.

Context

N/A

Params

Unknown

Pricing

Included with ChatGPT Plus (...

Benchmark Score

High visual quality; Good understanding of physics; Up to 1-minute videos

Suno v4

Suno

Audio | Commercial

A revolutionary AI music generation platform that creates full songs with vocals, instruments, and lyrics from text prompts. Suno v4 produces remarkably human-sounding music across genres, from pop and rock to classical and hip-hop.

Context

N/A

Params

Unknown

Pricing

$10-30/month subscription

Benchmark Score

Full song generation with vocals; Wide genre coverage; Impressive musical quality

o3-mini

OpenAI

LLM | Commercial

A smaller, faster version of o3 that still excels at reasoning tasks. o3-mini provides strong mathematical and coding capabilities at a more accessible price point, with adjustable reasoning effort levels.

Context

200K tokens

Params

Unknown

Pricing

$1.10/1M input, $4.40/1M out...

Benchmark Score

Strong reasoning at lower cost; Adjustable thinking effort; Fast for a reasoning model

Llama 3.3 70B

Meta

LLM | Open Weight

Meta's latest open-weight model offering performance comparable to the larger Llama 3.1 405B at a fraction of the compute cost. Llama 3.3 70B excels at instruction following, coding, and reasoning tasks.

Context

128K tokens

Params

70B

Pricing

Free (self-hosted) or ~$0.50...

Benchmark Score

Open weights for customization; Strong coding capabilities; Competitive with commercial models

Imagen 3

Google

Image | Commercial

Google's highest quality image generation model, available through Gemini and the Imagen API. Imagen 3 excels at photorealism, artistic styles, and accurate text rendering, with strong safety measures built in.

Context

N/A

Params

Unknown

Pricing

$0.03/image via Vertex AI

Benchmark Score

Excellent photorealism; Good text rendering; Strong safety filters

Gemini 1.5 Pro

Google

LLM | Commercial

The previous generation flagship from Google, still widely used for its reliable performance and massive context window. Gemini 1.5 Pro handles long-form content, code analysis, and multimodal tasks effectively.

Context

2M tokens

Params

Unknown

Pricing

$1.25/1M input, $5/1M output

Benchmark Score

2M token context window; Strong video understanding; Reliable and well-tested

Codex CLI

OpenAI

Code | Open Source

OpenAI's open-source command-line coding agent that uses o3 and o4-mini models. Codex CLI can read files, write code, execute commands, and iterate on tasks in your terminal with configurable autonomy levels.

Context

200K tokens

Params

Uses o3/o4-mini

Pricing

Free (open-source), uses Ope...

Benchmark Score

Open-source tool; Configurable autonomy levels; Strong reasoning with o3

DALL-E 3

OpenAI

Image | Commercial

OpenAI's latest image generation model, tightly integrated with ChatGPT. DALL-E 3 excels at following complex prompts accurately, rendering text in images, and producing coherent compositions without the need for prompt engineering.

Context

N/A

Params

Unknown

Pricing

$0.040/image (standard) - $0...

Benchmark Score

Excellent text rendering in images; Strong prompt adherence; Integrated with ChatGPT

Runway Gen-3 Alpha

Runway

Video | Commercial

Runway's latest text-to-video and image-to-video model used by professional filmmakers and studios. Gen-3 Alpha delivers impressive cinematic quality with strong temporal consistency, motion control, and artistic style preservation.

Context

N/A

Params

Unknown

Pricing

$12-76/month subscription

Benchmark Score

Professional-grade quality; Strong motion control; Image-to-video capability

Udio

Audio | Commercial

An AI music generation platform that rivals Suno in creating high-quality, genre-diverse music. Udio excels at producing musically coherent tracks with impressive vocal synthesis and instrumental arrangements from simple text descriptions.

Context

N/A

Params

Unknown

Pricing

$10-30/month subscription

Benchmark Score

High-quality music generation; Strong vocal synthesis; Diverse genre support

Gemini 2.0 Flash

Google

LLM | Commercial

A fast, efficient model optimized for high-throughput applications. Gemini 2.0 Flash delivers impressive quality at very low latency and cost, with multimodal input support and native tool use capabilities.

Context

1M tokens

Params

Unknown

Pricing

$0.10/1M input, $0.40/1M out...

Benchmark Score

Very fast inference; Extremely cost-effective; 1M token context window

Qwen 2.5 72B

Alibaba

LLM | Open Weight

Alibaba's flagship open-weight model delivering competitive performance with commercial models. Qwen 2.5 72B excels at multilingual tasks with strong support for Chinese, English, and many other languages, plus solid coding and math capabilities.

Context

128K tokens

Params

72B

Pricing

Free (self-hosted) or ~$0.30...

Benchmark Score

Excellent multilingual support; Open weights; Strong coding capabilities

Kling 1.6

Kuaishou

Video | Commercial

A video generation model from Chinese tech giant Kuaishou, known for producing high-quality, physics-aware videos. Kling 1.6 can generate videos up to 2 minutes long with impressive motion coherence and visual fidelity.

Context

N/A

Params

Unknown

Pricing

Credit-based pricing, free t...

Benchmark Score

Up to 2-minute videos; Good physics simulation; Free tier available

Command R+

Cohere

LLM | Commercial

Cohere's most powerful model, purpose-built for enterprise RAG applications, tool use, and complex agentic workflows. Command R+ excels at grounded generation with citations, making it ideal for business applications requiring verifiable outputs.

Context

128K tokens

Params

Unknown

Pricing

$2.50/1M input, $10/1M output

Benchmark Score

Best-in-class RAG performance; Built-in citation generation; Strong tool use capabilities

Ideogram 2.0

Ideogram

Image | Commercial

A specialized image generation model that leads the industry in text rendering within images. Ideogram 2.0 produces clean, accurate text in signs, logos, and typography-heavy images that other models struggle with.

Context

N/A

Params

Unknown

Pricing

$8-60/month subscription or ...

Benchmark Score

Best text rendering in images; Clean typography; Good design aesthetics

Claude 3.5 Haiku

Anthropic

LLM | Commercial

The fastest model in Anthropic's lineup, designed for high-throughput, low-latency applications. Despite its speed, Haiku delivers impressive quality that rivals many larger models, making it ideal for real-time applications.

Context

200K tokens

Params

Unknown

Pricing

$0.80/1M input, $4/1M output

Benchmark Score

Extremely fast inference; Very cost-effective; Strong for its size class

Mixtral 8x22B

Mistral

LLM | Open Weight

A mixture-of-experts model that activates 39B parameters per forward pass out of 141B in total. Mixtral 8x22B delivers near-flagship performance while being open-weight, making it popular for self-hosted deployments.

Context

64K tokens

Params

141B (39B active)

Pricing

Free (self-hosted) or ~$0.60...

Benchmark Score

Efficient MoE architecture; Open weights; Strong multilingual performance

DeepSeek Coder V2

DeepSeek

Code | Open Source

An open-source code-focused model that achieves GPT-4 Turbo-level coding performance. DeepSeek Coder V2 uses a mixture-of-experts architecture for efficient inference and supports 338 programming languages.

Context

128K tokens

Params

236B (21B active)

Pricing

Free (open-source)

Benchmark Score

Open-source; 338 language support; Efficient MoE architecture

Nomic Embed

Nomic AI

Embedding | Open Source

A fully open-source text embedding model with published training code, data, and weights. Nomic Embed delivers competitive performance while being completely transparent and self-hostable, with a generous 8K token context window.

Context

8,192 tokens

Params

137M

Pricing

Free (open-source) or $0.10/...

Benchmark Score

Fully open-source (training code + data + weights); Self-hostable; Good performance for size

Mistral Medium

Mistral

LLM | Commercial

A balanced mid-tier model from Mistral offering strong performance at a moderate price point. Suitable for production workloads that need better quality than small models without the cost of flagship models.

Context

128K tokens

Params

Unknown

Pricing

$1/1M input, $3/1M output

Benchmark Score

Good price-performance ratio; Solid multilingual support; Reliable for production use

Qwen 2.5 Coder

Alibaba

LLM | Open Weight

A code-specialized variant of Qwen 2.5 optimized for programming tasks. Available in multiple sizes (0.5B to 32B), it excels at code generation, completion, and debugging across many programming languages.

Context

128K tokens

Params

32B

Pricing

Free (self-hosted)

Benchmark Score

Strong code generation; Multiple size options; Open weights

Stable Diffusion 3.5

Stability AI

Image | Open Weight

The latest open-weight image generation model from Stability AI. SD 3.5 delivers high-quality images with improved text rendering, better anatomy, and multiple model sizes (Large, Medium, Turbo) for different use cases.

Context

N/A

Params

8B (Large), 2.5B (Medium)

Pricing

Free (self-hosted) or $0.035...

Benchmark Score

Open weights for customization; Multiple model sizes; Active fine-tuning community

GPT-4o mini

OpenAI

LLM | Commercial

A cost-efficient small model from OpenAI that punches above its weight class. GPT-4o mini offers strong performance at a fraction of the cost, making it ideal for high-volume applications and budget-conscious deployments.

Context

128K tokens

Params

Unknown

Pricing

$0.15/1M input, $0.60/1M out...

Benchmark Score

Extremely affordable; Fast response times; Good quality for price point

Pika 2.0

Pika

Video | Commercial

A consumer-friendly video generation platform that excels at short-form creative videos. Pika 2.0 features innovative effects like Pikaffects for adding dramatic transformations, explosions, and cinematic effects to videos.

Context

N/A

Params

Unknown

Pricing

$8-58/month subscription

Benchmark Score

Unique creative effects (Pikaffects); User-friendly interface; Fast generation

Phi-4

Microsoft

LLM | Open Weight

Microsoft's small language model that punches far above its weight. Phi-4 achieves performance comparable to much larger models through high-quality training data and synthetic data techniques, making it ideal for on-device and edge scenarios.

Context

16K tokens

Params

14B

Pricing

Free (self-hosted)

Benchmark Score

Exceptional size-to-performance ratio; Runs on consumer hardware; Open weights (MIT license)

Command R

Cohere

LLM | Commercial

A cost-effective model from Cohere optimized for retrieval-augmented generation and conversational search. Command R provides solid RAG performance at a lower price point than Command R+.

Context

128K tokens

Params

Unknown

Pricing

$0.15/1M input, $0.60/1M out...

Benchmark Score

Excellent for RAG; Very cost-effective; Good multilingual support

CodeLlama 70B

Meta

Code | Open Weight

Meta's code-specialized large language model based on Llama 2, fine-tuned on 500B+ tokens of code data. CodeLlama 70B excels at code completion, generation, and infilling across many programming languages.

Context

100K tokens

Params

70B

Pricing

Free (self-hosted)

Benchmark Score

Open weights; Specialized for code; Good fill-in-the-middle capability

MusicGen

Meta

Audio | Open Source

Meta's open-source music generation model that creates instrumental music from text descriptions or melody inputs. MusicGen is self-hostable and supports melody conditioning, making it valuable for developers building music AI applications.

Context

N/A

Params

3.3B

Pricing

Free (open-source)

Benchmark Score

Fully open-source; Melody conditioning support; Self-hostable

Mistral Small

Mistral

LLM | Commercial

Mistral's most cost-efficient commercial model, optimized for low-latency and high-throughput applications. Ideal for tasks that need decent quality at minimal cost, such as classification, extraction, and simple generation.

Context

128K tokens

Params

Unknown

Pricing

$0.20/1M input, $0.60/1M out...

Benchmark Score

Very cost-effective; Fast inference; Good for simple tasks

StarCoder2

BigCode

Code | Open Source

An open-source code LLM trained on The Stack v2, one of the largest permissively licensed code datasets. StarCoder2 supports 600+ programming languages and is designed for transparent, responsible code generation.

Context

16K tokens

Params

15B

Pricing

Free (open-source, BigCode O...

Benchmark Score

Truly open-source with clear licensing; 600+ language support; Trained on permissive code

Llama 3.2 8B

Meta

LLM | Open Weight

A lightweight open-weight model designed for edge deployment and resource-constrained environments. Llama 3.2 8B offers solid performance for its size, supporting on-device inference and privacy-first applications.

Context

128K tokens

Params

8B

Pricing

Free (self-hosted)

Benchmark Score

Runs on consumer hardware; Open weights; Fast inference

Model specifications, pricing, and benchmark scores are based on publicly available information as of March 2026. Benchmark scores are normalized across different evaluation frameworks for comparison purposes. Actual performance may vary based on specific use cases and configurations. Pricing is subject to change by providers. All trademarks belong to their respective owners.