Models
A reference catalog of the AI models we use, recommend, or compare in the playbook: what each model is for, where to try it, and when we last verified the entry.
This catalog lists 77 models.
Gemini 3.1 Flash-Lite
Google's low-latency Gemini 3-series workhorse for straightforward multimodal tasks at scale. It is designed for high-frequency agent routing, extraction, translation, and summarization work.
Gemini 3.1 Flash Live Preview
Google's current low-latency audio-to-audio Live API model for real-time dialogue and voice-first applications. It replaces the earlier Gemini Live surface with the Gemini 3.1 stack.
Gemini 3.1 Pro Preview
Google's top Gemini 3-series model for advanced reasoning, coding, and agentic workflows. It improves the Gemini 3 Pro line with better thinking, tool use, and factual consistency.
Gemini 3 Flash Preview
Google's fast Gemini 3-series model. It targets frontier-class multimodal understanding and agentic coding behavior at a lower cost tier than Pro.
Gemma 4 31B
Google's flagship open-weight Gemma 4. Natively multimodal (text + vision); supersedes Gemma 3 with significantly improved capabilities and a larger size.
Veo 3.1
Google's flagship video generation model. Adds advanced creative controls and improved prompt adherence on top of the Veo 3 native-audio foundation.
Veo 3.1 Lite
Google's efficient, developer-first variant of Veo 3.1. Lower cost and faster generation; same family as the main 3.1 model with reduced fidelity for tighter feedback loops.
DeepSeek V4 Flash
DeepSeek's efficient tier of the V4 generation. Faster and cheaper than V4 Pro; the practical default for high-throughput agentic workloads.
DeepSeek V4 Pro
DeepSeek's flagship general-purpose MoE model. Successor to V3; competitive with closed frontier-tier models at open-weights cost.
Qwen 3.5 122B
Alibaba's flagship open-weight model from the Qwen 3.5 generation. MoE architecture with 122B total / 10B active parameters; native multimodal (text + vision).
GPT Image 2
OpenAI's current image generation and editing model. It replaces the older DALL·E line with a single state-of-the-art model for image creation and edits.
Mistral Medium 3.5
Mistral's frontier-class multimodal model for agentic and coding use cases. It sits between the largest flagship tier and the lighter Small line while remaining open-weight.
Mistral Small 4
Mistral's hybrid open model that unifies instruct, reasoning, and coding in a single efficient line. It is the current small generalist flagship in Mistral's open lineup.
FLUX.2 [dev]
Black Forest Labs' next-generation open-weight image model. Supersedes FLUX.1 [dev] with improved quality and control; remains the open-weights choice for self-hosted image generation.
Claude 4.7 Opus
Anthropic's flagship model — the strongest Claude variant for analysis, long-context reasoning, and complex agentic work. Default choice when you want the highest-quality Claude output.
OCR 3
Mistral's current OCR service for its Document AI stack. It extracts interleaved text and images from documents and replaces the older Mistral OCR line.
Devstral 2
Mistral's frontier code-agents model for software engineering tasks. It is designed for tool-heavy coding workflows across whole repositories and multi-file edits.
Ministral 3 14B
The largest model in Mistral's Ministral 3 family. It is built for local deployment with strong text and vision performance on diverse hardware.
Mistral Large 3
Mistral AI's flagship model. Successor to Mistral Large 2 with improved multilingual coverage and reasoning. EU-jurisdiction provider.
Claude Haiku 4.5
Anthropic's fast, cheap tier. The right choice for high-throughput agentic work and tasks where latency matters more than depth.
Claude 4.6 Sonnet
Anthropic's mid-tier model — the practical default for production workloads. Balances quality and cost for most applications.
Magistral Medium 1.2
Mistral's frontier-class multimodal reasoning model. It is the dedicated reasoning line for deeper multi-step analysis where Mistral Small 4 or Medium 3.5 would be too shallow.
Gemini 2.5 Deep Think
Google's enhanced reasoning mode on top of Gemini 2.5 Pro. Trades latency for depth on hard math, science, and multi-step problem-solving.
Codestral 25.08
Mistral's current code-completion model, released at the end of July 2025. It is tuned for low-latency fill-in-the-middle and high-frequency code generation tasks.
Llama 4 Maverick
Meta's flagship Llama 4 model — natively multimodal, larger MoE architecture than Scout. The Llama 4 frontier-tier entry.
Llama 4 Scout
Meta's small/efficient Llama 4 variant — natively multimodal MoE architecture. The practical Llama 4 entry point for self-hosted multimodal applications.
o3
OpenAI's flagship reasoning model. Uses extended chain-of-thought before answering, trading latency for depth on math, science, and complex coding tasks.
o4-mini
OpenAI's small reasoning model. Faster and cheaper than o3 while keeping the chain-of-thought architecture; the practical default for routine reasoning tasks.
Kling 2.0
Kuaishou's video generation model. Strong on human motion and physical realism; popular for portrait and character-driven generation.
GPT-4.1
OpenAI's developer-first frontier model for coding, instruction following, and long-context work. It is the API-oriented successor line to older GPT-4 variants.
GPT-4.1 mini
OpenAI's smaller GPT-4.1 variant. It keeps the 1M-token context window while lowering cost and latency enough for high-volume agent and application workloads.
Gemini 2.5 Flash
Google's speed-optimised tier. Cheap and fast multimodal, with a generous free tier on AI Studio for prototyping.
Gemini 2.5 Pro
Google's flagship multimodal model. Massive context window and competitive frontier-tier performance, with extended thinking on demand.
GPT-4o mini TTS
OpenAI's current text-to-speech model, built on GPT-4o mini. It replaces the older tts-1 line with better quality and a newer multimodal stack.
DeepSeek R1
DeepSeek's open-weight reasoning model. Released with full weights and a permissive MIT license — the first competitive open reasoning model.
Pika 2
Pika's video generation model. Differentiates with Scene Ingredients (drop in characters/objects across shots) — the right pick when you need character consistency across clips.
Phi-4
Microsoft's small model trained heavily on synthetic data. Punches above its 14B weight on reasoning and math; MIT-licensed and runs locally.
Sora
OpenAI's flagship video generation model. Up to 20-second clips at 1080p with strong prompt adherence and physics simulation.
Llama 3.3 70B
Meta's newest dense 70B open-weight model. Approaches frontier-tier performance on English-centric tasks while remaining self-hostable.
Qwen QwQ-32B
Alibaba's open-weight reasoning model. 32B parameters with a permissive Apache 2.0 license — the practical reasoning model that fits on a workstation.
Suno v4
Suno's flagship music generation model. Produces full songs (vocals + instrumentation) from a text prompt — the most polished consumer music AI.
Qwen2.5-Coder 32B
Alibaba's flagship open code model. 32B parameters and Apache 2.0 — the strongest open coding model that fits on a single workstation GPU.
Stable Diffusion 3.5
Stability AI's open image-gen family. Three sizes (Large, Large Turbo, Medium) — runs locally on consumer GPUs and supports a massive ecosystem of LoRAs and ControlNets.
Whisper large-v3 Turbo
OpenAI's distilled Whisper variant. ~8× faster than large-v3 with most of the accuracy retained — the practical default for high-throughput STT pipelines.
Llama 3.2 Vision
Meta's open-weight vision-language family. 11B and 90B variants — the practical open-weights vision model for self-hosted multimodal applications.
Qwen 2.5 72B
Alibaba's flagship open-weight model. Strong on coding, math, and Chinese-language tasks; competitive with Llama 3.3 70B on Western benchmarks.
Qwen 2.5 7B
Alibaba's small-tier open model. Apache-licensed, runs on consumer hardware, and remains competitive with other 7B-class models on coding and math.
Voyage 3
Voyage AI's flagship embedding model. Top of MTEB across many tasks; the embedding service Anthropic recommends for Claude RAG workloads.
NVIDIA Parakeet
NVIDIA's English-focused STT model. Top of HF Open ASR Leaderboard for English and very fast on NVIDIA hardware via NeMo.
FLUX.1 [pro]
Black Forest Labs' commercial flagship. Built on the same architecture as FLUX.1 [dev] with higher-quality tuning; the closed tier when you need commercial-use rights.
Llama 3.1 8B
Meta's small open-weight model. Runs on consumer hardware (16GB GPU or modern laptop) and remains a strong default for local-first AI.
GPT-4o mini
OpenAI's small, cheap, fast frontier model. The default workhorse for high-volume tasks where GPT-4o would be overkill.
DeepSeek-Coder V2
DeepSeek's open code model. MoE architecture with strong coverage across 338 programming languages; the open-weights coder of choice for high-end self-hosting.
Runway Gen-3
Runway's video generation model. Strong creative-tooling ecosystem (motion brush, camera control, style transfer); the production tool of choice for many video creators.
Luma Dream Machine
Luma AI's video generation model. Strong on cinematic camera moves and 3D-aware generation; the right pick when production-feel camera language matters.
GPT-4o
OpenAI's flagship multimodal model — text, vision, and realtime voice in one model. The default "omni" frontier model.
Cohere Rerank v3
Cohere's flagship reranker. The standard second-pass model after a vector or BM25 retrieval — bumps precision noticeably with minimal architecture changes.
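The second-pass pattern described here is provider-agnostic: fetch a broad candidate set with a cheap retriever, then rescore only the top candidates with the stronger model. A minimal control-flow sketch, where a trivial keyword-overlap scorer stands in for BM25/vector search and a placeholder `rerank` function stands in for the actual Rerank API call (both are illustrative, not Cohere's implementation):

```python
def first_pass_score(query: str, doc: str) -> float:
    """Cheap lexical overlap score (stands in for BM25 or vector search)."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / max(len(q_terms), 1)

def rerank(query: str, docs: list[str]) -> list[str]:
    """Placeholder for the cross-encoder rerank call; in production this
    would be a single API request over the small candidate set."""
    return sorted(docs, key=lambda d: first_pass_score(query, d), reverse=True)

def retrieve(query: str, corpus: list[str],
             fetch_k: int = 20, top_n: int = 3) -> list[str]:
    # Stage 1: cheap scoring over the whole corpus; keep fetch_k candidates.
    candidates = sorted(corpus, key=lambda d: first_pass_score(query, d),
                        reverse=True)[:fetch_k]
    # Stage 2: expensive second-pass rescoring over the candidates only.
    return rerank(query, candidates)[:top_n]

corpus = [
    "the cat sat on the mat",
    "reranking improves retrieval precision",
    "vector search finds nearest neighbors",
    "the weather today is sunny",
]
results = retrieve("how does reranking improve retrieval", corpus, top_n=1)
```

The key cost lever is `fetch_k`: the reranker only ever sees that many documents per query, so first-pass recall bounds final quality.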
Udio
Suno's main competitor. Music generation with strong genre control and a focus on song-structure quality.
Stable Audio 2.5
Stability AI's audio generation model. Generates music tracks, sound effects, and audio loops from text prompts — Stability's alternative to Suno/Udio.
BGE Reranker v2
BAAI's open reranker. Apache 2.0 weights, multiple sizes, and the open-weights default for self-hosted RAG pipelines that need a second-pass reranker.
StarCoder2 15B
BigCode's collaborative code model. 15B parameters trained on 600+ programming languages; strong fit for IDE completion and self-hosted code search.
BGE-M3
BAAI's multi-functional embedding model. Supports dense, sparse, and multi-vector retrieval in one model — the strongest open-weights embedding option.
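Dense and sparse signals from a multi-vector model like this are typically consumed as a weighted score fusion. A toy sketch, with hand-written vectors and token-weight maps standing in for real encoder output (the `alpha` weight is an assumed tuning knob, not a BGE-M3 parameter):

```python
def dense_score(q_vec: list[float], d_vec: list[float]) -> float:
    """Dot product of (assumed unit-normalized) dense embeddings."""
    return sum(a * b for a, b in zip(q_vec, d_vec))

def sparse_score(q_weights: dict[str, float],
                 d_weights: dict[str, float]) -> float:
    """Lexical match: sum of weight products over shared tokens."""
    return sum(w * d_weights[t] for t, w in q_weights.items() if t in d_weights)

def hybrid_score(q: dict, d: dict, alpha: float = 0.7) -> float:
    """Weighted fusion of semantic (dense) and lexical (sparse) relevance."""
    return (alpha * dense_score(q["dense"], d["dense"])
            + (1 - alpha) * sparse_score(q["sparse"], d["sparse"]))

# Toy query/document representations standing in for real model output.
query = {"dense": [0.6, 0.8], "sparse": {"rerank": 1.2, "model": 0.4}}
doc = {"dense": [0.5, 0.86], "sparse": {"rerank": 0.9, "open": 0.3}}
score = hybrid_score(query, doc)
```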
OpenAI text-embedding-3-large
OpenAI's largest embedding model. 3072 dimensions, multilingual, and the default high-quality option for RAG and semantic search.
OpenAI text-embedding-3-small
OpenAI's small embedding tier. 1536 dimensions; the cheap default for most RAG and semantic-search workloads where quality is sufficient.
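Both tiers return plain float vectors, and the usual downstream operation is cosine similarity (OpenAI's embeddings come back unit-normalized, so this reduces to a dot product). A self-contained sketch with toy 4-dimensional vectors standing in for real 1536/3072-dimensional API output:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embeddings from the API.
query = [0.1, 0.3, 0.5, 0.1]
doc_close = [0.1, 0.28, 0.52, 0.1]  # near-duplicate meaning
doc_far = [0.9, -0.2, 0.05, 0.4]    # unrelated meaning

assert cosine_similarity(query, doc_close) > cosine_similarity(query, doc_far)
```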
Magnific AI
The premium 'creative upscaler'. Invents detail at high zoom factors using diffusion priors — the right choice when you want an upscale that adds fidelity rather than just enlarging pixels.
Midjourney v6
Midjourney's flagship image generator. Strong artistic quality and a distinctive aesthetic; bound to its Discord and web product, no public API.
Whisper large-v3
High-accuracy multilingual speech-to-text. Best-in-class for non-English audio; the de-facto open baseline.
Cohere Embed v3
Cohere's flagship embedding model. Strong multilingual coverage and built-in support for compressed (int8/binary) embeddings — useful for cost-sensitive RAG.
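The int8/binary compression mentioned here rests on standard quantization ideas that can be shown without the API; the quantizers below are toy versions of the general technique, not Cohere's implementation:

```python
def to_int8(vec: list[float], hi: float = 1.0) -> list[int]:
    """Scalar quantization: map floats in [-hi, hi] to ints in [-127, 127]."""
    return [max(-127, min(127, round(x / hi * 127))) for x in vec]

def to_binary_bytes(vec: list[float]) -> bytes:
    """Binary quantization: 1 bit per dimension (the sign), packed into bytes."""
    bits = 0
    for x in vec:
        bits = (bits << 1) | (1 if x > 0 else 0)
    n_bytes = (len(vec) + 7) // 8
    return bits.to_bytes(n_bytes, "big")

vec = [0.12, -0.56, 0.99, -0.03, 0.40, 0.77, -0.81, 0.05]
q8 = to_int8(vec)               # 1 byte per dimension instead of 4 (float32)
packed = to_binary_bytes(vec)   # 8 dimensions -> 1 byte
```

int8 cuts storage roughly 4× versus float32 and binary roughly 32×, at a retrieval-quality cost that a rescoring pass over full-precision vectors can largely recover.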
Coqui XTTS v2
Coqui's open multilingual TTS. Supports voice cloning with a 6-second sample across 17 languages — the leading open alternative to ElevenLabs.
Distil-Whisper
Hugging Face's distillation of Whisper. ~6× faster than the original at small accuracy cost; English-only — pick when you don't need multilingual.
ElevenLabs Multilingual v2
ElevenLabs' flagship multilingual TTS. The benchmark for natural-sounding speech and voice cloning across 29+ languages.
Piper
A fast, local TTS engine designed for Raspberry Pi-class hardware. A common output stage in self-hosted voice assistants, typically paired with Whisper for input.
Stable Diffusion x4 Upscaler
Stability AI's diffusion-based 4× upscaler. Trades speed for quality — invents plausible high-frequency detail rather than just sharpening, which suits AI-generated images especially well.
SwinIR
Transformer-based image restoration model. Strong on text, edges, and faces — often produces sharper results than GAN-based upscalers on photographic content.
Real-ESRGAN
The de-facto open-weights image upscaler. Battle-tested across years of community use; runs on CPU or GPU, integrates with virtually every local image pipeline.
Vosk
An offline, lightweight speech recognition toolkit. Runs on phones, Raspberry Pi, and embedded devices — the right choice when Whisper is too heavy.
Topaz Gigapixel AI
Industry-standard desktop application for photo upscaling and restoration. The default tool for archival work, photo restoration, and print-sized enlargements where fidelity to the original matters.