Cyberax AI Playbook

Models

A reference catalog of the AI models we use, recommend, or compare in the playbook: what they're for, where to try them, and when we last verified each entry.

Showing 77 of 77 models.

Gemini 3.1 Flash-Lite

Google · Text

Google's low-latency Gemini 3-series workhorse for straightforward multimodal tasks at scale. It is designed for high-frequency agent routing, extraction, translation, and summarization work.

Proprietary 1M ctx
Try on: Google AI Studio · Vertex AI
Used in 18 solutions →
Released May 2026 · Verified 2026-05-10

Gemini 3.1 Flash Live Preview

Google · Realtime-Voice

Google's current low-latency audio-to-audio Live API model for real-time dialogue and voice-first applications. It replaces the earlier Gemini Live surface with the Gemini 3.1 stack.

Proprietary 131k ctx
Try on: Google AI Studio
Released May 2026 · Verified 2026-05-10

Gemini 3.1 Pro Preview

Google · Text

Google's top Gemini 3-series model for advanced reasoning, coding, and agentic workflows. It improves the Gemini 3 Pro line with better thinking, tool use, and factual consistency.

Proprietary 1M ctx
Try on: Google AI Studio · Vertex AI
Released May 2026 · Verified 2026-05-10

Gemini 3 Flash Preview

Google · Text

Google's fast Gemini 3-series model. It targets frontier-class multimodal understanding and agentic coding behavior at a lower cost tier than Pro.

Proprietary 1M ctx
Try on: Google AI Studio · Vertex AI
Released May 2026 · Verified 2026-05-10

Gemma 4 31B

Google · Text

Google's flagship open-weight Gemma 4. Natively multimodal (text + vision); supersedes Gemma 2 with significantly improved capabilities and a larger size.

Gemma Terms of Use Open weights Runs locally
Try on: Hugging Face · Google AI Studio
Released May 2026 · Verified 2026-05-10

Veo 3.1

Google · Video-Gen

Google's flagship video generation model. Adds advanced creative controls and improved prompt adherence on top of the Veo 3 native-audio foundation.

Proprietary
Try on: Gemini API · Vertex AI · Gemini app
Released May 2026 · Verified 2026-05-10

Veo 3.1 Lite

Google · Video-Gen

Google's efficient, developer-first variant of Veo 3.1. Lower cost and faster generation; same family as the main 3.1 model with reduced fidelity for tighter feedback loops.

Proprietary
Try on: Gemini API · Vertex AI
Released May 2026 · Verified 2026-05-10

DeepSeek V4 Flash

DeepSeek · Text

DeepSeek's efficient tier of the V4 generation. Faster and cheaper than V4 Pro; the practical default for high-throughput agentic workloads.

DeepSeek License Open weights Runs locally
Try on: Hugging Face · DeepSeek Platform
Released May 2026 · Verified 2026-05-10

DeepSeek V4 Pro

DeepSeek · Text

DeepSeek's flagship general-purpose MoE model. Successor to V3; competitive with closed frontier-tier models at open-weights cost.

DeepSeek License Open weights Runs locally
Try on: Hugging Face · DeepSeek Platform
Released May 2026 · Verified 2026-05-10

Qwen 3.5 122B

Alibaba · Text

Alibaba's flagship open-weight model from the Qwen 3.5 generation. MoE architecture with 122B total / 10B active parameters; native multimodal (text + vision).

Qwen License Open weights Runs locally
Try on: Hugging Face · Ollama
Released Apr 2026 · Verified 2026-05-10
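
The total-versus-active split in the Qwen 3.5 122B entry can be made concrete with a back-of-envelope sketch. It assumes that weight memory follows total parameters (all experts stay resident) while per-token compute follows active parameters. The bytes-per-parameter values and the "2 FLOPs per parameter" rule of thumb are illustrative assumptions, not published figures for this model.

```python
# Back-of-envelope sizing for a mixture-of-experts (MoE) model.
# Assumption: all experts must be resident in memory, so weight memory
# follows TOTAL parameters, while per-token compute follows ACTIVE parameters.

def weight_memory_gb(total_params: float, bytes_per_param: float) -> float:
    """Approximate memory for weights only (no KV cache, no activations)."""
    return total_params * bytes_per_param / 1e9

def flops_per_token(active_params: float) -> float:
    """Rule of thumb (assumed): about 2 FLOPs per active parameter per token."""
    return 2 * active_params

TOTAL = 122e9   # total parameters, from the catalog entry
ACTIVE = 10e9   # active parameters per token, from the catalog entry

fp16_gb = weight_memory_gb(TOTAL, 2.0)   # 16-bit weights (assumed precision)
int4_gb = weight_memory_gb(TOTAL, 0.5)   # 4-bit quantized weights (assumed)
dense_ratio = flops_per_token(TOTAL) / flops_per_token(ACTIVE)

print(f"16-bit weights: ~{fp16_gb:.0f} GB")
print(f"4-bit weights:  ~{int4_gb:.0f} GB")
print(f"Per-token compute vs. a dense 122B model: ~{dense_ratio:.1f}x less")
```

Under these assumptions, the model needs the memory of a 122B model but only about the per-token compute of a 10B one, which is why "runs locally" here depends mostly on available memory.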

GPT Image 2

OpenAI · Image-Gen

OpenAI's current image generation and editing model. It replaces the older DALL·E line with a single state-of-the-art model for image creation and edits.

Proprietary
Try on: OpenAI API
Released Apr 2026 · Verified 2026-05-10

Mistral Medium 3.5

Mistral AI · Text

Mistral's frontier-class multimodal model for agentic and coding use cases. It sits between the largest flagship tier and the lighter Small line while remaining open-weight.

Modified MIT Open weights Runs locally 256k ctx
Try on: Mistral La Plateforme · Mistral model card
Released Apr 2026 · Verified 2026-05-10

Mistral Small 4

Mistral AI · Text

Mistral's hybrid open model that unifies instruct, reasoning, and coding in a single efficient line. It is the current small generalist flagship in Mistral's open lineup.

Open weights Runs locally 256k ctx
Try on: Mistral La Plateforme · Mistral model card
Released Mar 2026 · Verified 2026-05-10

FLUX.2 [dev]

Black Forest Labs · Image-Gen

Black Forest Labs' next-generation open-weight image model. Supersedes FLUX.1 [dev] with improved quality and control; remains the open-weights choice for self-hosted image generation.

FLUX.2 Non-Commercial License Open weights Runs locally
Try on: Hugging Face · ComfyUI
Released Feb 2026 · Verified 2026-05-10

Claude 4.7 Opus

Anthropic · Text

Anthropic's flagship model — the strongest Claude variant for analysis, long-context reasoning, and complex agentic work. Default choice when you want the highest-quality Claude output.

Proprietary 200k → 1M ctx
Try on: Anthropic API · AWS Bedrock · Claude.ai
Used in 9 solutions →
Released Feb 2026 · Verified 2026-05-10

OCR 3

Mistral AI · Vision

Mistral's current OCR service for its Document AI stack. It extracts interleaved text and images from documents and replaces the older Mistral OCR line.

Mistral Premier Proprietary
Try on: Mistral La Plateforme
Released Dec 2025 · Verified 2026-05-10

Devstral 2

Mistral AI · Code

Mistral's frontier code-agents model for software engineering tasks. It is designed for tool-heavy coding workflows across whole repositories and multi-file edits.

Open weights Runs locally 256k ctx
Try on: Mistral La Plateforme · Mistral model card
Released Dec 2025 · Verified 2026-05-10

Ministral 3 14B

Mistral AI · Text

The largest model in Mistral's Ministral 3 family. It is built for local deployment with strong text and vision performance on diverse hardware.

Open weights Runs locally 256k ctx
Try on: Mistral La Plateforme · Mistral model card
Released Dec 2025 · Verified 2026-05-10

Mistral Large 3

Mistral AI · Text

Mistral AI's flagship model. Successor to Mistral Large 2 with improved multilingual coverage and reasoning. EU-jurisdiction provider.

Mistral Research License Proprietary
Try on: Mistral La Plateforme · Le Chat · Hugging Face
Released Dec 2025 · Verified 2026-05-10

Claude Haiku 4.5

Anthropic · Text

Anthropic's fast, cheap tier. The right choice for high-throughput agentic work and tasks where latency matters more than depth.

Proprietary 200k ctx
Try on: Anthropic API · AWS Bedrock
Used in 47 solutions →
Released Oct 2025 · Verified 2026-05-10

Claude 4.6 Sonnet

Anthropic · Text

Anthropic's mid-tier model — the practical default for production workloads. Balances quality and cost for most applications.

Proprietary 200k → 1M ctx
Try on: Anthropic API · AWS Bedrock · Google Vertex AI
Used in 2 solutions →
Released Sep 2025 · Verified 2026-05-10

Magistral Medium 1.2

Mistral AI · Reasoning

Mistral's frontier-class multimodal reasoning model. It is the dedicated reasoning line for deeper multi-step analysis where Mistral Small 4 or Medium 3.5 would be too shallow.

Mistral Premier Proprietary 128k ctx
Try on: Mistral La Plateforme
Released Sep 2025 · Verified 2026-05-10

Gemini 2.5 Deep Think

Google · Reasoning

Google's enhanced reasoning mode on top of Gemini 2.5 Pro. Trades latency for depth on hard math, science, and multi-step problem-solving.

Proprietary 1M ctx
Try on: Gemini app (Pro/Ultra plans) · Google AI Studio
Released Aug 2025 · Verified 2026-05-10

Codestral 25.08

Mistral AI · Code

Mistral's current code-completion model, released at the end of July 2025. It is tuned for low-latency fill-in-the-middle and high-frequency code generation tasks.

Mistral Premier Proprietary 128k ctx
Try on: Mistral La Plateforme
Released Jul 2025 · Verified 2026-05-10

Llama 4 Maverick

Meta · Text

Meta's flagship Llama 4 model: natively multimodal, with a larger MoE architecture than Scout. It is the frontier-tier entry in the Llama 4 family.

Llama 4 Community License Open weights Runs locally
Try on: Hugging Face · OpenRouter
Released May 2025 · Verified 2026-05-10

Llama 4 Scout

Meta · Text

Meta's smaller, more efficient Llama 4 variant, with a natively multimodal MoE architecture. The practical Llama 4 entry point for self-hosted multimodal applications.

Llama 4 Community License Open weights Runs locally
Try on: Hugging Face · OpenRouter
Released May 2025 · Verified 2026-05-10

o3

OpenAI · Reasoning

OpenAI's flagship reasoning model. Uses extended chain-of-thought before answering, trading latency for depth on math, science, and complex coding tasks.

Proprietary 200k ctx
Try on: OpenAI API · ChatGPT
Released Apr 2025 · Verified 2026-05-10

o4-mini

OpenAI · Reasoning

OpenAI's small reasoning model. Faster and cheaper than o3 while keeping the chain-of-thought architecture; the practical default for routine reasoning tasks.

Proprietary 200k ctx
Try on: OpenAI API · ChatGPT
Released Apr 2025 · Verified 2026-05-10

Kling 2.0

Kuaishou · Video-Gen

Kuaishou's video generation model. Strong on human motion and physical realism; popular for portrait and character-driven generation.

Proprietary
Try on: Kling · Replicate
Released Apr 2025 · Verified 2026-05-10

GPT-4.1

OpenAI · Text

OpenAI's developer-first frontier model for coding, instruction following, and long-context work. It is the API-oriented successor line to older GPT-4 variants.

Proprietary 1M ctx
Try on: OpenAI API
Released Apr 2025 · Verified 2026-05-10

GPT-4.1 mini

OpenAI · Text

OpenAI's smaller GPT-4.1 variant. It keeps the 1M-token context window while lowering cost and latency enough for high-volume agent and application workloads.

Proprietary 1M ctx
Try on: OpenAI API
Used in 47 solutions →
Released Apr 2025 · Verified 2026-05-10

Gemini 2.5 Flash

Google · Text

Google's speed-optimised tier. Cheap, fast multimodal inference, with a generous free tier on AI Studio for prototyping.

Proprietary 1M ctx
Try on: Google AI Studio · Vertex AI
Released Apr 2025 · Verified 2026-05-10

Gemini 2.5 Pro

Google · Text

Google's flagship multimodal model. Massive context window and competitive frontier-tier performance, with extended thinking on demand.

Proprietary 1M ctx
Try on: Google AI Studio · Vertex AI · Gemini app
Used in 2 solutions →
Released Mar 2025 · Verified 2026-05-10

GPT-4o mini TTS

OpenAI · Text-To-Speech

OpenAI's current text-to-speech model, built on GPT-4o mini. It replaces the older tts-1 line with better quality and a newer multimodal stack.

Proprietary
Try on: OpenAI API
Released Mar 2025 · Verified 2026-05-10

DeepSeek R1

DeepSeek · Reasoning

DeepSeek's open-weight reasoning model. Released with full weights and a permissive MIT license — the first competitive open reasoning model.

MIT Open Runs locally 128k ctx
Try on: Hugging Face · DeepSeek Platform · OpenRouter
Released Jan 2025 · Verified 2026-05-10

Pika 2

Pika Labs · Video-Gen

Pika's video generation model. It stands out with Scene Ingredients (drop characters and objects into multiple shots), making it the right pick when you need character consistency across clips.

Proprietary
Try on: Pika
Released Dec 2024 · Verified 2026-05-10

Phi-4

Microsoft · Text

Microsoft's small model trained heavily on synthetic data. Punches above its 14B weight on reasoning and math; MIT-licensed and runs locally.

MIT Open Runs locally 16k ctx
Try on: Hugging Face · Ollama
Released Dec 2024 · Verified 2026-05-10

Sora

OpenAI · Video-Gen

OpenAI's flagship video generation model. Up to 20-second clips at 1080p with strong prompt adherence and physics simulation.

Proprietary
Try on: Sora
Released Dec 2024 · Verified 2026-05-10

Llama 3.3 70B

Meta · Text

Meta's latest 70B-parameter open-weight model. Reaches frontier-tier performance for English-centric tasks while remaining self-hostable.

Llama 3 Community License Open weights Runs locally 128k ctx
Try on: Hugging Face · OpenRouter · Groq
Used in 1 solution →
Released Dec 2024 · Verified 2026-05-10

Qwen QwQ-32B

Alibaba · Reasoning

Alibaba's open-weight reasoning model. 32B parameters with a permissive Apache 2.0 license — the practical reasoning model that fits on a workstation.

Apache 2.0 Open Runs locally 33k ctx
Try on: Hugging Face · Ollama · OpenRouter
Released Nov 2024 · Verified 2026-05-10

Suno v4

Suno · Audio-Gen

Suno's flagship music generation model. Produces full songs (vocals + instrumentation) from a text prompt — the most polished consumer music AI.

Proprietary
Try on: Suno · Suno API
Released Nov 2024 · Verified 2026-05-10

Qwen2.5-Coder 32B

Alibaba · Code

Alibaba's flagship open code model. 32B parameters and Apache 2.0 — the strongest open coding model that fits on a single workstation GPU.

Apache 2.0 Open Runs locally 131k ctx
Try on: Hugging Face · Ollama · OpenRouter
Used in 1 solution →
Released Nov 2024 · Verified 2026-05-10

Stable Diffusion 3.5

Stability AI · Image-Gen

Stability AI's open image-gen family. Three sizes (Large, Large Turbo, Medium) — runs locally on consumer GPUs and supports a massive ecosystem of LoRAs and ControlNets.

Stability AI Community License Open weights Runs locally
Try on: Hugging Face · ComfyUI · Stable Assistant
Released Oct 2024 · Verified 2026-05-10

Whisper large-v3 Turbo

OpenAI · Speech-To-Text

OpenAI's distilled Whisper variant. ~8× faster than large-v3 with most of the accuracy retained — the practical default for high-throughput STT pipelines.

MIT Open Runs locally
Try on: Hugging Face · Replicate · Groq
Used in 1 solution →
Released Oct 2024 · Verified 2026-05-10

Llama 3.2 Vision

Meta · Vision

Meta's open-weight vision-language family. 11B and 90B variants — the practical open-weights vision model for self-hosted multimodal applications.

Llama 3.2 Community License Open weights Runs locally 128k ctx
Try on: Hugging Face · OpenRouter · Ollama
Used in 1 solution →
Released Sep 2024 · Verified 2026-05-10

Qwen 2.5 72B

Alibaba · Text

Alibaba's flagship open-weight model. Strong on coding, math, and Chinese-language tasks; competitive with Llama 3.3 70B on Western benchmarks.

Qwen License Open weights Runs locally 131k ctx
Try on: Hugging Face · OpenRouter · Ollama
Released Sep 2024 · Verified 2026-05-10

Qwen 2.5 7B

Alibaba · Text

Alibaba's small-tier open model. Apache-licensed, runs on consumer hardware, and remains competitive with other 7B-class models on coding and math.

Apache 2.0 Open Runs locally 131k ctx
Try on: Hugging Face · Ollama · OpenRouter
Released Sep 2024 · Verified 2026-05-10

Voyage 3

Voyage AI · Embeddings

Voyage AI's flagship embedding model. Ranks at the top of MTEB on many tasks; the embedding service Anthropic recommends for Claude RAG workloads.

Proprietary 32k ctx
Try on: Voyage API
Released Sep 2024 · Verified 2026-05-10

NVIDIA Parakeet

NVIDIA · Speech-To-Text

NVIDIA's English-focused STT model. Tops the Hugging Face Open ASR Leaderboard for English and runs very fast on NVIDIA hardware via NeMo.

CC BY 4.0 Open weights Runs locally
Try on: Hugging Face · NVIDIA NGC
Released Sep 2024 · Verified 2026-05-10

FLUX.1 [pro]

Black Forest Labs · Image-Gen

Black Forest Labs' commercial flagship. It uses the same architecture as FLUX.1 [dev], tuned for higher quality; choose this closed tier when you need commercial-use rights.

Proprietary
Try on: BFL API · Replicate · Fal AI
Released Aug 2024 · Verified 2026-05-10

Llama 3.1 8B

Meta · Text

Meta's small open-weight model. Runs on consumer hardware (a 16 GB GPU or a modern laptop) and remains a strong default for local-first AI.

Llama 3 Community License Open weights Runs locally 128k ctx
Try on: Hugging Face · Ollama · Groq
Released Jul 2024 · Verified 2026-05-10

GPT-4o mini

OpenAI · Text

OpenAI's small, cheap, fast frontier model. The default workhorse for high-volume tasks where GPT-4o would be overkill.

Proprietary 128k ctx
Try on: OpenAI API · Azure OpenAI
Used in 1 solution →
Released Jul 2024 · Verified 2026-05-10

DeepSeek-Coder V2

DeepSeek · Code

DeepSeek's open code model. MoE architecture with strong coverage across 338 programming languages; the open-weights coder of choice for high-end self-hosting.

DeepSeek License Open weights Runs locally 128k ctx
Try on: Hugging Face · DeepSeek Platform · Ollama
Released Jun 2024 · Verified 2026-05-10

Runway Gen-3

Runway · Video-Gen

Runway's video generation model. Strong creative-tooling ecosystem (motion brush, camera control, style transfer); the production tool of choice for many video creators.

Proprietary
Try on: Runway · Runway API
Released Jun 2024 · Verified 2026-05-10

Luma Dream Machine

Luma AI · Video-Gen

Luma AI's video generation model. Strong on cinematic camera moves and 3D-aware generation; the right pick when production-feel camera language matters.

Proprietary
Try on: Luma Dream Machine · Luma API
Released Jun 2024 · Verified 2026-05-10

GPT-4o

OpenAI · Text

OpenAI's flagship multimodal model — text, vision, and realtime voice in one model. The default "omni" frontier model.

Proprietary 128k ctx
Try on: OpenAI API · Azure OpenAI · ChatGPT
Used in 9 solutions →
Released May 2024 · Verified 2026-05-10

Cohere Rerank v3

Cohere · Reranking

Cohere's flagship reranker. The standard second-pass model after a vector or BM25 retrieval — bumps precision noticeably with minimal architecture changes.

Proprietary
Try on: Cohere API · AWS Bedrock
Used in 1 solution →
Released Apr 2024 · Verified 2026-05-10
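
The second-pass pattern described in the Cohere Rerank entry can be sketched as follows. This is a minimal illustration only: both scoring functions are toy stand-ins written for this example (hypothetical names), not real BM25, a real vector search, or the Cohere API.

```python
# Two-stage retrieval sketch: a cheap first pass picks candidates,
# a more expensive second pass reorders only those candidates.
# Both scoring functions are toy stand-ins, not real BM25 or a real reranker.

def first_pass_score(query: str, doc: str) -> int:
    """Cheap recall-oriented score: number of shared lowercase terms."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def rerank_score(query: str, doc: str) -> float:
    """Hypothetical stand-in for a reranker call: fraction of the
    document's terms that are query terms (rewards focused documents)."""
    q_terms = set(query.lower().split())
    d_terms = doc.lower().split()
    return sum(t in q_terms for t in d_terms) / max(len(d_terms), 1)

def retrieve_then_rerank(query, docs, k_candidates=3, k_final=2):
    # Stage 1: keep the top-k candidates by the cheap score.
    candidates = sorted(docs, key=lambda d: first_pass_score(query, d),
                        reverse=True)[:k_candidates]
    # Stage 2: reorder only those candidates with the expensive score.
    return sorted(candidates, key=lambda d: rerank_score(query, d),
                  reverse=True)[:k_final]

docs = [
    "reset your password from the account settings page",
    "password reset",
    "our office hours and holiday schedule",
    "how to reset a forgotten password step by step guide for new users",
]
top = retrieve_then_rerank("reset password", docs)
print(top)
```

The design point is that the expensive scorer only ever sees `k_candidates` documents, so the retrieval layer stays unchanged and the reranker is added as one extra step.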

Udio

Uncharted Labs · Audio-Gen

Suno's main competitor. Music generation with strong genre control and a focus on song-structure quality.

Proprietary
Try on: Udio
Released Apr 2024 · Verified 2026-05-10

Stable Audio 2.5

Stability AI · Audio-Gen

Stability AI's open-weight audio model. Generates music tracks, sound effects, and audio loops from text prompts — the open alternative to Suno/Udio.

Stability AI Community License Open weights Runs locally
Try on: Hugging Face · Stable Audio Studio
Released Apr 2024 · Verified 2026-05-10

BGE Reranker v2

BAAI · Reranking

BAAI's open reranker. Apache 2.0 weights, multiple sizes, and the open-weights default for self-hosted RAG pipelines that need a second pass.

Apache 2.0 Open Runs locally
Try on: Hugging Face · FlagEmbedding
Released Mar 2024 · Verified 2026-05-10

StarCoder2 15B

BigCode · Code

BigCode's collaborative code model. 15B parameters trained on 600+ programming languages; strong fit for IDE completion and self-hosted code search.

BigCode OpenRAIL-M Open weights Runs locally 16k ctx
Try on: Hugging Face · Ollama
Released Feb 2024 · Verified 2026-05-10

BGE-M3

BAAI · Embeddings

BAAI's multi-functional embedding model. Supports dense, sparse, and multi-vector retrieval in one model — the strongest open-weights embedding option.

MIT Open Runs locally 8k ctx
Try on: Hugging Face · FlagEmbedding
Used in 2 solutions →
Released Jan 2024 · Verified 2026-05-10

OpenAI text-embedding-3-large

OpenAI · Embeddings

OpenAI's largest embedding model. 3072 dimensions, multilingual, and the default high-quality option for RAG and semantic search.

Proprietary 8k ctx
Try on: OpenAI API · Azure OpenAI
Used in 2 solutions →
Released Jan 2024 · Verified 2026-05-10

OpenAI text-embedding-3-small

OpenAI · Embeddings

OpenAI's small embedding tier. 1536 dimensions; the cheap default for most RAG and semantic-search workloads where its quality is sufficient.

Proprietary 8k ctx
Try on: OpenAI API · Azure OpenAI
Released Jan 2024 · Verified 2026-05-10

Magnific AI

Magnific · Image-Enhance

The premium 'creative upscaler'. Invents detail at high zoom factors using diffusion priors — the right choice when you want an upscale that adds fidelity rather than just enlarging pixels.

Proprietary
Try on: Magnific · Freepik (integrated)
Released Jan 2024 · Verified 2026-05-10

Midjourney v6

Midjourney · Image-Gen

Midjourney's flagship image generator. Strong artistic quality and a distinctive aesthetic; it is tied to its Discord and web products, with no public API.

Proprietary
Try on: Midjourney
Released Dec 2023 · Verified 2026-05-10

Whisper large-v3

OpenAI · Speech-To-Text

High-accuracy multilingual speech-to-text. Best-in-class for non-English audio; the de-facto open baseline.

MIT Open Runs locally
Try on: Hugging Face · Replicate · Groq
Used in 3 solutions →
Released Nov 2023 · Verified 2026-05-10

Cohere Embed v3

Cohere · Embeddings

Cohere's flagship embedding model. Strong multilingual coverage and built-in support for compressed (int8/binary) embeddings — useful for cost-sensitive RAG.

Proprietary 512 ctx
Try on: Cohere API · AWS Bedrock
Released Nov 2023 · Verified 2026-05-10
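
To show why compressed embeddings help with cost, here is a generic sketch of int8 and binary quantization. It is an illustration under stated assumptions: the quantization scheme and the toy vectors are made up for this example and are not Cohere's actual implementation or output.

```python
# Sketch of embedding compression: float -> int8 and float -> binary.
# The quantization scheme below is a generic illustration, not Cohere's
# actual implementation; the vectors are made-up toy values.

def to_int8(vec, scale=127.0):
    """Map floats assumed to lie in [-1, 1] to integers in [-127, 127]."""
    return [max(-127, min(127, round(x * scale))) for x in vec]

def to_binary(vec):
    """Keep only the sign of each dimension: 1 if positive, else 0."""
    return [1 if x > 0 else 0 for x in vec]

def hamming_distance(a, b):
    """Number of differing bits; smaller means more similar."""
    return sum(x != y for x, y in zip(a, b))

query = [0.12, -0.40, 0.33, -0.05]
doc_a = [0.10, -0.35, 0.30, -0.10]   # points the same way as the query
doc_b = [-0.20, 0.45, -0.25, 0.15]   # points the opposite way

q_bits = to_binary(query)
dist_a = hamming_distance(q_bits, to_binary(doc_a))
dist_b = hamming_distance(q_bits, to_binary(doc_b))
print(to_int8(query), q_bits, dist_a, dist_b)
```

Compared with 4-byte float32 values, int8 storage is 4 times smaller and one-bit binary storage is 32 times smaller per dimension, at some cost in retrieval accuracy.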

Coqui XTTS v2

Coqui · Text-To-Speech

Coqui's open multilingual TTS. Supports voice cloning with a 6-second sample across 17 languages — the leading open alternative to ElevenLabs.

Coqui Public Model License Open weights Runs locally
Try on: Hugging Face · GitHub
Released Nov 2023 · Verified 2026-05-10

Distil-Whisper

Hugging Face · Speech-To-Text

Hugging Face's distillation of Whisper. ~6× faster than the original at a small accuracy cost; English-only, so pick it when you don't need multilingual support.

MIT Open Runs locally
Try on: Hugging Face
Used in 1 solution →
Released Nov 2023 · Verified 2026-05-10

ElevenLabs Multilingual v2

ElevenLabs · Text-To-Speech

ElevenLabs' flagship multilingual TTS. The benchmark for natural-sounding speech and voice cloning across 29+ languages.

Proprietary
Try on: ElevenLabs API · ElevenLabs Studio
Released Aug 2023 · Verified 2026-05-10

Piper

Rhasspy · Text-To-Speech

A fast, local TTS designed for Raspberry Pi-class hardware. Powers most self-hosted voice assistants where Whisper handles input and Piper handles output.

MIT Open Runs locally
Try on: GitHub · Hugging Face voices
Released Apr 2023 · Verified 2026-05-10

Stable Diffusion x4 Upscaler

Stability AI · Image-Enhance

Stability AI's diffusion-based 4× upscaler. Trades speed for quality — invents plausible high-frequency detail rather than just sharpening, which suits AI-generated images especially well.

CreativeML Open RAIL++-M Open weights Runs locally
Try on: Hugging Face · ComfyUI · Replicate
Released Dec 2022 · Verified 2026-05-10

SwinIR

ETH Zürich · Image-Enhance

Transformer-based image restoration model. Strong on text, edges, and faces — often produces sharper results than GAN-based upscalers on photographic content.

Apache 2.0 Open Runs locally
Try on: GitHub · Hugging Face Spaces
Released Aug 2021 · Verified 2026-05-10

Real-ESRGAN

Tencent ARC Lab · Image-Enhance

The de-facto open-weights image upscaler. Battle-tested across years of community use; runs on CPU or GPU and integrates with virtually every local image pipeline.

BSD-3-Clause Open Runs locally
Try on: GitHub · Hugging Face · Replicate
Released Jul 2021 · Verified 2026-05-10

Vosk

Alpha Cephei · Speech-To-Text

An offline, lightweight speech recognition toolkit. Runs on phones, Raspberry Pi, and embedded devices — the right choice when Whisper is too heavy.

Apache 2.0 Open Runs locally
Try on: Vosk models · GitHub
Released Jan 2020 · Verified 2026-05-10

Topaz Gigapixel AI

Topaz Labs · Image-Enhance

Industry-standard desktop application for photo upscaling and restoration. The default tool for archival work, photo restoration, and print-sized enlargements where fidelity to the original matters.

Proprietary Runs locally
Try on: Topaz Labs
Released Jun 2018 · Verified 2026-05-10