Cyberax AI Playbook
cyberax.com
Comparison · Operations & Knowledge · Local-OK

Vector databases compared

pgvector, Pinecone, Qdrant, Weaviate, Milvus — what each vector database is good for, what each costs, and the honest default for teams that haven't outgrown their existing Postgres.

At a glance (last verified May 2026)
  • Problem solved · Pick the right vector database for a RAG or similarity-search workload — and recognise when the right answer is just "use Postgres"
  • Best for · Engineers, ops leads, founders deciding where to store embeddings for a production AI feature
  • Tools · pgvector, Pinecone, Qdrant, Weaviate, Milvus, Chroma
  • Difficulty · Intermediate

A vector database stores embeddings — the way AI represents meaning as numbers — and lets you search by meaning rather than by keyword. You use one when you’ve built retrieval-augmented generation (giving the AI access to your specific documents) or any feature that asks “find me the things most similar to this.” See Embeddings explained without math for the why.
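If “search by meaning” feels abstract, the whole idea fits in a dozen lines of Python. This is a toy brute-force version of what every store in this comparison does under the hood; the 4-dimensional vectors and document names are made up for the example, and real embedders produce hundreds or thousands of dimensions:

```python
import math

# Toy "vector store": document name -> embedding.
DOCS = {
    "refund policy":  [0.9, 0.1, 0.0, 0.2],
    "shipping times": [0.1, 0.8, 0.3, 0.0],
    "returns window": [0.8, 0.2, 0.1, 0.3],
}

def cosine(a, b):
    # Cosine similarity: how aligned two vectors are, ignoring length.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def search(query_vec, k=2):
    # Rank every stored document by similarity to the query vector.
    ranked = sorted(DOCS, key=lambda d: cosine(query_vec, DOCS[d]), reverse=True)
    return ranked[:k]

# A query embedding that sits near the two refund-related documents:
print(search([0.85, 0.15, 0.05, 0.25]))  # prints ['refund policy', 'returns window']
```

A real vector database replaces the `sorted` scan with an approximate index (HNSW and friends) so the same query stays fast at millions of vectors — but the contract is identical: vectors in, nearest neighbours out.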

The next question is where to store the embeddings. The market has gone from “one obvious choice” to “six reasonable choices with different trade-offs” in eighteen months, and most of the framing online is written by the vendors themselves. This piece is the calmer version.

The honest one-liner: for most teams in 2026, pgvector on your existing Postgres is the right starting point and you won’t need anything else for the first few million vectors. Everything below is an answer to “when does that stop being true?”

Side by side

The comparison matrix

| | pgvector | Pinecone | Qdrant | Weaviate | Milvus / Zilliz |
|---|---|---|---|---|---|
| Hosting model | Postgres extension — self-hosted on your existing DB | Managed-only (cloud SaaS) | Both: managed cloud + open-source self-host | Both: managed cloud + open-source self-host | Both: managed (Zilliz Cloud) + open-source self-host |
| Entry price | $0 incremental on existing Postgres | Free Starter tier (pay-as-you-go) · Standard from ~$50/month minimum + usage | Free tier (managed) + open-source self-host | Free sandbox + paid from ~$25/month | Free tier + paid managed from ~$65/month |
| Practical scale ceiling | Tens of millions before tuning, ~100M with HNSW + good Postgres ops | Hundreds of millions to low billions in managed clusters | Hundreds of millions per node; sharded for more | Hundreds of millions per node; replication and sharding for more | Billions — designed for distributed scale-out from day one |
| Hybrid search (vector + keyword) | Yes — combine with built-in Postgres full-text search | Yes — sparse-dense hybrid built in | Yes — built-in BM25 + vector | Yes — built-in BM25 + vector | Yes — built-in scalar + vector queries |
| Metadata filtering | Excellent — full SQL WHERE clauses | Good — typed metadata filters on the API | Good — payload filters on any indexed field | Good — typed property filters | Good — boolean expressions over scalar fields |
| Multi-tenancy | Schema or row-level isolation in Postgres | Native namespaces per index | Native multi-tenancy via collection partitions | Native multi-tenancy (1.x feature) with per-tenant isolation | Partitioned collections |
| Framework ecosystem | Strong — every Python framework has a Postgres adapter | Strongest — first-class in LangChain, LlamaIndex, every framework | Strong — first-class in LangChain, LlamaIndex | Strong — Weaviate-specific integrations + LangChain | Strong — first-class in LangChain, LlamaIndex |
| Ops burden | Lowest if you already run Postgres; you own the upgrade cycle | Lowest overall — fully managed, no servers | Low if managed; medium if self-hosted (run a service) | Low if managed; medium if self-hosted | Highest if self-hosted (distributed system); low on Zilliz Cloud |
| When to pick it | You already use Postgres and the corpus is under ~50M vectors. Default choice for most teams. | You want zero infrastructure and the SLA of a managed service, and are willing to pay the monthly bill. | You want managed convenience or self-host flexibility, and value hybrid search ergonomics. | You want built-in vectorisers (the DB embeds for you) or a strong schema-driven model. | You have a billions-scale corpus and ops to run a distributed system, or budget for Zilliz Cloud. |
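One concrete note on the hybrid-search row: when a store doesn’t fuse keyword and vector results for you, a common client-side approach is Reciprocal Rank Fusion. The built-in hybrid modes in Qdrant, Weaviate, and Pinecone do this kind of merging server-side with their own fusion schemes; the rankings below are made up for the sketch:

```python
def rrf(rankings, k=60):
    # Reciprocal Rank Fusion: merge several ranked lists into one.
    # Each document scores sum(1 / (k + rank)) over the lists it appears in;
    # k=60 is the constant from the original RRF paper.
    scores = {}
    for ranked in rankings:
        for rank, doc in enumerate(ranked, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Made-up results: a BM25 keyword ranking and a vector (semantic) ranking.
keyword_hits = ["doc_a", "doc_c", "doc_d"]
vector_hits  = ["doc_b", "doc_a", "doc_c"]
print(rrf([keyword_hits, vector_hits]))  # prints ['doc_a', 'doc_c', 'doc_b', 'doc_d']
```

Documents that rank well on both lists (`doc_a`, `doc_c`) float to the top — which is the whole point of hybrid search.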

Chroma is worth a mention as the sixth option — it’s developer-friendly, popular for prototypes and notebooks, and sits at a smaller-scale price point. It’s less production-hardened than the five above and is most often used as the “let me try semantic search this afternoon” tool before being promoted to one of them. Lance / LanceDB and Turbopuffer are newer entrants worth watching, but they haven’t yet accumulated the production track record of the matrix above.

The numbers

What this costs in practice

| Item | Figure |
|---|---|
| pgvector — incremental cost on existing Postgres | $0 software cost; storage scales with your normal DB plan |
| Pinecone — Starter (free, pay-as-you-go) | $0 base; usage billed per million vector operations + storage |
| Pinecone — Standard tier (typical production) | From ~$50/month minimum + usage; lands $70–$700/month at small-to-mid scale |
| Qdrant Cloud — free tier | 1 GB cluster, sufficient for ~1M vectors at 768 dimensions |
| Qdrant Cloud — paid | From ~$25/month for a small managed cluster |
| Weaviate Cloud — entry | Free 14-day sandbox; paid from ~$25/month for serverless tier |
| Zilliz Cloud (managed Milvus) — entry | Free tier with limited resources; paid from ~$65/month |
| Self-hosted (Qdrant / Weaviate / Milvus) — base infra | $40–$300/month for a small VPS or k8s node; ops time is the real cost |
| Storage per million 1,536-dim vectors | ~6 GB raw; ~2× with HNSW index overhead |
| Per-query latency (under 10M vectors, well-indexed) | < 50 ms typical across all five options |
| Per-query latency at billion-scale (Milvus / Pinecone) | 20–100 ms p99 with proper sharding |
| Cost of migrating between vector DBs | Re-embedding the whole corpus + porting metadata schema; usually 1–3 engineer-days |

The pattern in those numbers is consistent: the vector database is rarely the line item that matters at small-to-mid scale. A team running pgvector on existing Postgres pays nothing extra; a team on Pinecone’s Standard tier pays roughly the cost of one developer-hour per month. The migration cost — re-embedding, re-ingesting, porting filters — is what gates the decision once you’ve shipped.
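The storage figure generalises to a one-line back-of-envelope formula. A sketch, assuming float32 vectors and the ~2× HNSW overhead quoted in the table (treat the multiplier as an estimate, not a spec):

```python
def storage_gb(n_vectors, dims, bytes_per_float=4, index_overhead=2.0):
    # Raw float32 storage, times a rough multiplier for HNSW index overhead.
    raw_bytes = n_vectors * dims * bytes_per_float
    return raw_bytes * index_overhead / 1e9

# One million 1,536-dimension vectors (the table's example row):
print(round(storage_gb(1_000_000, 1536), 1))  # prints 12.3
```

That's ~6 GB raw and ~12 GB with the index — small enough that, for most corpora, storage cost is a rounding error next to engineering time.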

The decision rule

When to leave pgvector

Start on pgvector. Move to a dedicated vector database when one of the following becomes true:

  1. Scale ceiling. You’ve crossed a hundred million vectors, or you’re spending real money on Postgres instances tuned for vector workloads. Dedicated stores handle this without you having to tune.
  2. Latency floor. You need sub-50ms p99 latency at high concurrency, and pgvector’s HNSW index is no longer keeping up under your specific query mix. (Benchmark before assuming — pgvector with HNSW is faster than most teams expect.)
  3. Feature gap. You need something pgvector doesn’t do well — strong built-in re-ranking, hybrid search with first-class BM25, native multi-tenancy with quota enforcement, or multi-modal indexes with image and text in the same store. Each dedicated DB has one or two of these as a differentiator.
  4. Ops preference. You don’t want to run Postgres for this workload, period. A managed vector DB is the entire answer; pay the monthly bill and stop thinking about it.

If none of those are true, you don’t have a vector database problem — you have a which-Postgres-extension-to-enable problem, and you’ve already solved it.
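The four triggers above read naturally as a checklist. A sketch that encodes them — the function name, parameters, and 100M threshold are this piece's rules of thumb made explicit, not hard limits, so benchmark your own workload before acting on the output:

```python
def should_leave_pgvector(n_vectors, p99_ms, latency_budget_ms,
                          needs_feature_pgvector_lacks, wants_managed_only):
    # Returns the list of triggers that fired; an empty list means stay put.
    reasons = []
    if n_vectors > 100_000_000:                 # 1. scale ceiling (rule of thumb)
        reasons.append("scale ceiling")
    if p99_ms > latency_budget_ms:              # 2. latency floor (measured, not assumed)
        reasons.append("latency floor")
    if needs_feature_pgvector_lacks:            # 3. feature gap
        reasons.append("feature gap")
    if wants_managed_only:                      # 4. ops preference
        reasons.append("ops preference")
    return reasons

# A typical mid-size workload: 5M vectors, 30ms p99 against a 50ms budget.
print(should_leave_pgvector(5_000_000, 30, 50, False, False))  # prints []
```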

Where each fails

Failure modes that bite in production

  • pgvector — query planner fights. Postgres’s query planner doesn’t always pick the right plan for vector-plus-filter queries. The combination of a tight WHERE filter and an ANN index can result in the planner ignoring the index entirely. Workaround: explicit hints, materialised views for hot queries, or recall-vs-performance tuning of the HNSW parameters. None of these is hard, but they’re the kind of thing you don’t discover until you have real traffic.
  • Pinecone — vendor lock-in is real. Pinecone has no self-hosted option. If pricing changes, terms shift, or your compliance posture changes, you’re re-embedding into another store. Build with a vector-store-agnostic framework (LangChain, LlamaIndex) and you’ll cut the migration cost in half — but you’ll still pay it.
  • Qdrant / Weaviate / Milvus self-hosted — the ops you didn’t budget for. These are distributed databases. Backups, replication, version upgrades, disk pressure under HNSW rebuilds, the occasional weird memory profile during high-cardinality filtering. If you don’t have someone whose job includes running databases, the managed tier is cheaper than you think.
  • All of them — embedding-model lock-in. Every vector in your store was produced by a specific embedder. Change embedders and every vector is meaningless on the new map. The vector DB doesn’t care which model you used; the consequence is that “switch from text-embedding-3-small to BGE-M3” means re-embedding your entire corpus, regardless of which database you picked. (See Embeddings explained without math for the framing.)
  • All of them — recall ceiling without a reranker. A vector DB returns the closest embeddings, not necessarily the most relevant documents. The closest-versus-most-relevant gap is what rerankers exist to close. No vector DB will save you from a missing reranker on a serious-quality RAG system.
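That last failure mode is worth a sketch: the retrieve-then-rerank shape that closes the closest-versus-most-relevant gap. The toy store and the trivial scoring function below are made up for illustration — in production the `score_fn` would be a real cross-encoder or reranker API:

```python
def retrieve(query_vec, store, k=3):
    # Stage 1: cheap nearest-neighbour retrieval (here: a dot-product scan
    # over a dict; in production, the vector DB's ANN index).
    def score(doc):
        return sum(a * b for a, b in zip(query_vec, store[doc]))
    return sorted(store, key=score, reverse=True)[:k]

def rerank(query, candidates, score_fn):
    # Stage 2: re-score the short candidate list with a stronger model.
    return sorted(candidates, key=lambda d: score_fn(query, d), reverse=True)

store = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
hits = retrieve([1.0, 0.0], store, k=2)        # vector DB's answer: ['a', 'b']
# Toy "reranker": prefers documents whose id appears in the query text.
best = rerank("pick b", hits, lambda q, d: 1.0 if d in q else 0.0)
print(best)  # prints ['b', 'a'] — the reranker overruled the vector ordering
```

The point of the sketch: the vector DB's job ends at `hits`. Whether `best` is any good is the reranker's job, no matter which of the five stores produced the candidates.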
Common questions

FAQ

Can I just use Postgres? Do I need a vector database at all?

For the vast majority of teams in 2026, the answer is genuinely yes — just use Postgres with the pgvector extension. It handles tens of millions of vectors comfortably, supports full SQL filtering, integrates with every framework, and costs nothing extra if you already run Postgres. The dedicated vector databases earn their keep at large scale, for specific feature needs, or when you want fully managed convenience — conditions that most pilots, and plenty of production systems, never meet. Default to pgvector; promote when you have a measured reason to.

How do I know when I've outgrown pgvector?

Three signals. (1) Query latency is climbing past your budget at concurrency you actually have — measure with the real query mix, not synthetic benchmarks. (2) Postgres maintenance windows are dominated by vector-index work (HNSW rebuilds during large ingest). (3) You're scaling Postgres specifically for the vector workload — running a separate Postgres instance just for embeddings, paying for instance sizes you wouldn't otherwise need. If any of those is true, evaluate dedicated stores against your actual workload. If none of them is, stay where you are.

What about cloud-native vector stores like Turbopuffer, LanceDB, or vector features in Snowflake / BigQuery?

Three categories worth knowing. Turbopuffer and similar object-storage-backed stores (vectors live in S3-style storage with smart caching) trade a few extra milliseconds of cold latency for dramatically lower cost at scale — interesting for very large, mostly-cold corpora. LanceDB is file-format-based, sits well in data-engineering workflows, and is gaining traction for analytics-adjacent use cases. Snowflake, BigQuery, and other warehouses are adding vector functions; they're fine for joining vectors with your analytical data but currently lag dedicated stores on latency and ergonomics. All three are worth tracking but none has dethroned the matrix above yet.

Does the embedding model lock me to a specific database?

No — vector databases are model-agnostic. Every store accepts a list of numbers (the embedding) and stores it; what produced those numbers is your concern, not theirs. The lock-in goes the other way: your embeddings are tied to your embedder, and changing embedders forces a full re-embedding regardless of which database you used. Pick the embedder by accuracy on your task; pick the database by scale, ops fit, and ecosystem.

What about Elasticsearch / OpenSearch with vector support?

Both have added vector search (dense_vector field, k-NN queries) and they work — particularly well if you're already running Elasticsearch for log search or product search. The trade-off: their vector implementations are competitive but not best-in-class on latency or recall versus dedicated vector stores, and they're operationally heavier than pgvector for teams who don't already run an Elastic cluster. If you have an existing Elasticsearch install and modest vector needs, adding k-NN to it is often the right answer. If you don't have one, don't start one just for vectors.

Change history (1 entry)
  • 2026-05-11 Initial publication.