A vector database stores embeddings — the way AI represents meaning as numbers — and lets you search by meaning rather than by keyword. You use one when you’ve built retrieval-augmented generation (giving the AI access to your specific documents) or any feature that asks “find me the things most similar to this.” See Embeddings explained without math for the why.
The next question is where to store the embeddings. The market has gone from “one obvious choice” to “six reasonable choices with different trade-offs” in eighteen months, and most of the framing online is written by the vendors themselves. This piece is the calmer version.
The honest one-liner: for most teams in 2026, pgvector on your existing Postgres is the right starting point and you won’t need anything else for the first few million vectors. Everything below is an answer to “when does that stop being true?”
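To make “starting point” concrete, here is roughly what the pgvector path looks like end to end. A minimal sketch, assuming psycopg 3, a local Postgres with the extension available, and 1536-dimension embeddings; the table name, connection string, and placeholder vectors are all stand-ins for your own:

```python
import psycopg  # psycopg 3

def to_pgvector(vec: list[float]) -> str:
    # pgvector accepts a '[x,y,z]' text literal for vector columns
    return "[" + ",".join(str(x) for x in vec) + "]"

query_embedding = [0.01] * 1536  # stand-in for a real embedder call

with psycopg.connect("dbname=app") as conn, conn.cursor() as cur:
    cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
    cur.execute("""
        CREATE TABLE IF NOT EXISTS docs (
            id bigserial PRIMARY KEY,
            body text NOT NULL,
            embedding vector(1536) NOT NULL  -- dims must match your embedder
        );
    """)
    cur.execute(
        "INSERT INTO docs (body, embedding) VALUES (%s, %s)",
        ("hello world", to_pgvector([0.01] * 1536)),
    )
    # HNSW index for fast approximate nearest-neighbour search
    cur.execute(
        "CREATE INDEX IF NOT EXISTS docs_hnsw "
        "ON docs USING hnsw (embedding vector_cosine_ops);"
    )
    # <=> is cosine distance: smallest distance first, i.e. most similar first
    cur.execute(
        "SELECT id, body FROM docs ORDER BY embedding <=> %s::vector LIMIT 5",
        (to_pgvector(query_embedding),),
    )
    print(cur.fetchall())
```

That is the whole stack: one extension, one column type, one index.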
The comparison matrix
| | pgvector | Pinecone | Qdrant | Weaviate | Milvus / Zilliz |
|---|---|---|---|---|---|
| Hosting model | Postgres extension — self-hosted on your existing DB | Managed-only (cloud SaaS) | Both: managed cloud + open-source self-host | Both: managed cloud + open-source self-host | Both: managed (Zilliz Cloud) + open-source self-host |
| Entry price | $0 incremental on existing Postgres | Free Starter tier (pay-as-you-go) · Standard from ~$50/month minimum + usage | Free tier (managed) + open-source self-host | Free sandbox + paid from ~$25/month | Free tier + paid managed from ~$65/month |
| Practical scale ceiling | Tens of millions before tuning, ~100M with HNSW + good Postgres ops | Hundreds of millions to low billions in managed clusters | Hundreds of millions per node; sharded for more | Hundreds of millions per node; replication and sharding for more | Billions — designed for distributed scale-out from day one |
| Hybrid search (vector + keyword) | Yes — combine with built-in Postgres full-text search (sketch below the matrix) | Yes — sparse-dense hybrid built in | Yes — built-in BM25 + vector | Yes — built-in BM25 + vector | Yes — built-in scalar + vector queries |
| Metadata filtering | Excellent — full SQL WHERE clauses | Good — typed metadata filters on the API | Good — payload filters on any indexed field | Good — typed property filters | Good — boolean expressions over scalar fields |
| Multi-tenancy | Schema or row-level isolation in Postgres | Native namespaces per index | Native multi-tenancy via collection partitions | Native multi-tenancy (1.x feature) with per-tenant isolation | Partitioned collections |
| Framework ecosystem | Strong — every Python framework has a Postgres adapter | Strongest — first-class in LangChain, LlamaIndex, every framework | Strong — first-class in LangChain, LlamaIndex | Strong — Weaviate-specific integrations + LangChain | Strong — first-class in LangChain, LlamaIndex |
| Ops burden | Lowest if you already run Postgres; you own the upgrade cycle | Lowest overall — fully managed, no servers | Low if managed; medium if self-hosted (run a service) | Low if managed; medium if self-hosted | Highest if self-hosted (distributed system); low on Zilliz Cloud |
| When to pick it | You already use Postgres and the corpus is under ~50M vectors. Default choice for most teams. | You want zero infrastructure and the SLA of a managed service. Willing to pay per-month-per-namespace. | You want managed convenience or self-host flexibility, and value hybrid search ergonomics. | You want built-in vectorisers (the DB embeds for you) or a strong schema-driven model. | You have a billions-scale corpus and ops to run a distributed system, or budget for Zilliz Cloud. |
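To make the hybrid-search row concrete for pgvector: a common pattern is reciprocal rank fusion (RRF), which takes the top candidates from full-text search and from vector search and merges them by rank. A sketch against the docs table from the earlier example; the candidate limits and the RRF constant of 60 are conventional defaults, not tuned values:

```python
# Hybrid search on pgvector: fuse a Postgres full-text ranking with a
# vector ranking via reciprocal rank fusion (score = sum of 1/(60+rank)).
HYBRID_SQL = """
WITH fts AS (
    SELECT id, ROW_NUMBER() OVER (ORDER BY score DESC) AS rank
    FROM (
        SELECT id, ts_rank(to_tsvector('english', body),
                           plainto_tsquery('english', %(q)s)) AS score
        FROM docs
        WHERE to_tsvector('english', body) @@ plainto_tsquery('english', %(q)s)
        ORDER BY score DESC
        LIMIT 50
    ) t
), vec AS (
    SELECT id, ROW_NUMBER() OVER (ORDER BY dist) AS rank
    FROM (
        SELECT id, embedding <=> %(e)s::vector AS dist
        FROM docs
        ORDER BY dist
        LIMIT 50
    ) t
)
SELECT id,
       COALESCE(1.0 / (60 + fts.rank), 0) +
       COALESCE(1.0 / (60 + vec.rank), 0) AS rrf_score
FROM fts FULL OUTER JOIN vec USING (id)
ORDER BY rrf_score DESC
LIMIT 10;
"""
# cur.execute(HYBRID_SQL, {"q": "quarterly revenue", "e": to_pgvector(query_embedding)})
```

The dedicated stores ship this as a single API call; in Postgres you own the fusion logic, which is more code but also more control.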
Chroma is worth a mention as the sixth option — it’s developer-friendly, popular for prototypes and notebooks, and priced for smaller-scale work. It’s less production-hardened than the five above and is most often the “let me try semantic search this afternoon” tool you use before promoting to one of them. Lance / LanceDB and Turbopuffer are newer entrants worth watching but haven’t yet accumulated the production track record of the matrix above.
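That afternoon claim is close to literal. A sketch, assuming the chromadb Python package with its default built-in embedder; the collection name and documents are placeholders:

```python
import chromadb

client = chromadb.Client()  # in-memory; PersistentClient(path=...) keeps data

collection = client.create_collection("scratch")

# Chroma embeds the documents itself with a default local model
collection.add(
    ids=["1", "2"],
    documents=["The cat sat on the mat.", "Quarterly revenue grew 4%."],
)

results = collection.query(query_texts=["finance report"], n_results=1)
print(results["documents"])  # -> the revenue sentence, found by meaning
```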
What this costs in practice
The pattern in the entry prices above is consistent: the vector database is rarely the line item that matters at small-to-mid scale. A team running pgvector on existing Postgres pays nothing extra; a team on Pinecone’s Standard tier pays roughly the cost of one developer-hour per month. The migration cost — re-embedding, re-ingesting, porting filters — is what gates the decision once you’ve shipped.
When to leave pgvector
Start on pgvector. Move to a dedicated vector database when one of the following becomes true:
- Scale ceiling. You’ve crossed a hundred million vectors, or you’re spending real money on Postgres instances tuned for vector workloads. Dedicated stores handle this without you having to tune.
- Latency floor. You need sub-50ms p99 latency at high concurrency, and pgvector’s HNSW index is no longer keeping up under your specific query mix. (Benchmark before assuming — pgvector with HNSW is faster than most teams expect.)
- Feature gap. You need something pgvector doesn’t have well — strong built-in re-ranking, hybrid search with first-class BM25, native multi-tenancy with quota enforcement, multi-modal indexes with image-and-text in the same store. Each dedicated DB has one or two of these as a differentiator.
- Ops preference. You don’t want to run Postgres for this workload, period. A managed vector DB is the entire answer; pay the monthly bill and stop thinking about it.
If none of those are true, you don’t have a vector database problem — you have a which-Postgres-extension-to-enable problem, and you’ve already solved it.
Failure modes that bite in production
- pgvector — query planner fights. Postgres’s query planner doesn’t always pick the right plan for vector-plus-filter queries. The combination of a tight WHERE filter and an ANN index can result in the planner ignoring the index entirely. Workarounds: explicit hints, materialised views for hot queries, or recall-vs-performance tuning of the HNSW parameters (a sketch follows this list). None of these is hard, but they’re the kind of thing you don’t discover until you have real traffic.
- Pinecone — vendor lock-in is real. Pinecone has no self-hosted option. If pricing changes, terms shift, or your compliance posture changes, you’re re-embedding into another store. Build with a vector-store-agnostic framework (LangChain, LlamaIndex) or a thin wrapper of your own (second sketch after this list) and you’ll cut the migration cost in half — but you’ll still pay it.
- Qdrant / Weaviate / Milvus self-hosted — the ops you didn’t budget for. These are distributed databases. Backups, replication, version upgrades, disk pressure under HNSW rebuilds, the occasional weird memory profile during high-cardinality filtering. If you don’t have someone whose job includes running databases, the managed tier is cheaper than you think.
- All of them — embedding-model lock-in. Every vector in your store was produced by a specific embedder. Change embedders and every vector is meaningless on the new map. The vector DB doesn’t care which model you used; the consequence is that “switch from text-embedding-3-small to BGE-M3” means re-embedding your entire corpus, regardless of which database you picked. (See Embeddings explained without math for the framing.)
- All of them — recall ceiling without a reranker. A vector DB returns the closest embeddings, not necessarily the most relevant documents. The closest-versus-most-relevant gap is what rerankers exist to close. No vector DB will save you from a missing reranker on a serious-quality RAG system.
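On the pgvector planner fights specifically, the usual first moves are to read the plan and widen the HNSW search. A sketch, assuming the docs table from earlier plus a hypothetical tenant_id column; hnsw.ef_search is a real pgvector setting (default 40), but 100 is a starting point, not a tuned number:

```python
import psycopg

# Full-length stand-in query vector, inlined below because EXPLAIN is
# easiest to run with literals; use bound parameters in application code.
placeholder = "[" + ",".join(["0.01"] * 1536) + "]"

with psycopg.connect("dbname=app") as conn, conn.cursor() as cur:
    # Widen the HNSW candidate list: better recall, slower queries
    cur.execute("SET hnsw.ef_search = 100;")
    # If the plan shows a sequential scan rather than the HNSW index, the
    # planner judged the filter selective enough to skip ANN, often wrongly.
    cur.execute(f"""
        EXPLAIN ANALYZE
        SELECT id, body FROM docs
        WHERE tenant_id = 42
        ORDER BY embedding <=> '{placeholder}'::vector
        LIMIT 10;
    """)
    for (line,) in cur.fetchall():
        print(line)
```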
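And on the lock-in point, the cheapest insurance beyond a framework is a thin interface of your own between application code and the store. A hypothetical sketch; the Protocol and the adapter are illustrative, not any framework’s actual API:

```python
from typing import Protocol

class VectorStore(Protocol):
    def search(self, vector: list[float], k: int) -> list[tuple[str, float]]:
        """Return (doc_id, distance) pairs, best match first."""
        ...

class PgVectorStore:
    """Adapter over the docs table from the earlier sketches."""

    def __init__(self, conn):  # a psycopg connection
        self.conn = conn

    def search(self, vector, k):
        lit = "[" + ",".join(str(x) for x in vector) + "]"
        with self.conn.cursor() as cur:
            cur.execute(
                "SELECT id::text, embedding <=> %s::vector AS dist "
                "FROM docs ORDER BY dist LIMIT %s",
                (lit, k),
            )
            return cur.fetchall()

# Application code depends only on the Protocol; swapping stores means
# writing one new adapter, not touching every call site. (Re-embedding
# into the new store is, sadly, still on you.)
def retrieve(store: VectorStore, query_vec: list[float]):
    return store.search(query_vec, k=10)
```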
If you're picking now
- Start by reading Embeddings explained without math if you haven’t — pick the embedder before the database, because the embedder dictates the vector dimensions, and dimensions drive storage cost and the database’s practical scale ceiling.
- For the practical “build a private search over our docs” workflow, Build a private knowledge base your team can search walks through the end-to-end pattern using LlamaIndex.
- RAG explained without acronyms is the framing piece that explains why most of these databases exist in the first place.
FAQ
Can I just use Postgres? Do I need a vector database at all?
For the vast majority of teams in 2026, the answer is genuinely yes — just use Postgres with the pgvector extension. It handles tens of millions of vectors comfortably, supports full SQL filtering, integrates with every framework, and costs nothing extra if you already run Postgres. The dedicated vector databases earn their keep at serious scale, for specific feature needs, or when you want fully-managed convenience — and none of those applies to most pilots or even to many production systems. Default to pgvector; promote when you have a measured reason to.
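What “full SQL filtering” buys you in practice is vector similarity and ordinary relational predicates in one statement, with no separate metadata-filter API to learn. A sketch against the docs table from earlier, with hypothetical team_id and updated_at columns:

```python
# Similarity search constrained by plain SQL predicates: the feature that
# dedicated stores reimplement as "metadata filters".
FILTERED_SQL = """
SELECT id, body
FROM docs
WHERE team_id = %(team)s
  AND updated_at > now() - interval '90 days'
ORDER BY embedding <=> %(e)s::vector
LIMIT 10;
"""
# cur.execute(FILTERED_SQL, {"team": 7, "e": to_pgvector(query_embedding)})
```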
How do I know when I've outgrown pgvector?
Three signals. (1) Query latency is climbing past your budget at concurrency you actually have — measure with the real query mix, not synthetic benchmarks. (2) Postgres maintenance windows are dominated by vector-index work (HNSW rebuilds during large ingest). (3) You're scaling Postgres specifically for the vector workload — running a separate Postgres instance just for embeddings, paying for instance sizes you wouldn't otherwise need. If any of those is true, evaluate dedicated stores against your actual workload. If none of them is, stay where you are.
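For signal (1), the measurement is short enough to inline. A minimal sketch; the stand-in embeddings should be replaced with a sample of real production query vectors, and note that a single client only measures the latency floor, so run it at your real concurrency too:

```python
import time
import psycopg

# Stand-ins: replace with a sample of real production query embeddings
real_queries = [[0.01] * 1536 for _ in range(200)]

def lit(v: list[float]) -> str:
    return "[" + ",".join(str(x) for x in v) + "]"

latencies_ms = []
with psycopg.connect("dbname=app") as conn, conn.cursor() as cur:
    for q in real_queries:
        t0 = time.perf_counter()
        cur.execute(
            "SELECT id FROM docs ORDER BY embedding <=> %s::vector LIMIT 10",
            (lit(q),),
        )
        cur.fetchall()
        latencies_ms.append((time.perf_counter() - t0) * 1000)

latencies_ms.sort()
p99 = latencies_ms[int(len(latencies_ms) * 0.99) - 1]
print(f"p99: {p99:.1f} ms over {len(latencies_ms)} queries")
```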
What about cloud-native vector stores like Turbopuffer, LanceDB, or vector features in Snowflake / BigQuery?
Three categories worth knowing. Turbopuffer and similar object-storage-backed stores (vectors live in S3-style storage with smart caching) trade a few extra milliseconds of cold latency for dramatically lower cost at scale — interesting for very large, mostly-cold corpora. LanceDB is file-format-based, sits well in data-engineering workflows, and is gaining traction for analytics-adjacent use cases. Snowflake, BigQuery, and other warehouses are adding vector functions; they're fine for joining vectors with your analytical data but currently lag dedicated stores on latency and ergonomics. All three are worth tracking but none has dethroned the matrix above yet.
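If you want a feel for the file-format-based approach, LanceDB’s happy path is short: it writes to a local directory, with no server process. A sketch based on the library’s documented basics; verify against current docs, since these APIs move quickly:

```python
import lancedb

db = lancedb.connect("./lance-demo")  # just a directory on disk

# Toy 2-dim vectors so the example is readable; real ones come from an embedder
table = db.create_table(
    "docs",
    data=[
        {"vector": [0.9, 0.1], "text": "cats"},
        {"vector": [0.1, 0.9], "text": "finance"},
    ],
)

print(table.search([0.85, 0.15]).limit(1).to_list())  # -> the "cats" row
```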
Does the embedding model lock me to a specific database?
No — vector databases are model-agnostic. Every store accepts a list of numbers (the embedding) and stores it; what produced those numbers is your concern, not theirs. The lock-in goes the other way: your embeddings are tied to your embedder, and changing embedders forces a full re-embedding regardless of which database you used. Pick the embedder by accuracy on your task; pick the database by scale, ops fit, and ecosystem.
What about Elasticsearch / OpenSearch with vector support?
Both have added vector search (dense_vector field, k-NN queries) and they work — particularly well if you're already running Elasticsearch for log search or product search. The trade-off: their vector implementations are competitive but not best-in-class on latency or recall versus dedicated vector stores, and they're operationally heavier than pgvector for teams who don't already run an Elastic cluster. If you have an existing Elasticsearch install and modest vector needs, adding k-NN to it is often the right answer. If you don't have one, don't start one just for vectors.
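If you do already have that cluster, the route looks like this. A sketch against the 8.x Python client; the index name, dimensions, and localhost endpoint are placeholders:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# dense_vector mapping; dims must match your embedder
es.indices.create(
    index="docs",
    mappings={
        "properties": {
            "body": {"type": "text"},
            "embedding": {
                "type": "dense_vector",
                "dims": 1536,
                "index": True,
                "similarity": "cosine",
            },
        }
    },
)

# Approximate kNN query: the piece that reuses your existing cluster
resp = es.search(
    index="docs",
    knn={
        "field": "embedding",
        "query_vector": [0.01] * 1536,
        "k": 5,
        "num_candidates": 50,
    },
)
print([hit["_source"].get("body") for hit in resp["hits"]["hits"]])
```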