A vector database stores embeddings — the way AI represents meaning as numbers — and lets you search by meaning rather than by keyword. You use one when you’ve built retrieval-augmented generation (giving the AI access to your specific documents) or any feature that asks “find me the things most similar to this.” See Embeddings explained without math for the why.
The next question is where to store the embeddings. The market has gone from “one obvious choice” to “six reasonable choices with different trade-offs” in eighteen months, and most of the framing online is written by the vendors themselves. This piece is the calmer version.
The honest one-liner: for most teams in 2026, pgvector on your existing Postgres is the right starting point and you won’t need anything else for the first few million vectors. Everything below is an answer to “when does that stop being true?”
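To make “starting point” concrete, here is roughly what the pgvector path looks like end to end. A minimal sketch, assuming psycopg 3, a local Postgres with the extension available, and 1536-dimension embeddings; the table name, connection string, and placeholder vectors are all stand-ins for your own:

```python
import psycopg  # psycopg 3

def to_pgvector(vec: list[float]) -> str:
    # pgvector accepts a '[x,y,z]' text literal for vector columns
    return "[" + ",".join(str(x) for x in vec) + "]"

query_embedding = [0.01] * 1536  # stand-in for a real embedder call

with psycopg.connect("dbname=app") as conn, conn.cursor() as cur:
    cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
    cur.execute("""
        CREATE TABLE IF NOT EXISTS docs (
            id bigserial PRIMARY KEY,
            body text NOT NULL,
            embedding vector(1536) NOT NULL  -- dims must match your embedder
        );
    """)
    cur.execute(
        "INSERT INTO docs (body, embedding) VALUES (%s, %s)",
        ("hello world", to_pgvector([0.01] * 1536)),
    )
    # HNSW index for fast approximate nearest-neighbour search
    cur.execute(
        "CREATE INDEX IF NOT EXISTS docs_hnsw "
        "ON docs USING hnsw (embedding vector_cosine_ops);"
    )
    # <=> is cosine distance: smallest distance first, i.e. most similar first
    cur.execute(
        "SELECT id, body FROM docs ORDER BY embedding <=> %s::vector LIMIT 5",
        (to_pgvector(query_embedding),),
    )
    print(cur.fetchall())
```

That is the whole stack: one extension, one column type, one index.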
The comparison matrix
| | pgvector | Pinecone | Qdrant | Weaviate | Milvus / Zilliz |
|---|---|---|---|---|---|
| Hosting model | Postgres extension — self-hosted on your existing DB | Managed-only (cloud SaaS) | Both: managed cloud + open-source self-host | Both: managed cloud + open-source self-host | Both: managed (Zilliz Cloud) + open-source self-host |
| Entry price | $0 incremental on existing Postgres | Free Starter tier (pay-as-you-go) · Standard from ~$50/month minimum + usage | Free tier (managed) + open-source self-host | Free sandbox + paid from ~$25/month | Free tier + paid managed from ~$65/month |
| Practical scale ceiling | Tens of millions before tuning, ~100M with HNSW + good Postgres ops | Hundreds of millions to low billions in managed clusters | Hundreds of millions per node; sharded for more | Hundreds of millions per node; replication and sharding for more | Billions — designed for distributed scale-out from day one |
| Hybrid search (vector + keyword) | Yes — combine with built-in Postgres full-text search (sketch below the matrix) | Yes — sparse-dense hybrid built in | Yes — built-in BM25 + vector | Yes — built-in BM25 + vector | Yes — built-in scalar + vector queries |
| Metadata filtering | Excellent — full SQL WHERE clauses | Good — typed metadata filters on the API | Good — payload filters on any indexed field | Good — typed property filters | Good — boolean expressions over scalar fields |
| Multi-tenancy | Schema or row-level isolation in Postgres | Native namespaces per index | Native multi-tenancy via collection partitions | Native multi-tenancy (1.x feature) with per-tenant isolation | Partitioned collections |
| Framework ecosystem | Strong — every Python framework has a Postgres adapter | Strongest — first-class in LangChain, LlamaIndex, every framework | Strong — first-class in LangChain, LlamaIndex | Strong — Weaviate-specific integrations + LangChain | Strong — first-class in LangChain, LlamaIndex |
| Ops burden | Lowest if you already run Postgres; you own the upgrade cycle | Lowest overall — fully managed, no servers | Low if managed; medium if self-hosted (run a service) | Low if managed; medium if self-hosted | Highest if self-hosted (distributed system); low on Zilliz Cloud |
| When to pick it | You already use Postgres and the corpus is under ~50M vectors. Default choice for most teams. | You want zero infrastructure and the SLA of a managed service. Willing to pay per-month-per-namespace. | You want managed convenience or self-host flexibility, and value hybrid search ergonomics. | You want built-in vectorisers (the DB embeds for you) or a strong schema-driven model. | You have a billions-scale corpus and ops to run a distributed system, or budget for Zilliz Cloud. |
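To make the hybrid-search row concrete for pgvector: a common pattern is reciprocal rank fusion (RRF), which takes the top candidates from full-text search and from vector search and merges them by rank. A sketch against the docs table from the earlier example; the candidate limits and the RRF constant of 60 are conventional defaults, not tuned values:

```python
# Hybrid search on pgvector: fuse a Postgres full-text ranking with a
# vector ranking via reciprocal rank fusion (score = sum of 1/(60+rank)).
HYBRID_SQL = """
WITH fts AS (
    SELECT id, ROW_NUMBER() OVER (ORDER BY score DESC) AS rank
    FROM (
        SELECT id, ts_rank(to_tsvector('english', body),
                           plainto_tsquery('english', %(q)s)) AS score
        FROM docs
        WHERE to_tsvector('english', body) @@ plainto_tsquery('english', %(q)s)
        ORDER BY score DESC
        LIMIT 50
    ) t
), vec AS (
    SELECT id, ROW_NUMBER() OVER (ORDER BY dist) AS rank
    FROM (
        SELECT id, embedding <=> %(e)s::vector AS dist
        FROM docs
        ORDER BY dist
        LIMIT 50
    ) t
)
SELECT id,
       COALESCE(1.0 / (60 + fts.rank), 0) +
       COALESCE(1.0 / (60 + vec.rank), 0) AS rrf_score
FROM fts FULL OUTER JOIN vec USING (id)
ORDER BY rrf_score DESC
LIMIT 10;
"""
# cur.execute(HYBRID_SQL, {"q": "quarterly revenue", "e": to_pgvector(query_embedding)})
```

The dedicated stores ship this as a single API call; in Postgres you own the fusion logic, which is more code but also more control.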
Chroma is worth a mention as the sixth option — it’s developer-friendly, popular for prototypes and notebooks, and priced for smaller-scale work. It’s less production-hardened than the five above and is most often the “let me try semantic search this afternoon” tool you use before promoting to one of them. Lance / LanceDB and Turbopuffer are newer entrants worth watching but haven’t yet accumulated the production track record of the matrix above.
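That afternoon claim is close to literal. A sketch, assuming the chromadb Python package with its default built-in embedder; the collection name and documents are placeholders:

```python
import chromadb

client = chromadb.Client()  # in-memory; PersistentClient(path=...) keeps data

collection = client.create_collection("scratch")

# Chroma embeds the documents itself with a default local model
collection.add(
    ids=["1", "2"],
    documents=["The cat sat on the mat.", "Quarterly revenue grew 4%."],
)

results = collection.query(query_texts=["finance report"], n_results=1)
print(results["documents"])  # -> the revenue sentence, found by meaning
```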
What this costs in practice
The pattern in the entry prices above is consistent: the vector database is rarely the line item that matters at small-to-mid scale. A team running pgvector on existing Postgres pays nothing extra; a team on Pinecone’s Standard tier pays roughly the cost of one developer-hour per month. The migration cost — re-embedding, re-ingesting, porting filters — is what gates the decision once you’ve shipped.
When to leave pgvector
Start on pgvector. Move to a dedicated vector database when one of the following becomes true:
- Scale ceiling. You’ve crossed a hundred million vectors, or you’re spending real money on Postgres instances tuned for vector workloads. Dedicated stores handle this without you having to tune.
- Latency floor. You need sub-50ms p99 latency at high concurrency, and pgvector’s HNSW index is no longer keeping up under your specific query mix. (Benchmark before assuming — pgvector with HNSW is faster than most teams expect.)
- Feature gap. You need something pgvector doesn’t have well — strong built-in re-ranking, hybrid search with first-class BM25, native multi-tenancy with quota enforcement, multi-modal indexes with image-and-text in the same store. Each dedicated DB has one or two of these as a differentiator.
- Ops preference. You don’t want to run Postgres for this workload, period. A managed vector DB is the entire answer; pay the monthly bill and stop thinking about it.
If none of those are true, you don’t have a vector database problem — you have a which-Postgres-extension-to-enable problem, and you’ve already solved it.
Failure modes that bite in production
- pgvector — query planner fights. Postgres’s query planner doesn’t always pick the right plan for vector-plus-filter queries. The combination of a tight WHERE filter and an ANN index can result in the planner ignoring the index entirely. Workarounds: explicit hints, materialised views for hot queries, or recall-vs-performance tuning of the HNSW parameters (a sketch follows this list). None of these is hard, but they’re the kind of thing you don’t discover until you have real traffic.
- Pinecone — vendor lock-in is real. Pinecone has no self-hosted option. If pricing changes, terms shift, or your compliance posture changes, you’re re-embedding into another store. Build with a vector-store-agnostic framework (LangChain, LlamaIndex) or a thin wrapper of your own (second sketch after this list) and you’ll cut the migration cost in half — but you’ll still pay it.
- Qdrant / Weaviate / Milvus self-hosted — the ops you didn’t budget for. These are distributed databases. Backups, replication, version upgrades, disk pressure under HNSW rebuilds, the occasional weird memory profile during high-cardinality filtering. If you don’t have someone whose job includes running databases, the managed tier is cheaper than you think.
- All of them — embedding-model lock-in. Every vector in your store was produced by a specific embedder. Change embedders and every vector is meaningless on the new map. The vector DB doesn’t care which model you used; the consequence is that “switch from text-embedding-3-small to BGE-M3” means re-embedding your entire corpus, regardless of which database you picked. (See Embeddings explained without math for the framing.)
- All of them — recall ceiling without a reranker. A vector DB returns the closest embeddings, not necessarily the most relevant documents. The closest-versus-most-relevant gap is what rerankers exist to close. No vector DB will save you from a missing reranker on a serious-quality RAG system.
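On the pgvector planner fights specifically, the usual first moves are to read the plan and widen the HNSW search. A sketch, assuming the docs table from earlier plus a hypothetical tenant_id column; hnsw.ef_search is a real pgvector setting (default 40), but 100 is a starting point, not a tuned number:

```python
import psycopg

# Full-length stand-in query vector, inlined below because EXPLAIN is
# easiest to run with literals; use bound parameters in application code.
placeholder = "[" + ",".join(["0.01"] * 1536) + "]"

with psycopg.connect("dbname=app") as conn, conn.cursor() as cur:
    # Widen the HNSW candidate list: better recall, slower queries
    cur.execute("SET hnsw.ef_search = 100;")
    # If the plan shows a sequential scan rather than the HNSW index, the
    # planner judged the filter selective enough to skip ANN, often wrongly.
    cur.execute(f"""
        EXPLAIN ANALYZE
        SELECT id, body FROM docs
        WHERE tenant_id = 42
        ORDER BY embedding <=> '{placeholder}'::vector
        LIMIT 10;
    """)
    for (line,) in cur.fetchall():
        print(line)
```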
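And on the lock-in point, the cheapest insurance beyond a framework is a thin interface of your own between application code and the store. A hypothetical sketch; the Protocol and the adapter are illustrative, not any framework’s actual API:

```python
from typing import Protocol

class VectorStore(Protocol):
    def search(self, vector: list[float], k: int) -> list[tuple[str, float]]:
        """Return (doc_id, distance) pairs, best match first."""
        ...

class PgVectorStore:
    """Adapter over the docs table from the earlier sketches."""

    def __init__(self, conn):  # a psycopg connection
        self.conn = conn

    def search(self, vector, k):
        lit = "[" + ",".join(str(x) for x in vector) + "]"
        with self.conn.cursor() as cur:
            cur.execute(
                "SELECT id::text, embedding <=> %s::vector AS dist "
                "FROM docs ORDER BY dist LIMIT %s",
                (lit, k),
            )
            return cur.fetchall()

# Application code depends only on the Protocol; swapping stores means
# writing one new adapter, not touching every call site. (Re-embedding
# into the new store is, sadly, still on you.)
def retrieve(store: VectorStore, query_vec: list[float]):
    return store.search(query_vec, k=10)
```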
If you're picking now
- Start by reading Embeddings explained without math if you haven’t — pick the embedder before the database, because the embedder dictates the vector dimensions, and dimensions drive storage cost and the database’s practical scale ceiling.
- For the practical “build a private search over our docs” workflow, Build a private knowledge base your team can search walks through the end-to-end pattern using LlamaIndex.
- RAG explained without acronyms is the framing piece that explains why most of these databases exist in the first place.
FAQ
Can I just use Postgres? Do I need a vector database at all?
For the vast majority of teams in 2026, the answer is genuinely yes — just use Postgres with the pgvector extension. It handles tens of millions of vectors comfortably, supports full SQL filtering, integrates with every framework, and costs nothing extra if you already run Postgres. The dedicated vector databases earn their keep at serious scale, for specific feature needs, or when you want fully-managed convenience — and none of those applies to most pilots or even to many production systems. Default to pgvector; promote when you have a measured reason to.
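What “full SQL filtering” buys you in practice is vector similarity and ordinary relational predicates in one statement, with no separate metadata-filter API to learn. A sketch against the docs table from earlier, with hypothetical team_id and updated_at columns:

```python
# Similarity search constrained by plain SQL predicates: the feature that
# dedicated stores reimplement as "metadata filters".
FILTERED_SQL = """
SELECT id, body
FROM docs
WHERE team_id = %(team)s
  AND updated_at > now() - interval '90 days'
ORDER BY embedding <=> %(e)s::vector
LIMIT 10;
"""
# cur.execute(FILTERED_SQL, {"team": 7, "e": to_pgvector(query_embedding)})
```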
How do I know when I've outgrown pgvector?
Three signals. (1) Query latency is climbing past your budget at concurrency you actually have — measure with the real query mix, not synthetic benchmarks. (2) Postgres maintenance windows are dominated by vector-index work (HNSW rebuilds during large ingest). (3) You're scaling Postgres specifically for the vector workload — running a separate Postgres instance just for embeddings, paying for instance sizes you wouldn't otherwise need. If any of those is true, evaluate dedicated stores against your actual workload. If none of them is, stay where you are.
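For signal (1), the measurement is short enough to inline. A minimal sketch; the stand-in embeddings should be replaced with a sample of real production query vectors, and note that a single client only measures the latency floor, so run it at your real concurrency too:

```python
import time
import psycopg

# Stand-ins: replace with a sample of real production query embeddings
real_queries = [[0.01] * 1536 for _ in range(200)]

def lit(v: list[float]) -> str:
    return "[" + ",".join(str(x) for x in v) + "]"

latencies_ms = []
with psycopg.connect("dbname=app") as conn, conn.cursor() as cur:
    for q in real_queries:
        t0 = time.perf_counter()
        cur.execute(
            "SELECT id FROM docs ORDER BY embedding <=> %s::vector LIMIT 10",
            (lit(q),),
        )
        cur.fetchall()
        latencies_ms.append((time.perf_counter() - t0) * 1000)

latencies_ms.sort()
p99 = latencies_ms[int(len(latencies_ms) * 0.99) - 1]
print(f"p99: {p99:.1f} ms over {len(latencies_ms)} queries")
```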
What about cloud-native vector stores like Turbopuffer, LanceDB, or vector features in Snowflake / BigQuery?
Three categories worth knowing. Turbopuffer and similar object-storage-backed stores (vectors live in S3-style storage with smart caching) trade a few extra milliseconds of cold latency for dramatically lower cost at scale — interesting for very large, mostly-cold corpora. LanceDB is file-format-based, sits well in data-engineering workflows, and is gaining traction for analytics-adjacent use cases. Snowflake, BigQuery, and other warehouses are adding vector functions; they're fine for joining vectors with your analytical data but currently lag dedicated stores on latency and ergonomics. All three are worth tracking but none has dethroned the matrix above yet.
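If you want a feel for the file-format-based approach, LanceDB’s happy path is short: it writes to a local directory, with no server process. A sketch based on the library’s documented basics; verify against current docs, since these APIs move quickly:

```python
import lancedb

db = lancedb.connect("./lance-demo")  # just a directory on disk

# Toy 2-dim vectors so the example is readable; real ones come from an embedder
table = db.create_table(
    "docs",
    data=[
        {"vector": [0.9, 0.1], "text": "cats"},
        {"vector": [0.1, 0.9], "text": "finance"},
    ],
)

print(table.search([0.85, 0.15]).limit(1).to_list())  # -> the "cats" row
```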
Does the embedding model lock me to a specific database?
No — vector databases are model-agnostic. Every store accepts a list of numbers (the embedding) and stores it; what produced those numbers is your concern, not theirs. The lock-in goes the other way: your embeddings are tied to your embedder, and changing embedders forces a full re-embedding regardless of which database you used. Pick the embedder by accuracy on your task; pick the database by scale, ops fit, and ecosystem.
What about Elasticsearch / OpenSearch with vector support?
Both have added vector search (dense_vector field, k-NN queries) and they work — particularly well if you're already running Elasticsearch for log search or product search. The trade-off: their vector implementations are competitive but not best-in-class on latency or recall versus dedicated vector stores, and they're operationally heavier than pgvector for teams who don't already run an Elastic cluster. If you have an existing Elasticsearch install and modest vector needs, adding k-NN to it is often the right answer. If you don't have one, don't start one just for vectors.
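If you do already have that cluster, the route looks like this. A sketch against the 8.x Python client; the index name, dimensions, and localhost endpoint are placeholders:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# dense_vector mapping; dims must match your embedder
es.indices.create(
    index="docs",
    mappings={
        "properties": {
            "body": {"type": "text"},
            "embedding": {
                "type": "dense_vector",
                "dims": 1536,
                "index": True,
                "similarity": "cosine",
            },
        }
    },
)

# Approximate kNN query: the piece that reuses your existing cluster
resp = es.search(
    index="docs",
    knn={
        "field": "embedding",
        "query_vector": [0.01] * 1536,
        "k": 5,
        "num_candidates": 50,
    },
)
print([hit["_source"].get("body") for hit in resp["hits"]["hits"]])
```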