
The Vector Database Hype is Over (and That's Good)

The vector database era is fading as Postgres and PGVector absorb the capability. Learn why co-locating vectors with your primary database reduces cost and complexity, plus the data-engineering practices that improve AI retrieval quality.


There was a time when teams reached for dedicated vector databases. Those purpose-built stores solved the immediate pain of nearest-neighbor search and, with managed offerings, let teams move fast without owning complex search infrastructure. Today, many of those capabilities live inside general-purpose systems such as Postgres with vector extensions like PGVector.

That shift is healthy. Commoditization lowers cost, reduces operational surface area, and makes it easier to experiment. The early vendors did important work: they proved the need and bought teams time to iterate.

The technical trend is familiar. New workloads spawn point products, and then common databases absorb the capability. We saw it with JSON storage and time-series extensions; vector indexing and ANN acceleration are following the same arc.

Dedicated vector stores still have a place when you need sub-millisecond retrieval at global scale or specialized quantization and hardware. For many product teams, however, the practical combination is Postgres for storage, PGVector for similarity, and a dependable pipeline that keeps embeddings current.
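
To make that concrete, here is a minimal sketch of the Postgres-plus-PGVector combination: one table that keeps chunks and their embeddings next to your primary keys, and a cosine-distance query over it. The connection string, table name, and 1536-dimension size are illustrative assumptions, not prescriptions.

```python
# Minimal sketch: store embeddings alongside the rows they describe and
# query by cosine distance. Assumes pgvector is installed on the server;
# the DSN, table name, and 1536-dim size are illustrative.
import psycopg2

conn = psycopg2.connect("dbname=app user=app")  # hypothetical DSN
cur = conn.cursor()

cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
cur.execute("""
    CREATE TABLE IF NOT EXISTS doc_chunks (
        id        bigserial PRIMARY KEY,
        doc_id    bigint NOT NULL,
        content   text NOT NULL,
        embedding vector(1536)  -- dimension must match your embedding model
    );
""")
conn.commit()

# Nearest neighbors by cosine distance (<=> is pgvector's cosine operator).
query_embedding = "[" + ",".join(["0.1"] * 1536) + "]"  # placeholder; use a real query embedding
cur.execute(
    """
    SELECT id, doc_id, content
    FROM doc_chunks
    ORDER BY embedding <=> %s::vector
    LIMIT 5;
    """,
    (query_embedding,),
)
print(cur.fetchall())
```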

It helps to be clear about what vectors do and do not solve. Vectors help you match the right chunks of information to a model prompt. They do not guarantee freshness, provenance, deduplication, or coherent assembly of context from multiple sources. Those are data-engineering problems.

In my experience, co-locating vectors with your primary database can reduce egress, simplify backups, and lower operational complexity. Benchmark results depend on dataset size, query mix, and compute configuration, so measure on a realistic workload rather than trusting headline numbers.

So what should teams do? Measure the right things: recall for your workload, tail latency at expected QPS, and total cost of ownership that includes egress and operational overhead. Run a short experiment with PGVector in a staging copy of your Postgres cluster to validate performance and cost. Then prioritize the plumbing.
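
As a rough sketch of what that staging experiment can look like: build an HNSW index, pick an ef_search value, and time a representative query against the table from the earlier sketch. The index parameters below are starting points to tune against your own workload, not recommendations.

```python
# Sketch of a staging experiment: build an HNSW index, then time a
# representative query. The parameters (m, ef_construction, ef_search) are
# illustrative starting points; higher ef_search trades latency for recall.
import time
import psycopg2

conn = psycopg2.connect("dbname=app_staging user=app")  # hypothetical staging DSN
cur = conn.cursor()

# Approximate-nearest-neighbor index for cosine distance.
cur.execute("""
    CREATE INDEX IF NOT EXISTS doc_chunks_embedding_hnsw
    ON doc_chunks USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);
""")
conn.commit()

cur.execute("SET hnsw.ef_search = 100;")

query_embedding = "[" + ",".join(["0.1"] * 1536) + "]"  # placeholder query vector
start = time.perf_counter()
cur.execute(
    "SELECT id FROM doc_chunks ORDER BY embedding <=> %s::vector LIMIT 10;",
    (query_embedding,),
)
rows = cur.fetchall()
print(f"top-{len(rows)} in {(time.perf_counter() - start) * 1000:.1f} ms")
```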

Concrete practices that pay off:

  • Use change data capture (CDC) so you re-embed only when content actually changes (see the sketch after this list).
  • Record the embedding model version and source timestamps with each vector to enable safe, incremental model upgrades.
  • Canonicalize documents and compute content hashes to deduplicate before indexing.
  • Choose a chunking strategy that balances recall and vector count based on your query patterns.
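
A minimal sketch of the CDC-and-hashing pattern from the list above, assuming the earlier doc_chunks table is extended with content_hash, embedding_model, and source_updated_at columns and holds one row per doc_id for brevity; embed() stands in for whatever embedding call you use.

```python
# Sketch of the CDC-and-hashing pattern: canonicalize, hash, and call the
# embedding model only when content actually changed, recording the model
# version and source timestamp with the vector. embed() and the extra
# columns (content_hash, embedding_model, source_updated_at) are assumptions.
import hashlib

EMBEDDING_MODEL = "embedding-model-v2"  # hypothetical model identifier

def canonicalize(text: str) -> str:
    # Normalize whitespace and case so cosmetic edits don't trigger re-embeds.
    return " ".join(text.lower().split())

def content_hash(text: str) -> str:
    return hashlib.sha256(canonicalize(text).encode("utf-8")).hexdigest()

def upsert_chunk(cur, doc_id, chunk_text, source_updated_at, embed):
    h = content_hash(chunk_text)
    cur.execute("SELECT content_hash FROM doc_chunks WHERE doc_id = %s;", (doc_id,))
    row = cur.fetchone()
    if row is not None and row[0] == h:
        return  # unchanged content: skip the embedding call entirely

    vec = "[" + ",".join(str(x) for x in embed(chunk_text)) + "]"
    if row is None:
        cur.execute(
            """
            INSERT INTO doc_chunks
                (doc_id, content, content_hash, embedding,
                 embedding_model, source_updated_at)
            VALUES (%s, %s, %s, %s::vector, %s, %s);
            """,
            (doc_id, chunk_text, h, vec, EMBEDDING_MODEL, source_updated_at),
        )
    else:
        cur.execute(
            """
            UPDATE doc_chunks
            SET content = %s, content_hash = %s, embedding = %s::vector,
                embedding_model = %s, source_updated_at = %s
            WHERE doc_id = %s;
            """,
            (chunk_text, h, vec, EMBEDDING_MODEL, source_updated_at, doc_id),
        )
```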

Measure retrieval quality continuously. Track recall and precision alongside tail latency, and keep a small human-labeled test set so numeric metrics map to user value. Instrument the pipeline so you can answer why a particular chunk was returned: which source produced it, which ingest run wrote it, and which embedding model generated it.
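
A small sketch of what that measurement can look like: recall@k and precision@k over a hand-labeled test set, with search() standing in for whichever retrieval path you are instrumenting and the queries and chunk IDs below being purely hypothetical.

```python
# Sketch: recall@k and precision@k over a small hand-labeled test set.
# Each entry maps a query to the chunk IDs a human judged relevant.
# search() stands in for whichever retrieval path you're instrumenting.

def recall_at_k(retrieved, relevant, k):
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant) if relevant else 0.0

def precision_at_k(retrieved, relevant, k):
    return len(set(retrieved[:k]) & relevant) / k

LABELED_SET = {
    "how do I rotate credentials?": {101, 245},  # hypothetical chunk IDs
    "which regions are supported?": {88},
}

def evaluate(search, k=10):
    recalls, precisions = [], []
    for query, relevant in LABELED_SET.items():
        retrieved = search(query, k)  # ranked list of chunk IDs
        recalls.append(recall_at_k(retrieved, relevant, k))
        precisions.append(precision_at_k(retrieved, relevant, k))
    return sum(recalls) / len(recalls), sum(precisions) / len(precisions)
```

Running evaluate() on every ingest or model change turns "did retrieval get worse?" into a number you can track over time.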

If you need to migrate an existing managed index, do it pragmatically. Export IDs and vectors, map them to primary keys in Postgres, and run an A/B or shadow test to compare retrieval quality and cost. For many teams, that process confirms co-location reduces egress and simplifies access controls. If experiments still show a need for specialized storage or hardware, you will have reached that conclusion with evidence.
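
A shadow test does not need much machinery. Something like the sketch below, where old_search() and new_search() are stand-ins for the managed index and the Postgres path, gives you a per-query overlap number to argue from.

```python
# Sketch of a shadow test during migration: run the same queries against the
# existing managed index and the Postgres/PGVector path and compare top-k
# overlap. old_search() and new_search() are stand-ins that both return
# ranked lists of the same primary keys.

def topk_overlap(old_ids, new_ids, k=10):
    return len(set(old_ids[:k]) & set(new_ids[:k])) / k

def shadow_test(queries, old_search, new_search, k=10):
    overlaps = [topk_overlap(old_search(q, k), new_search(q, k), k) for q in queries]
    avg = sum(overlaps) / len(overlaps)
    print(f"average top-{k} overlap across {len(queries)} queries: {avg:.1%}")
    return avg
```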

I've also seen gains from combining vectors with structure, for example knowledge graphs paired with vector retrieval. Those improvements come from modeling relationships and improving data hygiene rather than mysterious differences in index internals.
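
One way that pairing can look in practice, sketched against the earlier table plus a hypothetical chunk_entities join table: constrain candidates by an explicit relationship first, then rank the survivors by similarity.

```python
# One way to pair structure with vector retrieval: restrict candidates to
# chunks linked to a known entity, then rank by similarity. The
# chunk_entities join table is hypothetical; the pattern is the point.

def search_within_entity(cur, entity_id, query_embedding, k=5):
    cur.execute(
        """
        SELECT c.id, c.content
        FROM doc_chunks c
        JOIN chunk_entities ce ON ce.chunk_id = c.id
        WHERE ce.entity_id = %s
        ORDER BY c.embedding <=> %s::vector
        LIMIT %s;
        """,
        (entity_id, query_embedding, k),
    )
    return cur.fetchall()
```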

This is a shift away from treating infrastructure optimization as a stand-in for product decisions. Lower infrastructure cost and fewer moving parts free engineers to work on fresher context, less duplication, and clearer provenance. These are the targets that actually improve user-facing behavior.

At Estuary, we focus on those plumbing problems because they improve retrieval regardless of what index you use.

My take: the vector database chapter was a necessary one. We learned how to build retrieval-augmented (RAG) systems quickly. Now, the harder task of making context reliable begins, and that’s where the next wave of AI applications will be won.


About the author

David Yaffe, Co-founder and CEO

David Yaffe is a co-founder and the CEO of Estuary. He previously served as the COO of LiveRamp and as the co-founder and CEO of Arbor, which was sold to LiveRamp in 2016. He has an extensive background in product management, having served as head of product for DoubleClick Bid Manager and Invite Media.
