
This post is contributed by our partners at CorrDyn. Learn more about their work at https://www.corrdyn.com
Data engineering is challenging. It demands fluency across a breadth of domains: networking, database internals, cloud architecture, system administration, API quirks, software engineering, testing, CI/CD. And cutting across them all are security and compliance. All of this knowledge needs to stay fresh while the landscape shifts beneath your feet.
The trivia fades faster than the instincts do. You retain the sense of when something smells wrong, but the specifics slip away over time. When breaches happen, how often is it because a detail was missed? The devil is, after all, in the details.
It would be great to have a tool that abstracts away these details. Something that hits a sweet spot of being opinionated while remaining flexible. And preferably something that doesn't come with an unpredictable cost model.
In this post, we'll walk through Estuary's deployment models, pricing, and core capabilities, then share two real migration stories: one where we wished we had it, one where we did. By the end, you'll know whether Estuary fits your data integration needs.
The Pitch
A right-time (as opposed to only real-time or only batch) data integration platform with a large library of config-driven connectors, flexible deployment options, and pricing that is predictable and easy to understand.
Right-time means delivering data when it's most valuable to the business: fresh enough to act on, but no faster or more complex than necessary. Estuary is right-time because it continuously captures data once and delivers it with the right latency, consistency, and delivery model for each destination and use case.
Sounds compelling. Let’s dig in a little.
Deployment flexibility. This has been a big deal for our clients. Three tiers, each with tradeoffs:
Public: Fully managed, fast to spin up, cost-effective. Some flexibility on data residency. The easy button.
Private: You still get managed infrastructure, but your data stays in your network via PrivateLink. Full control over residency, option for cross-region paths.
BYOC (Bring Your Own Cloud): Complete control. Run it in your VPC, benefit from cloud incentives and your negotiated rates, leverage capabilities afforded by your provider. Since data stays within your network, you also avoid the egress fees that add up fast at scale. Estuary still handles deployment and updates in a largely automated fashion once you have configured things and granted appropriate permissions.
BYOC is especially valuable for organizations subject to GDPR, CCPA, HIPAA, SOX, or PCI-DSS. You get the operational benefits of a managed offering while retaining control over the elements that compliance requires: data residency, network isolation, encryption keys, access control, and audit logs.
For example, we went BYOC for a client project on Azure. Their compliance model required that no data move across the public internet or through vendor systems and that all data reside within their country. We targeted an Azure region in their country of residence, configured the data plane and an Azure Private Link connection to journal storage within a few hours, and it has been basically hands-off since.
Containerized connector logic. Easy to configure, portable to your cloud, and tailored to the relevant systems to abstract away complexity. You can see Estuary's list of available connectors here. In the near future, Estuary will support custom-developed connectors in private deployments and BYOC, enabling new capabilities for bespoke sources or internal systems. We've partnered with Estuary several times to develop new public connectors, and their development kit makes building them efficient.
One-to-many writes. A killer feature. Pull from a source once, materialize to as many destinations as you need. No duplicate captures or orchestration headaches. The data flows where it needs to go as fast as the source and destination endpoints can handle.
Dekaf. If your enterprise leverages tools that consume from Kafka (analytics platforms like Tinybird or ClickHouse, real-time databases like SingleStore or Materialize, or internal services built on Kafka consumers), Dekaf provides a Kafka API-compatible interface over the data brought in by Estuary, without needing to run Kafka infrastructure.
Each dataset managed by Estuary, called a collection, appears as a topic to Kafka consumers. Point any consumer at Dekaf, authenticate with your Estuary credentials, and start consuming events and triggering workflows. No brokers to worry over, no partitions to rebalance, no KRaft clusters or ZooKeeper to babysit.
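To make that concrete, here is a minimal sketch of a consumer reading a collection through Dekaf with the kafka-python client. The broker address, topic name, and credential values are placeholders (the exact endpoint and auth scheme come from Estuary's Dekaf documentation), and the sketch assumes JSON-encoded messages.

```python
# Minimal sketch: consuming an Estuary collection via Dekaf with kafka-python.
# The endpoint, topic, and credentials below are placeholders; consult the
# Dekaf docs for the real values for your tenant.
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "acmeCo/claims/status-events",                       # a collection, exposed as a topic (hypothetical name)
    bootstrap_servers="dekaf.example-estuary.dev:9092",  # placeholder broker address
    security_protocol="SASL_SSL",
    sasl_mechanism="PLAIN",
    sasl_plain_username="YOUR_DEKAF_USERNAME",           # placeholder; see the Dekaf docs
    sasl_plain_password="YOUR_ESTUARY_TOKEN",            # your Estuary credential
    group_id="claims-anomaly-detector",
    auto_offset_reset="earliest",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),  # assuming JSON payloads
)

for message in consumer:
    event = message.value
    # React to each change event as it arrives, e.g. trigger a downstream workflow.
    print(message.topic, message.offset, event)
```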
Straightforward pricing. Metered by data volume and compute. Not by row count, number of endpoints, or some needlessly complex licensing scheme. Very refreshing.
A Migration Where We Wish We'd Had Estuary
Story time. We recently completed a 3.6TB migration from Amazon Aurora PostgreSQL to GCP AlloyDB (4.2 billion rows across a handful of small tables and several monsters with extremely lopsided distribution). The client's compliance requirements dictated that all data had to traverse private, encrypted connections with no exposure to the public internet.
Initially we tried GCP's Database Migration Service, a tool tailor-made for this use case. After six weeks of back-and-forth with support, we were unable to make any headway, so we pivoted to a custom approach:
Private networking across clouds. We stood up a point-to-point VPN between AWS and GCP. With Estuary's BYOC deployment and PrivateLink support, this connectivity would have been handled within established patterns rather than bespoke infrastructure.
Juggling batch and incremental strategies. Different tables at different stages required different techniques (COPY versus merge tables, pg_dump and pg_restore for schema versus data, huge batch for initial load versus ongoing incremental sync). Estuary's PostgreSQL connector handles this transition transparently: backfill the historical data, then seamlessly shift to real-time change capture. One configuration, not a decision matrix.
Validation complexity. Our validation approaches differed between initial bulk loads and incremental sync, requiring custom logic for each phase. Estuary handles baseline schema validation automatically. For structured data sources, it examines the table schemas and generates strict JSON schema definitions that it uses to validate every row on every read and every write. If validation fails, Estuary isolates the offending row for review, fires an alert, and stops the sync task until you resolve the issue, so bad data does not propagate.
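As an illustration of the idea (not Estuary's internal code), here is a conceptual sketch using the Python jsonschema library with hypothetical field names: validate every row against a strict schema and quarantine anything that fails.

```python
# Conceptual sketch of per-row validation (not Estuary's internals): check every
# record against a strict JSON schema and quarantine failures instead of letting
# bad data propagate downstream. Field names are hypothetical.
from jsonschema import Draft202012Validator

claim_schema = {
    "type": "object",
    "properties": {
        "claim_id": {"type": "integer"},
        "status": {"type": "string", "enum": ["received", "validated", "denied", "paid"]},
        "amount_cents": {"type": "integer", "minimum": 0},
    },
    "required": ["claim_id", "status", "amount_cents"],
    "additionalProperties": False,
}

validator = Draft202012Validator(claim_schema)

def validate_rows(rows):
    """Split rows into (valid, quarantined) rather than failing the whole batch."""
    valid, quarantined = [], []
    for row in rows:
        errors = list(validator.iter_errors(row))
        if errors:
            quarantined.append({"row": row, "errors": [e.message for e in errors]})
        else:
            valid.append(row)
    return valid, quarantined
```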
Custom tooling. We ended up with a chunky VM, a whole heap of bespoke Python, shell scripts, and tmux sessions. Effective, but work that shouldn't need to exist. With Estuary, the connector handles PostgreSQL-specific nuances while we focus on the business logic.
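For a flavor of that bespoke Python, here is a simplified sketch of the keyset-pagination chunked copy we leaned on. Connection strings, table names, and the batch size are placeholders, and the real scripts carried retries, logging, and validation on top.

```python
# Simplified sketch of the bespoke chunked-copy pattern: keyset pagination over
# a large table, batching rows from the source and inserting them into the
# destination. DSNs, table names, and batch size are placeholders.
import psycopg2
from psycopg2.extras import execute_values

BATCH_SIZE = 50_000

def copy_table_in_chunks(source_dsn, dest_dsn, table, key_column):
    with psycopg2.connect(source_dsn) as src, psycopg2.connect(dest_dsn) as dst:
        src_cur, dst_cur = src.cursor(), dst.cursor()
        last_key = None
        while True:
            # Identifiers are interpolated for brevity here; they are not user input.
            if last_key is None:
                src_cur.execute(
                    f"SELECT * FROM {table} ORDER BY {key_column} LIMIT %s",
                    (BATCH_SIZE,),
                )
            else:
                src_cur.execute(
                    f"SELECT * FROM {table} WHERE {key_column} > %s "
                    f"ORDER BY {key_column} LIMIT %s",
                    (last_key, BATCH_SIZE),
                )
            rows = src_cur.fetchall()
            if not rows:
                break
            columns = [desc[0] for desc in src_cur.description]
            execute_values(
                dst_cur,
                f"INSERT INTO {table} ({', '.join(columns)}) VALUES %s",
                rows,
            )
            dst.commit()
            last_key = rows[-1][columns.index(key_column)]

# Example invocation with placeholder connection strings:
# copy_table_in_chunks(
#     "postgresql://user:pass@aurora-host/db",
#     "postgresql://user:pass@alloydb-host/db",
#     table="claims",
#     key_column="id",
# )
```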
Could we have used Estuary? Unfortunately not, in this case, for reasons outside our control. But we definitely wanted to. The BYOC deployment model fit our compliance requirements perfectly: data stays in the client's VPC, traverses only private networks, and remains under their access controls. The project would have taken a few hours, and we wouldn't have been writing custom chunking logic in the wee hours of the night.
A Migration Where We Had Estuary
Our recent project for a client in the healthcare benefits space provided the perfect proving ground for Estuary’s capabilities. The client handles massive volumes of inbound claims (~2.5M new claims per week) that move through a complex lifecycle of validation, work, and finalization. Their primary challenge was a lack of visibility: because their legacy SQL Server only reflected current states, they couldn’t track how a claim’s status changed over time or perform any meaningful "time travel" to diagnose historical processing spikes.
Implementing Estuary was remarkably straightforward. We configured Change Data Capture (CDC) to stream data from their production SQL Server through Estuary and materialize it into MotherDuck for transformation and reporting. Despite the inherent complexities of their data model (e.g., linking claims to shifting member benefit configurations and complex "carve-in/carve-out" logic), the technical setup was surprisingly fast.
The value was immediate. By leveraging Estuary’s CDC capabilities, we were able to:
- Capture History Automatically: We moved from a static "current-state" view to a full history of status changes, allowing analysts to see exactly how and when claims "self-heal" from a broken state to an accepted one (a sketch of this kind of query follows this list).
- Offload Production Pressure: By moving reporting workloads to a dedicated warehouse, we significantly reduced the strain on their primary SQL Server, which frequently deals with "fast and furious" batch processing loads.
- Enable Granular Slicing: The pipeline easily surfaces the dimensions analysts need, enabling near real-time anomaly detection that was previously impossible.
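Here is the kind of history query the first item refers to, sketched in DuckDB with a tiny inline dataset. Table and column names are hypothetical; in the real project the history landed in MotherDuck, which speaks the same SQL.

```python
# A sketch of the "time travel" queries the captured history enables, using
# DuckDB with a tiny inline dataset. Table and column names are hypothetical.
import duckdb

con = duckdb.connect()

con.execute("""
    CREATE TABLE claim_status_history AS
    SELECT * FROM (VALUES
        (1001, 'broken',   TIMESTAMP '2024-05-01 08:00:00'),
        (1001, 'accepted', TIMESTAMP '2024-05-03 14:30:00'),
        (1002, 'received', TIMESTAMP '2024-05-02 09:15:00')
    ) AS t(claim_id, status, changed_at)
""")

# How long did "self-healing" claims sit in a broken state before acceptance?
rows = con.execute("""
    SELECT b.claim_id,
           a.changed_at - b.changed_at AS time_to_heal
    FROM claim_status_history b
    JOIN claim_status_history a
      ON a.claim_id = b.claim_id
     AND b.status = 'broken'
     AND a.status = 'accepted'
     AND a.changed_at > b.changed_at
""").fetchall()
print(rows)  # e.g. claim 1001 took roughly 2 days 6.5 hours to heal
```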
For the client, the transition from "flying blind" on historical trends to having a functional, time-aware dashboard happened in a fraction of the time a custom engineering effort would have required. It turned a high-stakes data engineering problem into a configuration exercise, allowing the team to focus on resolving business-critical claims issues rather than babysitting ETL scripts and performing surgery on the SQL Server.
These Are a Few of Our Favorite Things
Control, level-set to your needs. Public deployments are easy and low overhead. BYOC deployments, on the other end of the spectrum, afford you all the control you might need. Regardless, your data (organized into collections, stored as append-only logs called journals) lives in your blob storage, governed by your access controls and lifecycle policies.
Data plane where you need it. For organizations with global footprints, Estuary supports multiple data planes under a single control plane. Run one in the US for North American operations, another in Europe for GDPR-scoped data, and a third in Australia. Each data plane operates independently with full data isolation, while the control plane provides unified orchestration and monitoring across all of them.
Connector quality. The library is extensive and the connectors are good. They handle system-specific quirks that would otherwise eat hours of debugging. Time-to-value trends toward now.
Designed for speed and efficiency. Capture once, materialize many times. Backfills for new materializations read from your object storage, so no need to refetch from the source. Journals are append-only logs (in a compressed JSON format with understandable naming conventions) so backfills for new materializations don’t compete with writes made by ongoing captures as your sources get new data.
Config-driven (with escape hatches). Need to enrich or transform data from multiple sources before it lands somewhere? Derivations let you do that without the orchestration overhead. No Airflow DAGs or dbt pipelines. Write data wherever you need to by creating materializations. Implement custom validation criteria and automated tests via config files, but bail out and provide custom, stateful Python, TypeScript, or SQLite code if needed.
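The snippet below is not the Estuary derivation SDK. It is a generic Python sketch, with hypothetical field names, of the kind of stateful, multi-source logic you might push into a derivation instead of standing up an orchestrator: de-duplicate claim events and enrich them with the latest benefit configuration.

```python
# Not the Estuary derivation SDK; a generic sketch (hypothetical field names) of
# the kind of stateful, multi-source logic a derivation can host: de-duplicate
# claim events and enrich them with the latest benefit configuration.
seen_keys = set()        # state: (claim_id, status) pairs already emitted
benefit_configs = {}     # state: latest benefit config per member_id

def on_benefit_config(doc):
    """Handle a document from the benefit-config source collection."""
    benefit_configs[doc["member_id"]] = doc

def on_claim_event(doc):
    """Handle a claim event: drop duplicates, attach the current benefit config."""
    key = (doc["claim_id"], doc["status"])
    if key in seen_keys:
        return []        # duplicate, publish nothing
    seen_keys.add(key)
    enriched = dict(doc)
    enriched["benefit_config"] = benefit_configs.get(doc["member_id"])
    return [enriched]    # documents to publish to the derived collection
```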
CLI support. Data flows can be fully configured via Estuary’s web interface, via the flowctl CLI tool, or a mix of both. A common workflow is to start in Estuary’s online point-and-click interface, get everything set up, and then use flowctl catalog pull-specs to export config files. Implement custom logic, perform testing locally, track in version control, and integrate the specs via CI/CD.
Real-time CDC with built-in historical context. For database sources, Estuary captures every change (inserts, updates, and deletes) as it happens and stores the complete history in your blob storage. This architecture makes it straightforward to support real-time reporting while maintaining full auditability.
Historical replay and SCD support. Need to backfill a new destination or query historical states? Collections replay from any point in time without touching the source. Type 2 Slowly Changing Dimensions (SCD)**? Capture once from the source, then materialize to as many endpoints as needed with full history tracking. No separate transformation logic per destination. No re-extraction. Just millisecond-latency data movement with the complete lineage you need for audits and compliance.
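As a worked illustration of what Type 2 SCD output looks like when built from captured change history, here is a DuckDB sketch with hypothetical tables and columns; a materialization can maintain this for you, and the SQL only shows the shape of the result.

```python
# Worked illustration of Type 2 SCD rows built from change history, sketched in
# DuckDB with hypothetical tables and columns.
import duckdb

con = duckdb.connect()
con.execute("""
    CREATE TABLE customer_changes AS
    SELECT * FROM (VALUES
        (42, 'basic',   TIMESTAMP '2024-01-01 00:00:00'),
        (42, 'premium', TIMESTAMP '2024-03-15 00:00:00'),
        (42, 'basic',   TIMESTAMP '2024-06-01 00:00:00')
    ) AS t(customer_id, plan, changed_at)
""")

# Each change becomes a versioned row; valid_to is the next change's timestamp,
# and a NULL valid_to marks the current version.
scd2_rows = con.execute("""
    SELECT customer_id,
           plan,
           changed_at AS valid_from,
           LEAD(changed_at) OVER (
               PARTITION BY customer_id ORDER BY changed_at
           ) AS valid_to
    FROM customer_changes
    ORDER BY customer_id, valid_from
""").fetchall()

for row in scd2_rows:
    print(row)
```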
Secrets management. Estuary encrypts known secrets like API keys automatically regardless of how you are configuring them; tagging custom config fields for encryption is straightforward. Optionally integrate with your preferred KMS (GCP KMS, AWS KMS, or Azure Key Vault) for strict control of your keys.
Collaborative partnership. During one of our Azure private deployments, we hit a blocking issue at the edge of Estuary's Azure support, caused by an undocumented Azure limitation that surfaced as an Internal Server Error. Rather than filing tickets with both Estuary and Azure and waiting days or weeks, we worked directly with the Estuary engineering team to diagnose the problem and ship a fix that benefits all customers going forward. That's the kind of partnership we value.
The Cons
Every tool has tradeoffs. Here's what we've experienced:
The learning curve exists. It's not huge, though. The concepts (collections, captures, materializations, derivations, etc.) are well-documented and make a lot of sense. It is not a substitute for data engineering skill and experience, but it is a powerful piece of leverage in skilled hands.
Azure support is maturing. We've run multiple private and BYOC deployments on Azure with the Estuary team, and it's been solid. We have encountered gaps that have mostly turned out to be Azure issues rather than Estuary limitations, and, as noted above, we worked through them together.
Who Should Care
If you're moving significant data volumes and any of the following apply, Estuary deserves a look:
- You need to pull from multiple SaaS products and combine with on-prem or cloud data sources
- Your compliance requirements demand private networking and data residency control
- Your team is small and can't afford another pile of custom code to maintain
- Batch processing doesn’t fit your use case any more and you need something closer to real-time
- You need change-data-capture from a database (on-prem or cloud)
- You've been burned by pricing models that aren't tied to unit costs and want something predictable
The Bottom Line
Estuary doesn't solve every data engineering problem. But it solves a specific, painful class of problems very well: getting data from many places to many other places quickly, efficiently, securely, and without requiring you to become an expert in the quirks of every source and destination system.
The deployment flexibility is a big deal. The connector library is solid and open source; we've even contributed to it here at CorrDyn. The pricing model is sane. And when we needed help, the folks at Estuary were great to work with.
For the right use case, it's very good. It's fast, too: 8 GB/s throughput, with opportunities to boost that significantly. Basically, as real-time as the source systems you're pulling from.
The wrong data integration decision costs months and six figures in wasted investment. We help teams avoid that by bringing pattern recognition from hundreds of implementations: what works under HIPAA, when BYOC makes financial sense, how to transition from batch to real-time without breaking production. If you're facing one of these decisions, we've done the hard migrations so you don't have to. Let's talk.
CorrDyn specializes in cloud infrastructure, data engineering, and machine learning for organizations that need to move fast without breaking compliance.
**Type 2 SCD: tracking how records change over time by creating new rows for each version
FAQs
When should a team choose Estuary's BYOC deployment model?
When compliance or data residency requirements mean data can't leave your network. BYOC keeps data inside your own VPC, under your access controls and encryption keys, avoids egress fees, and lets you benefit from your negotiated cloud rates, while Estuary still handles deployment and updates in a largely automated fashion.
How is Estuary different from traditional ETL or CDC tools?
It's built for right-time delivery rather than batch-only or real-time-only movement: capture from a source once, then materialize to as many destinations as you need, each at the latency it requires. Connectors are config-driven, deployment is flexible (Public, Private, or BYOC), and pricing is metered by data volume and compute rather than rows or endpoints.
