
Introduction: Why Delivery Guarantees Matter More Than You Think
If you've ever built a system that moves data across services, whether it's a payment processing app, a logistics platform, or just a good old pub/sub pipeline, you've probably run into the idea of message delivery semantics. These are the rules that define how many times a message might show up at its destination.
Sometimes at-least-once is enough. Sometimes only once is non-negotiable.
At the heart of every distributed system is the challenge of getting messages from point A to point B, reliably, correctly, and just once. That’s not a given. Networks fail. Consumers crash. Brokers duplicate. And before you know it, your accounting system just charged a customer twice, or your inventory was updated three times from the same event.
That’s why delivery guarantees like at-most-once, at-least-once, and exactly-once exist. These guarantees aren’t just theoretical. They directly impact the integrity of systems in industries like finance, e-commerce, IoT, and real-time analytics.
In this article, we’ll break down what exactly-once delivery really means, why it’s so hard to achieve in practice, and how modern systems like Estuary Flow make it possible through clever design, transactional protocols, and a bit of engineering grit.
Let’s start with a proper definition.
What Does “Exactly-Once” Actually Mean?
Exactly-once delivery sounds straightforward. A message is sent, and the system delivers it one time. Not zero times. Not twice. Just once.
In reality, it’s anything but simple.
When people talk about exactly-once, they often confuse two related ideas: exactly-once delivery and exactly-once processing. They’re not the same thing.
- Exactly-once delivery means the message arrives at its destination just once.
- Exactly-once processing means the action triggered by the message only happens one time, even if the message is received more than once.
Here’s an example. Imagine you're transferring $100 from one bank account to another. You don’t just want the message to arrive once. You also want the actual transfer to happen only one time. If it runs twice, you’ve just lost another $100. If it doesn’t run at all, the money stays put. Either way, someone ends up confused or frustrated.
Think of it like mailing a check. You want the post office to deliver it once. You also want the person cashing it to do that only once. If they lose the check, nothing happens. If they deposit it twice, the results could be messy.
That’s why both delivery and processing need to be handled carefully. You can't assume one will guarantee the other. And in distributed systems, where different parts of your architecture can fail in different ways, that makes exactly-once very hard to get right.
Next, let’s compare it to the other types of delivery guarantees and see how they stack up.
Comparing Delivery Guarantees
Distributed systems don't all promise the same thing when it comes to delivering messages. Some systems try to be fast and simple. Others aim to be accurate no matter what. Understanding the differences between the main delivery guarantees helps you make the right choices for your use case.
There are three main types:
At-most-once
Messages are sent, but if something fails, there’s no retry. You might lose the message completely. The upside is speed and simplicity. The downside is that you can’t trust that every message will make it through.
Use this when losing some data is acceptable, like basic logging or non-critical metrics.
At-least-once
Every message is guaranteed to be delivered, but it might happen more than once. Most messaging systems default to this behavior. It’s reliable, but you have to handle duplicates yourself.
Use this when data loss is unacceptable, but you can live with some duplication and clean it up later.
Exactly-once
This is the ideal. Every message arrives one time, and the system ensures that it's processed only once. No losses. No duplicates. It’s the safest option, but it’s also the hardest to implement. It often involves more complexity and performance overhead.
Use this when accuracy is critical, like in financial transactions, inventory systems, or bidding platforms.
Here’s a quick comparison:
| Guarantee | Risk of Loss | Risk of Duplication | Complexity | Good For |
| --- | --- | --- | --- | --- |
| At-most-once | High | None | Low | Non-critical logs, sensors |
| At-least-once | None | High | Medium | Notifications, analytics |
| Exactly-once | None | None | High | Payments, orders, counters |
No delivery method is perfect for every use case. What matters is choosing the right level of guarantee for the job.
Next, let’s talk about why exactly-once delivery is so hard to get right.
Why Exactly-Once Delivery Is So Challenging
Exactly-once sounds great on paper. In practice, it's one of the hardest guarantees to make and keep in distributed systems.
Let’s break down why.
Things can fail at every step
Messages travel through multiple components: the producer, the network, the broker, the consumer, and sometimes storage. Any one of these can fail or restart. When that happens, systems often retry. But retries can lead to duplicates, and lost acknowledgments can lead to replays.
Duplicates are sneaky
Even when messages are only sent once, retries from network timeouts or consumer restarts can make them appear twice. If the system doesn’t catch and handle that correctly, the same message could be processed more than once.
For example, if a service writes to a database and then crashes before marking the message as complete, it might try again after restarting. That can cause double inserts or payments unless the system is built to detect and ignore duplicates.
Idempotency is not automatic
An idempotent operation is one that has the same effect, no matter how many times it’s run. If your system isn’t idempotent, even a single retry can cause incorrect results. Think of incrementing a counter, sending an invoice, or shipping a product. You don't want those repeated.
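A tiny Python sketch makes the contrast concrete (the `Account` class here is invented purely for illustration):

```python
class Account:
    def __init__(self):
        self.balance = 100
        self.status = "pending"

# Non-idempotent: running it twice double-charges the account.
def apply_charge(account, amount):
    account.balance -= amount

# Idempotent: running it twice leaves the same final state.
def mark_paid(account):
    account.status = "paid"

acct = Account()
apply_charge(acct, 25)
apply_charge(acct, 25)   # a retry double-charges: balance drops to 50
mark_paid(acct)
mark_paid(acct)          # a retry is harmless: status is still "paid"
```

The retry of `apply_charge` silently corrupts state, while the retry of `mark_paid` is a no-op. That asymmetry is why idempotency has to be designed in, not assumed.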
Race conditions and concurrency
In a distributed system, multiple instances of the same service might be running at once. If one crashes and another takes over, both might process the same message around the same time. Without coordination, that can lead to duplicate side effects or missed updates.
Some problems can't be solved at all
There are also fundamental limits. The Two Generals Problem shows that two systems can’t always be sure they’re in agreement over an unreliable channel. The FLP impossibility result proves that in an asynchronous distributed system, no deterministic consensus protocol can guarantee it will always reach a decision if even one process can fail.
These aren’t just academic ideas. They explain why no system can promise exactly-once delivery in every possible failure scenario.
Real-world messaging systems are limited
Many message brokers are stateless and partitioned for speed. That’s great for performance, but it makes coordination harder. Storing the exact state of every message and every consumer across restarts and failures is expensive and complex.
Exactly-once delivery isn’t impossible. But it does require more than just toggling a setting. It demands careful design and trade-offs at every layer. From the message broker to the application logic, every piece needs to cooperate.
Next, let’s look at how modern systems try to pull it off.
How Systems Try to Achieve Exactly-Once
Even though exactly-once is hard, some systems do a pretty good job of getting close. They use a combination of smart techniques to reduce duplication and make message processing safer. Here's how they do it.
Unique message IDs and deduplication
The simplest trick is to give each message a unique ID. When the consumer receives a message, it checks if it has already seen that ID before. If yes, it skips processing. If not, it processes the message and stores the ID for next time.
This works well, but it only helps if the consumer has a reliable way to store and look up past IDs. That can become expensive at scale, especially if the system processes millions of events per hour.
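The pattern looks roughly like this (an illustrative Python sketch; the ID store here is an in-memory set, which a real system would replace with durable storage):

```python
seen_ids = set()
results = []

def process_once(message_id, payload):
    """Process a message only if its ID has not been seen before."""
    if message_id in seen_ids:
        return False          # duplicate: skip processing
    seen_ids.add(message_id)  # must be durable in a real system
    results.append(payload)
    return True

process_once("msg-1", "charge $100")
process_once("msg-1", "charge $100")  # a redelivery is skipped
```

The subtlety is that recording the ID and applying the effect must survive crashes together; if the process dies between the two, the guarantee is lost.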
Idempotent processing
Another strategy is to make the processing logic itself idempotent. That means even if the same message is handled twice, it won’t cause a bad result. For example, instead of incrementing a value, you can set it to a specific state like “paid.” Or you can use upserts in a database instead of inserts.
This approach shifts the burden to the application layer, but it’s often necessary for reliability.
Transactions that bundle delivery and processing
Some systems go further and treat delivery and processing as a single, atomic transaction. That means they don't mark a message as delivered until they’ve finished applying the changes it triggers.
Apache Kafka, for example, supports this through exactly-once semantics (EOS). Producers and consumers can take part in a transaction where data is written and consumed together, with a single commit step at the end. If anything fails, the whole transaction is rolled back.
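The idea behind bundling output writes and the consumed position into one commit can be shown with a toy in-memory transaction (plain Python for illustration; this is not Kafka's actual transactional API):

```python
class ToyTransaction:
    """Stage output records and the consumer offset together;
    both become visible on commit, or neither does on abort."""
    def __init__(self, store):
        self.store = store        # holds committed 'output' and 'offset'
        self.staged_output = []
        self.staged_offset = None

    def write(self, record):
        self.staged_output.append(record)

    def mark_offset(self, offset):
        self.staged_offset = offset

    def commit(self):
        self.store["output"].extend(self.staged_output)
        self.store["offset"] = self.staged_offset

    def abort(self):
        self.staged_output.clear()
        self.staged_offset = None

store = {"output": [], "offset": 0}

# A failed attempt leaves no partial state behind.
txn = ToyTransaction(store)
txn.write("order-1 processed")
txn.mark_offset(1)
txn.abort()

# The retry commits output and offset as one unit.
txn = ToyTransaction(store)
txn.write("order-1 processed")
txn.mark_offset(1)
txn.commit()
```

Because the aborted attempt staged everything and published nothing, the retry cannot produce a duplicate in the committed output.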
Persistent storage and checkpoints
Systems also rely on durable storage and checkpointing. When a consumer processes a message, it records both the result and the point it reached in the stream. That way, if it crashes, it can resume exactly where it left off — no skipping ahead, no double-processing.
The combination of durable logs, checkpoints, and idempotency is often enough to provide exactly-once behavior under normal conditions.
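Checkpoint-and-resume can be sketched in a few lines (illustrative Python; a real system would persist the checkpoint durably alongside the results):

```python
def run_consumer(stream, state):
    """Process from the last checkpoint, recording the result and
    the new position together after each message."""
    for position in range(state["checkpoint"], len(stream)):
        state["results"].append(stream[position].upper())
        state["checkpoint"] = position + 1

stream = ["a", "b", "c", "d"]
state = {"checkpoint": 0, "results": []}

run_consumer(stream[:2], state)   # "crash" after two messages...
run_consumer(stream, state)       # ...the restart resumes at the checkpoint
```

After the restart, processing picks up at position 2, so nothing is skipped and nothing is handled twice.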
Real-world examples
- Kafka uses producer IDs and transactions to guarantee exactly-once between topics and consumers. But it requires careful setup and adds some latency.
- Google Pub/Sub supports at-least-once delivery but offers message deduplication using message IDs, which can get you close to exactly-once in some workflows.
- NATS JetStream provides tools for tracking delivery and managing state, but exactly-once depends on how you build your consumer logic.
- Estuary Flow uses a transactional materialization protocol with recovery logs, coordinated checkpoints, and support for idempotent apply. It provides exactly-once materialization without requiring users to manually manage transactions or retries.
Now let’s take a deeper look at how Estuary Flow does this, and why its approach stands out.
How Estuary Flow Achieves Exactly-Once Materialization
Estuary Flow is designed to handle exactly-once delivery from the ground up. It does this by combining persistent logs, coordinated checkpoints, and smart connectors that understand how to commit changes safely. Instead of relying on just one trick, Flow uses a complete protocol to keep everything in sync, even across failures.
Here’s how it works.
It all starts with the materialization protocol.
When Flow sends data to an external system, like a data warehouse, database, or cloud store, it doesn’t just push records blindly. It runs a structured process over a long-lived connection between the Flow runtime and a connector driver.
While connected, each transaction happens in three main steps: Acknowledge, Load, and Store.
- In the Acknowledge phase, Flow and the connector confirm that the last transaction was successfully committed. This prevents data from being replayed accidentally.
- In the Load phase, Flow fetches materialized documents and reduces new documents into them based on key.
- In the Store phase, loaded documents are committed to the destination system while Flow commits a checkpoint to its recovery log.
These steps will repeat as long as there’s new data to materialize.
The important part is that Flow treats the view state (what’s written to the destination) and the checkpoint (where it left off in the stream) as a single unit. They either both succeed, or neither does.
That’s the foundation of exactly-once delivery.
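The all-or-nothing pairing of view state and checkpoint can be sketched with SQLite (a simplified illustration; the table names and schema here are invented, not Flow’s actual internals):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE view_state (key TEXT PRIMARY KEY, value TEXT)")
conn.execute("CREATE TABLE flow_checkpoint (id INTEGER PRIMARY KEY, position INTEGER)")
conn.execute("INSERT INTO flow_checkpoint VALUES (1, 0)")
conn.commit()

def materialize(conn, documents, new_position):
    """Commit the materialized documents and the checkpoint atomically."""
    with conn:  # one transaction: both writes succeed or neither does
        for key, value in documents:
            conn.execute(
                "INSERT INTO view_state VALUES (?, ?) "
                "ON CONFLICT(key) DO UPDATE SET value = excluded.value",
                (key, value),
            )
        conn.execute(
            "UPDATE flow_checkpoint SET position = ? WHERE id = 1",
            (new_position,),
        )

materialize(conn, [("user-1", "active")], new_position=10)
```

If the process dies mid-transaction, the database rolls back both the data and the checkpoint, so a restart replays from a consistent point.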
Three patterns for real-world systems
Different destinations have different capabilities, so Flow supports multiple patterns to maintain exactly-once guarantees.
1. Remote store is authoritative
In this pattern, the external system, like a SQL database, is in charge of managing both the data and the Flow checkpoint. Flow uses database transactions to update both together. In this case, the remote store must implement a method called fencing to prevent duplicate processes from interfering with each other. This avoids situations where two instances try to write the same data at the same time.
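Fencing can be illustrated with a toy epoch counter (plain Python, names invented): opening a new session invalidates every older writer.

```python
class FencedStore:
    """Reject writes from any writer whose fencing token is stale."""
    def __init__(self):
        self.current_epoch = 0
        self.data = {}

    def open_session(self):
        self.current_epoch += 1   # fence out all earlier writers
        return self.current_epoch

    def write(self, epoch, key, value):
        if epoch < self.current_epoch:
            raise RuntimeError("fenced: a newer writer has taken over")
        self.data[key] = value

store = FencedStore()
old = store.open_session()   # original task instance
new = store.open_session()   # replacement instance after a restart

store.write(new, "order-1", "shipped")  # the newer writer succeeds
```

If the old, "zombie" instance later wakes up and tries `store.write(old, ...)`, the store rejects it, so stale data never overwrites the new writer's output.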
2. Recovery log with non-transactional store
Some destination systems are non-transactional, such as APIs or key/value stores. In these cases, Flow’s internal recovery log takes over. It tracks checkpoints, implements fencing, and uses retries to make sure data is eventually stored. While this isn’t always fully exactly-once, it still protects against data loss and offers strong guarantees for last-write-wins use cases.
3. Recovery log with idempotent apply
This is a middle ground for stores that don’t have full transaction support but support idempotent applies. Flow writes changes to a stable storage location, like a unique file or blob, and then alerts the store of the file. The store must handle idempotency, applying the file just once. That way, exactly-once is still possible, even without full transaction guarantees.
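A minimal sketch of idempotent apply (illustrative Python, not a real connector): the destination tracks which uniquely named files it has already applied, so a retried notification is a no-op.

```python
applied_files = set()
store = {}

def apply_file(name, records):
    """The destination applies each uniquely named file at most once."""
    if name in applied_files:
        return  # already applied: a retry changes nothing
    applied_files.add(name)
    store.update(records)

# A retry announces the same file twice; it is applied only once.
apply_file("txn-0042.json", {"user-1": "active"})
apply_file("txn-0042.json", {"user-1": "active"})
```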
Delta updates for push-only systems
Some destinations, like APIs or event buses, don’t allow reading existing data. For those cases, Flow uses a delta updates mode. Instead of writing the fully reduced state, it only reduces documents within the current transaction. This lets the downstream system reassemble the full picture later.
Delta updates also match how other systems like Kafka Connect behave. For many use cases, especially with last-write-wins logic, this works just fine.
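Under delta updates, documents that share a key are reduced only within the current transaction, rather than against the full materialized state. A last-write-wins version of that reduction might look like this (illustrative Python sketch):

```python
def reduce_transaction(documents):
    """Combine documents sharing a key within one transaction,
    keeping only the last value per key (last-write-wins)."""
    reduced = {}
    for key, value in documents:
        reduced[key] = value
    return list(reduced.items())

batch = [("user-1", "pending"), ("user-2", "active"), ("user-1", "active")]
deltas = reduce_transaction(batch)  # one delta per key, not full state
```

The downstream system receives one delta per key per transaction and is responsible for folding those deltas into its own view of the full state.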
Why it matters
Most systems talk about exactly-once delivery as a goal. Estuary Flow builds it into the core of how data moves. With transactional materializations, smart recovery logs, and flexible strategies for different destinations, Flow gives you exactly-once guarantees without the usual complexity.
Estuary Flow Is Built on Gazette
Estuary Flow’s exactly-once guarantees are built on top of Gazette, a cloud-native, log-based streaming system. Gazette handles low-level details like message sequencing with UUIDs, transactional commit checkpoints, and fencing to prevent zombie processes from writing outdated data.
Flow builds on this foundation by adding real-time connectors, schema enforcement, and end-to-end materialization logic. The result is a platform that brings exactly-once semantics from message capture all the way to your destination systems, without extra glue code or coordination layers.
Next, we’ll take a look at some of the common pitfalls and practical limitations you should be aware of.
Practical Limitations and Common Pitfalls
Even the best systems can't promise exactly-once delivery in every situation. There are always trade-offs. Knowing where the edges are can help you avoid surprises and build more resilient data pipelines.
It’s often “exactly-once under normal conditions”
Many systems, including Estuary Flow, offer exactly-once behavior when things are working as expected. But no system can fully prevent all edge cases in the face of power loss, split-brain failures, or disk corruption. It’s important to understand the scope of guarantees and test how your system behaves during restarts, crashes, and retries.
Confusing delivery with processing
One of the most common misunderstandings is assuming that just because a message was delivered once, it was also processed once. Or vice versa. But these are separate concerns. Delivery is about transport. Processing is about what you do with the message. If either side breaks, you can still end up with duplication or missing data.
Idempotency bugs
Even if your system supports retries or deduplication, your application logic might not. For example, a service that creates a record every time it sees a message could accidentally insert the same thing twice. Small mistakes in logic, like checking for duplicates after an action instead of before, can quietly break your exactly-once guarantees.
Performance trade-offs
Getting to exactly-once often means giving something up. Tracking checkpoints, handling retries, and verifying commits all add overhead. You might notice higher latency or increased resource usage, especially under load. That’s why some systems default to at-least-once unless you explicitly configure them otherwise.
Poor error handling
Some systems only offer exactly-once if you implement your side correctly. If your consumer crashes before storing a checkpoint, or if you forget to flush buffered data, things can fall out of sync. That’s why it's important to test not just happy paths, but also failure modes like timeouts, disconnects, and out-of-order messages.
Limitations in destination systems
Estuary implements several patterns to provide exactly-once guarantees in as many scenarios as possible. But if your destination system itself doesn’t include strong transaction guarantees or doesn’t handle idempotency well, there’s only so much Estuary can do to shore up those gaps. Connectors with at-least-once semantics rather than exactly-once will be marked explicitly in Estuary’s documentation.
Exactly-once is powerful, but it’s not a silver bullet. To get it right, every part of your system — the broker, the consumer, the destination, and your logic — has to work together.
Next, we’ll talk about when exactly-once is truly necessary and when it might be more than you actually need.
When Do You Really Need Exactly-Once?
Exactly-once delivery sounds great, but in many situations, it's more than you actually need. Before adding the extra complexity, it's worth asking if your use case truly requires it.
When it matters most
There are some cases where exactly-once delivery is not just nice to have. It’s essential. These include:
- Financial transactions
You don’t want a customer to be charged twice or not at all. Payments, transfers, and ledger updates need strict guarantees.
- Inventory and order systems
If you ship the same product twice or forget to update inventory, it can throw off fulfillment, lead to customer issues, or mess up reporting.
- Real-time bidding or auctions
If the same bid is processed twice, or missed entirely, it can directly impact revenue or fairness.
- Metering and usage-based billing
Overcounting or undercounting usage can either cost you money or frustrate customers. Either way, it damages trust.
In cases like these, it’s worth the engineering effort to make sure each message is processed exactly once and no side effect is repeated.
When other guarantees are good enough
On the flip side, many applications work just fine with at-least-once or even at-most-once semantics. For example:
- Logs and analytics
If a few log lines are duplicated or dropped, no one notices. Dashboards still show the big picture.
- Email and notifications
Sending the same alert twice is better than not sending it at all. At-least-once is perfectly acceptable.
- Monitoring and metrics
Most metrics systems already deduplicate or smooth out anomalies over time.
In these cases, chasing exactly-once might not be worth it. Instead, focus on making your pipeline reliable, observable, and able to recover gracefully when something goes wrong.
Up next, we’ll cover best practices that can help you design safer systems, even when exactly-once isn’t guaranteed.
Engineering Strategies and Best Practices
Exactly-once delivery is hard, but you can still build systems that behave reliably and avoid surprises. Even if the underlying infrastructure doesn’t guarantee exactly-once, these strategies can help you get close.
Make your processing logic idempotent
Idempotency is one of the most powerful tools in your toolkit. It means your system can safely process the same message more than once without causing duplicate effects.
Examples:
- Set a status to "paid" instead of incrementing a balance.
- Use upserts instead of inserts when writing to a database.
- Check if a message was already processed before taking action.
If your logic is idempotent, even at-least-once delivery becomes much safer to work with.
Use unique identifiers
Give every event, transaction, or message a unique ID. This makes it easy to detect and skip duplicates. Many systems, like Estuary Flow, use these IDs behind the scenes to ensure safe processing.
If you’re building your own logic, consider storing a history of processed IDs in a database or cache, at least for a short time window.
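One way to keep that history bounded is a time-windowed ID store (an illustrative Python sketch; a real system would use a database or cache with TTLs):

```python
class ProcessedIds:
    """Remember processed message IDs for a limited time window."""
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.seen = {}  # message_id -> time first processed

    def check_and_record(self, message_id, now):
        # Expire entries older than the window before checking.
        self.seen = {m: t for m, t in self.seen.items() if now - t < self.ttl}
        if message_id in self.seen:
            return False               # duplicate within the window
        self.seen[message_id] = now
        return True

ids = ProcessedIds(ttl_seconds=600)
first = ids.check_and_record("msg-1", now=0)    # process it
dup = ids.check_and_record("msg-1", now=60)     # duplicate: skip
late = ids.check_and_record("msg-1", now=700)   # window expired: treated as new
```

The trade-off is explicit: a duplicate arriving after the window slips through, so the window size bounds both your storage cost and your deduplication guarantee.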
Choose systems that support strong guarantees
Not every part of your stack needs to support exactly-once, but it helps to use components that get you closer. Look for systems that offer:
- Checkpointing
- Idempotent operations
- Transaction support
- Built-in deduplication or fencing
Estuary Flow, for example, handles much of this for you by coordinating delivery, checkpointing, and materialization in one flow.
Plan for failures
Design your system to recover cleanly. Expect retries, timeouts, restarts, and out-of-order messages.
- Store checkpoints regularly.
- Make retries safe.
- Monitor your system for missed or duplicated messages.
- Add compensating logic if needed, such as reversing a charge or correcting a record.
Keep observability in mind
Logging, metrics, and traces are essential. If something goes wrong, you need to know exactly where and when it happened. Good observability can save hours of debugging and prevent silent data corruption.
These best practices don’t guarantee perfection, but they give you a solid foundation. Combined with a system like Estuary Flow, they help you get close to exactly-once behavior without rebuilding everything from scratch.
Next, we’ll look at how some popular systems handle delivery guarantees in the real world.
Case Studies and Industry Examples
Many well-known data systems claim different delivery guarantees. Some offer full exactly-once support, while others focus on simpler models with tools to help you build reliability on your own.
Here’s how a few of them approach the problem.
Apache Kafka
Kafka supports exactly-once semantics, but only when producers, brokers, and consumers are configured correctly. It uses transactional APIs to bundle message writes and offset commits together. This prevents partial processing and duplicate deliveries, but it requires extra coordination and comes with some performance cost.
Kafka works best when all components are built to participate in the transaction. If not, it falls back to at-least-once delivery.
Google Pub/Sub
By default, Pub/Sub offers at-least-once delivery. Messages might be delivered more than once, especially if acknowledgments are delayed. To help reduce duplicates, you can attach a message ID and deduplicate on the consumer side within a short window.
This gives you something close to exactly-once in practice, but only if your consumer logic is designed for it.
NATS JetStream
JetStream provides more control over message delivery with support for stream replay, acknowledgments, and message deduplication. Like Pub/Sub, it leans toward at-least-once by default. You can get closer to exactly-once by building your own idempotent consumers.
Estuary Flow
Estuary Flow provides exactly-once materialization, meaning data is delivered and committed to the destination only once, even across failures or retries. It does this using:
- A persistent recovery log
- Coordinated checkpoints and store operations
- Fencing and idempotent apply logic
- Support for both transactional and non-transactional destinations
Because Flow handles message delivery, change data capture, and materialization in a single system, it reduces the number of places things can go wrong. You don’t have to build your own fencing or deduplication layer. It’s already part of the protocol.
Each of these tools has strengths and trade-offs. The right choice depends on your needs, your scale, and how much complexity you're willing to take on. In the final section, we’ll wrap things up and help you decide how to move forward.
Conclusion
Exactly-once delivery is one of the most desirable guarantees in data engineering. It helps protect systems from overcharging customers, duplicating shipments, or losing key business insights. But as you’ve seen, getting there isn’t easy.
It requires more than just reliable message transport. You need coordination between delivery and processing, careful handling of retries and duplicates, and logic that doesn’t break under pressure. Even then, no system can promise exactly-once behavior in every failure scenario.
That’s why it’s important to step back and ask what level of reliability you actually need. Sometimes at-least-once is good enough. Other times, like in financial systems or real-time bidding, only exactly-once will do.
If you do need strong guarantees, you don’t have to build everything from scratch. Estuary Flow gives you exactly-once materialization out of the box, handling the hard parts like checkpoints, retries, and state coordination for you.
The goal isn’t to chase perfection. It’s to build systems that are predictable, testable, and safe, even when things go wrong.
Now you know what exactly-once really means, why it’s hard, and how to approach it with confidence.

About the author
Team Estuary is a group of engineers, product experts, and data strategists building the future of real-time and batch data integration. We write to share technical insights, industry trends, and practical guides.
