
Introduction: Why Delivery Guarantees Matter More Than You Think
If you've ever built a system that moves data across services, whether it's a payment processing app, a logistics platform, or just a good old pub/sub pipeline, you've probably run into the idea of message delivery semantics. These are the rules that define how many times a message might show up at its destination.
Sometimes at-least-once is enough. Sometimes only once is non-negotiable.
At the heart of every distributed system is the challenge of getting messages from point A to point B, reliably, correctly, and just once. That’s not a given. Networks fail. Consumers crash. Brokers duplicate. And before you know it, your accounting system just charged a customer twice, or your inventory was updated three times from the same event.
That’s why delivery guarantees like at-most-once, at-least-once, and exactly-once exist. These guarantees aren’t just theoretical. They directly impact the integrity of systems in industries like finance, e-commerce, IoT, and real-time analytics.
In this article, we’ll break down what exactly-once delivery really means, why it’s so hard to achieve in practice, and how modern systems like Estuary Flow make it possible through clever design, transactional protocols, and a bit of engineering grit.
Let’s start with a proper definition.
What Does “Exactly-Once” Actually Mean?
Exactly-once delivery sounds straightforward. A message is sent, and the system delivers it one time. Not zero times. Not twice. Just once.
In reality, it’s anything but simple.
When people talk about exactly-once, they often confuse two related ideas: exactly-once delivery and exactly-once processing. They’re not the same thing.
- Exactly-once delivery means the message arrives at its destination just once.
- Exactly-once processing means the action triggered by the message only happens one time, even if the message is received more than once.
Here’s an example. Imagine you're transferring $100 from one bank account to another. You don’t just want the message to arrive once. You also want the actual transfer to happen only one time. If it runs twice, you’ve just lost another $100. If it doesn’t run at all, the money stays put. Either way, someone ends up confused or frustrated.
Think of it like mailing a check. You want the post office to deliver it once. You also want the person cashing it to do that only once. If they lose the check, nothing happens. If they deposit it twice, the results could be messy.
That’s why both delivery and processing need to be handled carefully. You can't assume one will guarantee the other. And in distributed systems, where different parts of your architecture can fail in different ways, that makes exactly-once very hard to get right.
Next, let’s compare it to the other types of delivery guarantees and see how they stack up.
Comparing Delivery Guarantees
Distributed systems don't all promise the same thing when it comes to delivering messages. Some systems try to be fast and simple. Others aim to be accurate no matter what. Understanding the differences between the main delivery guarantees helps you make the right choices for your use case.
There are three main types:
At-most-once
Messages are sent, but if something fails, there’s no retry. You might lose the message completely. The upside is speed and simplicity. The downside is that you can’t trust that every message will make it through.
Use this when losing some data is acceptable, like basic logging or non-critical metrics.
At-least-once
Every message is guaranteed to be delivered, but it might happen more than once. Most messaging systems default to this behavior. It’s reliable, but you have to handle duplicates yourself.
Use this when data loss is unacceptable, but you can live with some duplication and clean it up later.
Exactly-once
This is the ideal. Every message arrives one time, and the system ensures that it's processed only once. No losses. No duplicates. It’s the safest option, but it’s also the hardest to implement. It often involves more complexity and performance overhead.
Use this when accuracy is critical, like in financial transactions, inventory systems, or bidding platforms.
Here’s a quick comparison:
| Guarantee | Risk of Loss | Risk of Duplication | Complexity | Good For |
| --- | --- | --- | --- | --- |
| At-most-once | High | None | Low | Non-critical logs, sensors |
| At-least-once | None | High | Medium | Notifications, analytics |
| Exactly-once | None | None | High | Payments, orders, counters |
No delivery method is perfect for every use case. What matters is choosing the right level of guarantee for the job.
Next, let’s talk about why exactly-once delivery is so hard to get right.
Why Exactly-Once Delivery Is So Challenging
Exactly-once sounds great on paper. In practice, it's one of the hardest guarantees to make and keep in distributed systems.
Let’s break down why.
Things can fail at every step
Messages travel through multiple components: the producer, the network, the broker, the consumer, and sometimes storage. Any one of these can fail or restart. When that happens, systems often retry. But retries can lead to duplicates, and lost acknowledgments can lead to replays.
Duplicates are sneaky
Even when messages are only sent once, retries from network timeouts or consumer restarts can make them appear twice. If the system doesn’t catch and handle that correctly, the same message could be processed more than once.
For example, if a service writes to a database and then crashes before marking the message as complete, it might try again after restarting. That can cause double inserts or payments unless the system is built to detect and ignore duplicates.
Idempotency is not automatic
An idempotent operation is one that has the same effect, no matter how many times it’s run. If your system isn’t idempotent, even a single retry can cause incorrect results. Think of incrementing a counter, sending an invoice, or shipping a product. You don't want those repeated.
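A tiny Python sketch makes the contrast concrete (the `Account` class here is invented purely for illustration):

```python
class Account:
    def __init__(self):
        self.balance = 100
        self.status = "pending"

# Non-idempotent: running it twice double-charges the account.
def apply_charge(account, amount):
    account.balance -= amount

# Idempotent: running it twice leaves the same final state.
def mark_paid(account):
    account.status = "paid"

acct = Account()
apply_charge(acct, 25)
apply_charge(acct, 25)   # a retry double-charges: balance drops to 50
mark_paid(acct)
mark_paid(acct)          # a retry is harmless: status is still "paid"
```

The retry of `apply_charge` silently corrupts state, while the retry of `mark_paid` is a no-op. That asymmetry is why idempotency has to be designed in, not assumed.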
Race conditions and concurrency
In a distributed system, multiple instances of the same service might be running at once. If one crashes and another takes over, both might process the same message around the same time. Without coordination, that can lead to duplicate side effects or missed updates.
Some problems can't be solved at all
There are also fundamental limits. The Two Generals Problem shows that two systems can’t always be sure they’re in agreement over an unreliable channel. The FLP impossibility result proves that in an asynchronous distributed system, no deterministic consensus protocol can guarantee it will always reach a decision if even one process can fail.
These aren’t just academic ideas. They explain why no system can promise exactly-once delivery in every possible failure scenario.
Real-world messaging systems are limited
Many message brokers are stateless and partitioned for speed. That’s great for performance, but it makes coordination harder. Storing the exact state of every message and every consumer across restarts and failures is expensive and complex.
Exactly-once delivery isn’t impossible. But it does require more than just toggling a setting. It demands careful design and trade-offs at every layer. From the message broker to the application logic, every piece needs to cooperate.
Next, let’s look at how modern systems try to pull it off.
How Systems Try to Achieve Exactly-Once
Even though exactly-once is hard, some systems do a pretty good job of getting close. They use a combination of smart techniques to reduce duplication and make message processing safer. Here's how they do it.
Unique message IDs and deduplication
The simplest trick is to give each message a unique ID. When the consumer receives a message, it checks if it has already seen that ID before. If yes, it skips processing. If not, it processes the message and stores the ID for next time.
This works well, but it only helps if the consumer has a reliable way to store and look up past IDs. That can become expensive at scale, especially if the system processes millions of events per hour.
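The pattern looks roughly like this (an illustrative Python sketch; the ID store here is an in-memory set, which a real system would replace with durable storage):

```python
seen_ids = set()
results = []

def process_once(message_id, payload):
    """Process a message only if its ID has not been seen before."""
    if message_id in seen_ids:
        return False          # duplicate: skip processing
    seen_ids.add(message_id)  # must be durable in a real system
    results.append(payload)
    return True

process_once("msg-1", "charge $100")
process_once("msg-1", "charge $100")  # a redelivery is skipped
```

The subtlety is that recording the ID and applying the effect must survive crashes together; if the process dies between the two, the guarantee is lost.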
Idempotent processing
Another strategy is to make the processing logic itself idempotent. That means even if the same message is handled twice, it won’t cause a bad result. For example, instead of incrementing a value, you can set it to a specific state like “paid.” Or you can use upserts in a database instead of inserts.
This approach shifts the burden to the application layer, but it’s often necessary for reliability.
Transactions that bundle delivery and processing
Some systems go further and treat delivery and processing as a single, atomic transaction. That means they don't mark a message as delivered until they’ve finished applying the changes it triggers.
Apache Kafka, for example, supports this through exactly-once semantics (EOS). Producers and consumers can take part in a transaction where data is written and consumed together, with a single commit step at the end. If anything fails, the whole transaction is rolled back.
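The idea behind bundling output writes and the consumed position into one commit can be shown with a toy in-memory transaction (plain Python for illustration; this is not Kafka's actual transactional API):

```python
class ToyTransaction:
    """Stage output records and the consumer offset together;
    both become visible on commit, or neither does on abort."""
    def __init__(self, store):
        self.store = store        # holds committed 'output' and 'offset'
        self.staged_output = []
        self.staged_offset = None

    def write(self, record):
        self.staged_output.append(record)

    def mark_offset(self, offset):
        self.staged_offset = offset

    def commit(self):
        self.store["output"].extend(self.staged_output)
        self.store["offset"] = self.staged_offset

    def abort(self):
        self.staged_output.clear()
        self.staged_offset = None

store = {"output": [], "offset": 0}

# A failed attempt leaves no partial state behind.
txn = ToyTransaction(store)
txn.write("order-1 processed")
txn.mark_offset(1)
txn.abort()

# The retry commits output and offset as one unit.
txn = ToyTransaction(store)
txn.write("order-1 processed")
txn.mark_offset(1)
txn.commit()
```

Because the aborted attempt staged everything and published nothing, the retry cannot produce a duplicate in the committed output.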
Persistent storage and checkpoints
Systems also rely on durable storage and checkpointing. When a consumer processes a message, it records both the result and the point it reached in the stream. That way, if it crashes, it can resume exactly where it left off — no skipping ahead, no double-processing.
The combination of durable logs, checkpoints, and idempotency is often enough to provide exactly-once behavior under normal conditions.
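Checkpoint-and-resume can be sketched in a few lines (illustrative Python; a real system would persist the checkpoint durably alongside the results):

```python
def run_consumer(stream, state):
    """Process from the last checkpoint, recording the result and
    the new position together after each message."""
    for position in range(state["checkpoint"], len(stream)):
        state["results"].append(stream[position].upper())
        state["checkpoint"] = position + 1

stream = ["a", "b", "c", "d"]
state = {"checkpoint": 0, "results": []}

run_consumer(stream[:2], state)   # "crash" after two messages...
run_consumer(stream, state)       # ...the restart resumes at the checkpoint
```

After the restart, processing picks up at position 2, so nothing is skipped and nothing is handled twice.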
Real-world examples
- Kafka uses producer IDs and transactions to guarantee exactly-once between topics and consumers. But it requires careful setup and adds some latency.
- Google Pub/Sub supports at-least-once delivery but offers message deduplication using message IDs, which can get you close to exactly-once in some workflows.
- NATS JetStream provides tools for tracking delivery and managing state, but exactly-once depends on how you build your consumer logic.
- Estuary Flow uses a transactional materialization protocol with recovery logs, coordinated checkpoints, and support for idempotent apply. It provides exactly-once materialization without requiring users to manually manage transactions or retries.
Now let’s take a deeper look at how Estuary Flow does this, and why its approach stands out.
How Estuary Flow Achieves Exactly-Once Materialization
Estuary Flow is designed to handle exactly-once delivery from the ground up. It does this by combining persistent logs, coordinated checkpoints, and smart connectors that understand how to commit changes safely. Instead of relying on just one trick, Flow uses a complete protocol to keep everything in sync, even across failures.
Here’s how it works.
It all starts with the materialization protocol.
When Flow sends data to an external system, like a data warehouse, database, or cloud store, it doesn’t just push records blindly. It runs a structured process over a long-lived connection between the Flow runtime and a connector driver.
While connected, each transaction happens in three main steps: Acknowledge, Load, and Store.
- In the Acknowledge phase, Flow and the connector confirm that the last transaction was successfully committed. This prevents data from being replayed accidentally.
- In the Load phase, Flow fetches materialized documents and reduces new documents into them based on key.
- In the Store phase, loaded documents are committed to the destination system while Flow commits a checkpoint to its recovery log.
These steps will repeat as long as there’s new data to materialize.
The important part is that Flow treats the view state (what’s written to the destination) and the checkpoint (where it left off in the stream) as a single unit. They either both succeed, or neither does.
That’s the foundation of exactly-once delivery.
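The all-or-nothing pairing of view state and checkpoint can be sketched with SQLite (a simplified illustration; the table names and schema here are invented, not Flow’s actual internals):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE view_state (key TEXT PRIMARY KEY, value TEXT)")
conn.execute("CREATE TABLE flow_checkpoint (id INTEGER PRIMARY KEY, position INTEGER)")
conn.execute("INSERT INTO flow_checkpoint VALUES (1, 0)")
conn.commit()

def materialize(conn, documents, new_position):
    """Commit the materialized documents and the checkpoint atomically."""
    with conn:  # one transaction: both writes succeed or neither does
        for key, value in documents:
            conn.execute(
                "INSERT INTO view_state VALUES (?, ?) "
                "ON CONFLICT(key) DO UPDATE SET value = excluded.value",
                (key, value),
            )
        conn.execute(
            "UPDATE flow_checkpoint SET position = ? WHERE id = 1",
            (new_position,),
        )

materialize(conn, [("user-1", "active")], new_position=10)
```

If the process dies mid-transaction, the database rolls back both the data and the checkpoint, so a restart replays from a consistent point.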
Three patterns for real-world systems
Different destinations have different capabilities, so Flow supports multiple patterns to maintain exactly-once guarantees.
1. Remote store is authoritative
In this pattern, the external system, like a SQL database, is in charge of managing both the data and the Flow checkpoint. Flow uses database transactions to update both together. In this case, the remote store must implement a method called fencing to prevent duplicate processes from interfering with each other. This avoids situations where two instances try to write the same data at the same time.
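Fencing can be illustrated with a toy epoch counter (plain Python, names invented): opening a new session invalidates every older writer.

```python
class FencedStore:
    """Reject writes from any writer whose fencing token is stale."""
    def __init__(self):
        self.current_epoch = 0
        self.data = {}

    def open_session(self):
        self.current_epoch += 1   # fence out all earlier writers
        return self.current_epoch

    def write(self, epoch, key, value):
        if epoch < self.current_epoch:
            raise RuntimeError("fenced: a newer writer has taken over")
        self.data[key] = value

store = FencedStore()
old = store.open_session()   # original task instance
new = store.open_session()   # replacement instance after a restart

store.write(new, "order-1", "shipped")  # the newer writer succeeds
```

If the old, "zombie" instance later wakes up and tries `store.write(old, ...)`, the store rejects it, so stale data never overwrites the new writer's output.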
2. Recovery log with non-transactional store
Some destination systems are non-transactional, such as APIs or key/value stores. In these cases, Flow’s internal recovery log takes over. It tracks checkpoints, implements fencing, and uses retries to make sure data is eventually stored. While this isn’t always fully exactly-once, it still protects against data loss and offers strong guarantees for last-write-wins use cases.
3. Recovery log with idempotent apply
This is a middle ground for stores that don’t have full transaction support but support idempotent applies. Flow writes changes to a stable storage location, like a unique file or blob, and then alerts the store of the file. The store must handle idempotency, applying the file just once. That way, exactly-once is still possible, even without full transaction guarantees.
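A minimal sketch of idempotent apply (illustrative Python, not a real connector): the destination tracks which uniquely named files it has already applied, so a retried notification is a no-op.

```python
applied_files = set()
store = {}

def apply_file(name, records):
    """The destination applies each uniquely named file at most once."""
    if name in applied_files:
        return  # already applied: a retry changes nothing
    applied_files.add(name)
    store.update(records)

# A retry announces the same file twice; it is applied only once.
apply_file("txn-0042.json", {"user-1": "active"})
apply_file("txn-0042.json", {"user-1": "active"})
```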
Delta updates for push-only systems
Some destinations, like APIs or event buses, don’t allow reading existing data. For those cases, Flow uses a delta updates mode. Instead of writing the fully reduced state, it only reduces documents within the current transaction. This lets the downstream system reassemble the full picture later.
Delta updates also match how other systems like Kafka Connect behave. For many use cases, especially with last-write-wins logic, this works just fine.
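Under delta updates, documents that share a key are reduced only within the current transaction, rather than against the full materialized state. A last-write-wins version of that reduction might look like this (illustrative Python sketch):

```python
def reduce_transaction(documents):
    """Combine documents sharing a key within one transaction,
    keeping only the last value per key (last-write-wins)."""
    reduced = {}
    for key, value in documents:
        reduced[key] = value
    return list(reduced.items())

batch = [("user-1", "pending"), ("user-2", "active"), ("user-1", "active")]
deltas = reduce_transaction(batch)  # one delta per key, not full state
```

The downstream system receives one delta per key per transaction and is responsible for folding those deltas into its own view of the full state.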
Why it matters
Most systems talk about exactly-once delivery as a goal. Estuary Flow builds it into the core of how data moves. With transactional materializations, smart recovery logs, and flexible strategies for different destinations, Flow gives you exactly-once guarantees without the usual complexity.
Estuary Flow Is Built on Gazette
Estuary Flow’s exactly-once guarantees are built on top of Gazette, a cloud-native, log-based streaming system. Gazette handles low-level details like message sequencing with UUIDs, transactional commit checkpoints, and fencing to prevent zombie processes from writing outdated data.
Flow builds on this foundation by adding real-time connectors, schema enforcement, and end-to-end materialization logic. The result is a platform that brings exactly-once semantics from message capture all the way to your destination systems, without extra glue code or coordination layers.
Next, we’ll take a look at some of the common pitfalls and practical limitations you should be aware of.
Practical Limitations and Common Pitfalls
Even the best systems can't promise exactly-once delivery in every situation. There are always trade-offs. Knowing where the edges are can help you avoid surprises and build more resilient data pipelines.
It’s often “exactly-once under normal conditions”
Many systems, including Estuary Flow, offer exactly-once behavior when things are working as expected. But no system can fully prevent all edge cases in the face of power loss, split-brain failures, or disk corruption. It’s important to understand the scope of guarantees and test how your system behaves during restarts, crashes, and retries.
Confusing delivery with processing
One of the most common misunderstandings is assuming that just because a message was delivered once, it was also processed once. Or vice versa. But these are separate concerns. Delivery is about transport. Processing is about what you do with the message. If either side breaks, you can still end up with duplication or missing data.
Idempotency bugs
Even if your system supports retries or deduplication, your application logic might not. For example, a service that creates a record every time it sees a message could accidentally insert the same thing twice. Small mistakes in logic, like checking for duplicates after an action instead of before, can quietly break your exactly-once guarantees.
Performance trade-offs
Getting to exactly-once often means giving something up. Tracking checkpoints, handling retries, and verifying commits all add overhead. You might notice higher latency or increased resource usage, especially under load. That’s why some systems default to at-least-once unless you explicitly configure them otherwise.
Poor error handling
Some systems only offer exactly-once if you implement your side correctly. If your consumer crashes before storing a checkpoint, or if you forget to flush buffered data, things can fall out of sync. That’s why it's important to test not just happy paths, but also failure modes like timeouts, disconnects, and out-of-order messages.
Limitations in destination systems
Estuary implements several patterns to provide exactly-once guarantees in as many scenarios as possible. But if your destination system itself doesn’t include strong transaction guarantees or doesn’t handle idempotency well, there’s only so much Estuary can do to shore up those gaps. Connectors with at-least-once semantics rather than exactly-once will be marked explicitly in Estuary’s documentation.
Exactly-once is powerful, but it’s not a silver bullet. To get it right, every part of your system — the broker, the consumer, the destination, and your logic — has to work together.
Next, we’ll talk about when exactly-once is truly necessary and when it might be more than you actually need.
When Do You Really Need Exactly-Once?
Exactly-once delivery sounds great, but in many situations, it's more than you actually need. Before adding the extra complexity, it's worth asking if your use case truly requires it.
When it matters most
There are some cases where exactly-once delivery is not just nice to have. It’s essential. These include:
- Financial transactions
You don’t want a customer to be charged twice or not at all. Payments, transfers, and ledger updates need strict guarantees.
- Inventory and order systems
If you ship the same product twice or forget to update inventory, it can throw off fulfillment, lead to customer issues, or mess up reporting.
- Real-time bidding or auctions
If the same bid is processed twice, or missed entirely, it can directly impact revenue or fairness.
- Metering and usage-based billing
Overcounting or undercounting usage can either cost you money or frustrate customers. Either way, it damages trust.
In cases like these, it’s worth the engineering effort to make sure each message is processed exactly once and no side effect is repeated.
When other guarantees are good enough
On the flip side, many applications work just fine with at-least-once or even at-most-once semantics. For example:
- Logs and analytics
If a few log lines are duplicated or dropped, no one notices. Dashboards still show the big picture.
- Email and notifications
Sending the same alert twice is better than not sending it at all. At-least-once is perfectly acceptable.
- Monitoring and metrics
Most metrics systems already deduplicate or smooth out anomalies over time.
In these cases, chasing exactly-once might not be worth it. Instead, focus on making your pipeline reliable, observable, and able to recover gracefully when something goes wrong.
Up next, we’ll cover best practices that can help you design safer systems, even when exactly-once isn’t guaranteed.
Engineering Strategies and Best Practices
Exactly-once delivery is hard, but you can still build systems that behave reliably and avoid surprises. Even if the underlying infrastructure doesn’t guarantee exactly-once, these strategies can help you get close.
Make your processing logic idempotent
Idempotency is one of the most powerful tools in your toolkit. It means your system can safely process the same message more than once without causing duplicate effects.
Examples:
- Set a status to "paid" instead of incrementing a balance.
- Use upserts instead of inserts when writing to a database.
- Check if a message was already processed before taking action.
If your logic is idempotent, even at-least-once delivery becomes much safer to work with.
Use unique identifiers
Give every event, transaction, or message a unique ID. This makes it easy to detect and skip duplicates. Many systems, like Estuary Flow, use these IDs behind the scenes to ensure safe processing.
If you’re building your own logic, consider storing a history of processed IDs in a database or cache, at least for a short time window.
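One way to keep that history bounded is a time-windowed ID store (an illustrative Python sketch; a real system would use a database or cache with TTLs):

```python
class ProcessedIds:
    """Remember processed message IDs for a limited time window."""
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.seen = {}  # message_id -> time first processed

    def check_and_record(self, message_id, now):
        # Expire entries older than the window before checking.
        self.seen = {m: t for m, t in self.seen.items() if now - t < self.ttl}
        if message_id in self.seen:
            return False               # duplicate within the window
        self.seen[message_id] = now
        return True

ids = ProcessedIds(ttl_seconds=600)
first = ids.check_and_record("msg-1", now=0)    # process it
dup = ids.check_and_record("msg-1", now=60)     # duplicate: skip
late = ids.check_and_record("msg-1", now=700)   # window expired: treated as new
```

The trade-off is explicit: a duplicate arriving after the window slips through, so the window size bounds both your storage cost and your deduplication guarantee.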
Choose systems that support strong guarantees
Not every part of your stack needs to support exactly-once, but it helps to use components that get you closer. Look for systems that offer:
- Checkpointing
- Idempotent operations
- Transaction support
- Built-in deduplication or fencing
Estuary Flow, for example, handles much of this for you by coordinating delivery, checkpointing, and materialization in one flow.
Plan for failures
Design your system to recover cleanly. Expect retries, timeouts, restarts, and out-of-order messages.
- Store checkpoints regularly.
- Make retries safe.
- Monitor your system for missed or duplicated messages.
- Add compensating logic if needed, such as reversing a charge or correcting a record.
Keep observability in mind
Logging, metrics, and traces are essential. If something goes wrong, you need to know exactly where and when it happened. Good observability can save hours of debugging and prevent silent data corruption.
These best practices don’t guarantee perfection, but they give you a solid foundation. Combined with a system like Estuary Flow, they help you get close to exactly-once behavior without rebuilding everything from scratch.
Next, we’ll look at how some popular systems handle delivery guarantees in the real world.
Case Studies and Industry Examples
Many well-known data systems claim different delivery guarantees. Some offer full exactly-once support, while others focus on simpler models with tools to help you build reliability on your own.
Here’s how a few of them approach the problem.
Apache Kafka
Kafka supports exactly-once semantics, but only when producers, brokers, and consumers are configured correctly. It uses transactional APIs to bundle message writes and offset commits together. This prevents partial processing and duplicate deliveries, but it requires extra coordination and comes with some performance cost.
Kafka works best when all components are built to participate in the transaction. If not, it falls back to at-least-once delivery.
Google Pub/Sub
By default, Pub/Sub offers at-least-once delivery. Messages might be delivered more than once, especially if acknowledgments are delayed. To help reduce duplicates, you can attach a message ID and deduplicate on the consumer side within a short window.
This gives you something close to exactly-once in practice, but only if your consumer logic is designed for it.
NATS JetStream
JetStream provides more control over message delivery with support for stream replay, acknowledgments, and message deduplication. Like Pub/Sub, it leans toward at-least-once by default. You can get closer to exactly-once by building your own idempotent consumers.
Estuary Flow
Estuary Flow provides exactly-once materialization, meaning data is delivered and committed to the destination only once, even across failures or retries. It does this using:
- A persistent recovery log
- Coordinated checkpoints and store operations
- Fencing and idempotent apply logic
- Support for both transactional and non-transactional destinations
Because Flow handles message delivery, change data capture, and materialization in a single system, it reduces the number of places things can go wrong. You don’t have to build your own fencing or deduplication layer. It’s already part of the protocol.
Each of these tools has strengths and trade-offs. The right choice depends on your needs, your scale, and how much complexity you're willing to take on. In the final section, we’ll wrap things up and help you decide how to move forward.
Conclusion
Exactly-once delivery is one of the most desirable guarantees in data engineering. It helps protect systems from overcharging customers, duplicating shipments, or losing key business insights. But as you’ve seen, getting there isn’t easy.
It requires more than just reliable message transport. You need coordination between delivery and processing, careful handling of retries and duplicates, and logic that doesn’t break under pressure. Even then, no system can promise exactly-once behavior in every failure scenario.
That’s why it’s important to step back and ask what level of reliability you actually need. Sometimes at-least-once is good enough. Other times, like in financial systems or real-time bidding, only exactly-once will do.
If you do need strong guarantees, you don’t have to build everything from scratch. Estuary Flow gives you exactly-once materialization out of the box, handling the hard parts like checkpoints, retries, and state coordination for you.
The goal isn’t to chase perfection. It’s to build systems that are predictable, testable, and safe, even when things go wrong.
Now you know what exactly-once really means, why it’s hard, and how to approach it with confidence.

About the author
Team Estuary is a group of engineers, product experts, and data strategists building the future of real-time and batch data integration. We write to share technical insights, industry trends, and practical guides.
