
Estuary’s focus on in-house crafted connectors isn’t an accident.
It’s not about keeping secrets; we’re not a black-box factory, and connector source code is publicly available for anyone to review. It’s about taking ownership: starting with a high-quality base product and refining from there.
Integrations are designed to work seamlessly with Estuary, providing standard customization options and converting data to standard formats with as little waste as possible. And connectors receive continuous updates to keep up with API changes or to fine-tune performance.
Our MongoDB capture connector recently received one of these upgrades: while the connector reliably got the job done, it could fall behind in high-volume enterprise use cases. That could be especially detrimental for real-time pipelines that relied on the connector’s support for MongoDB’s change streams. If the connector couldn’t keep up with incoming data, downstream systems could experience delays.
For real-time native applications, even small slowdowns have an outsized impact. Consider the route change notification for a shipment that arrives just after a driver misses the turnoff. Or a triage system that doesn't capture the latest developments in its priority calculations.
It was definitely time for some optimization work.
On the case was Mahdi Dibaiee. Based in Dublin, Ireland when not on adventures around the world, Mahdi has been a Senior Software Engineer with Estuary for nearly four years. Having worked on data planes, Estuary’s flowctl CLI, and various connectors, his deep knowledge of the platform lets him flexibly pick up whatever tasks have current top priority.
This is a behind-the-scenes look at how he analyzed the existing implementation’s limitations, researched solutions, and ended up with double the speed.
The Problem with Small Documents
“Make this integration faster,” while a laudable goal, isn’t much to go on. Why were captures falling behind? What was the expected throughput rate? And how could we find specific areas to improve?
First, start with a baseline.
The MongoDB capture connector tended towards a throughput rate of 34 MB/s when working with standard-sized documents, such as those around 20 KB apiece.
To test how the connector would react under different circumstances, Mahdi tried it out against a stream of much smaller documents, each around 250 bytes.
Something concerning happened when the connector processed these small documents: the capture’s ingestion rate dropped to a meager 6 MB/s. While this “tiny document” use case would be unlikely in the wild, 6 MB/s was still far too slow.
It also uncovered a possible path forward.
“This told us that we had a large overhead-per-document,” Mahdi explained, which resulted in the abysmal slowdown.
Essentially, all document processing would include some overhead. Changing the size of processed documents acted as a lever to quickly check just how much the overhead impacted performance: smaller documents with the same amount of overhead per document led to more overall time spent on the overhead rather than on making progress.
If he could find ways to reduce that overhead, all pipelines should speed up, not just ones with tiny documents.
But where exactly did that overhead come from? To tune the MongoDB capture’s performance, some digging would be required.
The Reason Behind the Bottleneck
To get a picture of the systems involved, Mahdi profiled a particular MongoDB capture that was struggling to keep up with its load.
First up was to rule out a couple obvious answers. He checked CPU load and memory pressure on both MongoDB’s side and the capture connector’s side. Neither indicated any issues.
Next, Mahdi wanted to see where Estuary spent the most time when ingesting data from MongoDB. He set up a detailed tracing view, dividing up the time for each data fetch and marking out network and CPU activity.
The trace exposed two areas of note: a suspiciously empty gap around the connector’s call to fetch more documents, and a suspiciously long stretch of work right after it. In total, Estuary spent around two seconds on each batch of fetched data, which isn’t quite the millisecond latency Estuary aims for.
So, what was actually happening?
The first 600 ms of each cycle corresponded to the data fetch itself. When one batch of data finished processing, the connector sent out a network request for more, then started working on the new batch once it arrived.
Because of this synchronous mode of operation, the connector essentially sat around waiting for over half a second every time it checked for new data. In an end-to-end real-time system, those milliseconds add up. Not to mention the cumulative idle time: the CPU was doing nothing much for nearly a third of each cycle.
There, then, was an obvious bottleneck, but the activity following the fetch was also curious. The remaining 1.4 seconds in the cycle were spent processing documents.
By itself, emitting documents and checkpoints to Estuary shouldn’t take that long. But there was one more step in the processing phase that might: decoding MongoDB’s BSON documents in the first place.
With the possibility of optimizing document processing now in the mix, there were two avenues to improve the connector’s performance.
Why not implement both?
From Go to Rust: An Expedient Solution
The CPU’s idle time was perhaps the more straightforward fix. Mahdi quickly identified that making the connector slightly more asynchronous would keep the CPU busy and shave those 600 ms off each batch.
To do so, he modified Estuary’s MongoDB connector to pre-fetch the next batch while still processing the current one. To preserve ordering and keep memory usage bounded, he limited the number of pre-fetched batches to four. With a maximum of 16 MB per MongoDB cursor batch, this caps the connector’s extra memory consumption at 64 MB.
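The connector itself is written in Go, but the bounded-prefetch pattern is easy to sketch in Rust with nothing beyond the standard library. In this minimal, hypothetical version, fetch_next_batch and process_batch are stand-ins for the MongoDB cursor call and Estuary’s document handling:

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Stand-in for one cursor round trip to MongoDB (up to 16 MB of documents).
/// In the real connector this is where the ~600 ms network wait happens.
fn fetch_next_batch(batch_index: usize) -> Option<Vec<u8>> {
    if batch_index < 10 {
        Some(vec![0u8; 1024])
    } else {
        None // no more data
    }
}

/// Stand-in for decoding the batch and emitting documents and checkpoints.
fn process_batch(batch: Vec<u8>) {
    let _ = batch.len();
}

fn main() {
    // A bounded channel means at most four batches sit in memory at once
    // (4 x 16 MB = 64 MB worst case) while fetching overlaps processing.
    let (tx, rx) = sync_channel::<Vec<u8>>(4);

    // Producer: keeps fetching ahead, blocking whenever four batches are queued.
    let producer = thread::spawn(move || {
        let mut i = 0;
        while let Some(batch) = fetch_next_batch(i) {
            if tx.send(batch).is_err() {
                break; // consumer is gone; stop fetching
            }
            i += 1;
        }
        // Dropping tx closes the channel, which ends the consumer's loop.
    });

    // Consumer: processes batches in order as they arrive, instead of idling
    // through a network round trip before each one.
    for batch in rx {
        process_batch(batch);
    }

    producer.join().unwrap();
}
```

The bounded channel is what keeps the trade-off honest: the producer can run ahead of the consumer, but never by more than four batches.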
This change alone would provide a welcome performance boost, but there was still the unsatisfyingly slow document processing time to contend with. And it was a trickier problem.
To standardize data coming from and going to a variety of different systems using a variety of different document formats and data types, Estuary translates everything to JSON as an intermediary. This makes it simple to mix and match data sources and destinations, or plug in a new connector: each connector only needs to handle its specific system and translation to or from the shared language.
MongoDB documents come in BSON, or Binary JSON: a binary-encoded cousin of JSON that generally makes for efficient storage and retrieval. It also includes a handful of additional data types, such as datetimes and more specific numeric types.
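As a toy illustration (not connector code), here is roughly what those extra types look like when a BSON document built with Rust’s bson crate is rendered as canonical Extended JSON:

```rust
use bson::{doc, Bson, DateTime};

fn main() {
    // BSON carries types that plain JSON lacks, such as native datetimes and
    // distinct 32-bit and 64-bit integers.
    let document = doc! {
        "shipped_at": DateTime::now(),
        "quantity": 3_i64,
    };

    // Canonical Extended JSON spells those types out explicitly, producing
    // something like:
    // {"shipped_at":{"$date":{"$numberLong":"..."}},"quantity":{"$numberLong":"3"}}
    println!("{}", Bson::Document(document).into_canonical_extjson());
}
```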
Converting that to JSON sounds like it should be reasonably simple, but Mahdi found that Estuary’s MongoDB connector spent a lot of time decoding documents with Go’s bson package. On reflection, perhaps this wasn’t much of a surprise: Go’s reflect package, which inspects and constructs values of arbitrary type at runtime, is notoriously slow, and the bson package relied heavily on it.
Looking for alternatives, he first ran some benchmarks against Rust’s corresponding bson crate. The results were decisive: the Rust version decoded documents 2x faster than Go’s.
Mahdi’s research also uncovered another option. Rust’s most popular serialization/deserialization crate, serde, has a companion crate called serde-transcode. This transcoder can convert documents from one format to another without building an intermediate in-memory representation, cutting out unnecessary processing steps. With this, the BSON-to-JSON conversion could be 3x faster than the existing Go implementation.
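In its simplest form, the idea looks something like the sketch below: a minimal example that assumes the document has already been parsed into a bson::Bson value, which is not how Estuary’s production path is structured:

```rust
use bson::{doc, Bson};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // A BSON document as it might arrive from a change stream.
    let document = doc! { "op": "insert", "qty": 42_i64 };

    // serde_transcode drives the BSON deserializer and the JSON serializer
    // against each other directly, so no intermediate serde_json::Value
    // (or other in-memory tree) is ever built along the way.
    let mut out = Vec::new();
    let mut json_ser = serde_json::Serializer::new(&mut out);
    let bson_de = bson::Deserializer::new(Bson::Document(document));
    serde_transcode::transcode(bson_de, &mut json_ser)?;

    // Prints something like: {"op":"insert","qty":42}
    println!("{}", String::from_utf8(out)?);
    Ok(())
}
```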
Of course, serde-transcode couldn’t simply be swapped in as-is. Mahdi wrapped the out-of-the-box serializer in custom logic, extending the JSON conversion and sanitizing the data. The resulting implementation fit Estuary’s specific needs while retaining the 3x performance boost.
Together, these changes would address both bottlenecks and give the MongoDB capture connector a serious tune-up.
End Result: Supercharged MongoDB Captures
One question remained: would these improvements hold up across various scenarios? Thorough testing commenced.
Mahdi started where it all began: the tiny-documents scenario. He ran the MongoDB connector against a stream of small 250-byte documents, first on the main branch and then on the improved branch. The measly ~6 MB/s throughput rose to around 17.5 MB/s, nearly tripling throughput for the small-documents case.
Of course, this scenario was only ever meant as a test and example, a way to gauge how much overhead the connector incurred as it processed documents.
Mahdi therefore reran the test, this time using 20 KB documents, a more standard size. The original 34 MB/s rate jumped to 57 MB/s, almost doubling throughput.
This rate was much more reasonable: sustained, 57 MB/s works out to roughly 200 GB of data ingested per hour, enough for the Estuary connector to keep up with higher-volume use cases.
In practical terms, this means:
- Huge initial databases would get backfilled in half the time
- The platform would be able to handle twice as much data in continuous CDC mode
- Spikes in activity would be absorbed more easily: instead of choking the pipeline, real-time events would stay real-time
After review and approval, Mahdi rolled out the changes to a select set of users first so he could closely monitor affected pipelines. He would be ready to quickly revert or revise as needed if any problems arose.
With so many use cases and interactions in play, one minor issue did rear its head: Rust and Go handle invalid UTF-8 characters differently. With a little more customization, Mahdi made the connector just as lenient about invalid characters as it had been before.
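To illustrate the difference in a stripped-down, hypothetical way (this is not the connector’s actual fix): Go strings can carry invalid UTF-8 bytes, and Go’s encoding/json quietly replaces them with the Unicode replacement character on output, whereas Rust strings must be valid UTF-8 up front, so lenient behavior has to be opted into explicitly:

```rust
fn main() {
    // Raw string bytes as they might appear inside a document: "caf" followed
    // by a lone 0xE9 byte, which is not valid UTF-8 on its own.
    let raw: &[u8] = b"caf\xE9";

    // Strict handling: Rust refuses to treat invalid UTF-8 as a string.
    assert!(std::str::from_utf8(raw).is_err());

    // Lenient handling, similar in spirit to Go's encoding/json behavior:
    // invalid sequences become the U+FFFD replacement character.
    let lenient = String::from_utf8_lossy(raw);
    println!("{}", lenient); // prints "caf�"
}
```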
Other than that, the rollout was smooth sailing, with capture throughput ticking upwards across the board.
So if you recently noticed your MongoDB capture speeding up: now you know.
What’s Next?
While 200 GB an hour is a decent clip, Mahdi noted that there’s still room for improvement: the connector is now largely CPU-bound. And, after all, efficiency is one of those goals without a finish line.
For now, though, there are new challenges to face.
To test out the capture connector’s speed yourself, try it out in Estuary. Or set up a call to discuss how the connector could fit into your particular use case.
Or if you’re simply interested in switching to Rust for faster BSON decoding in your own code, check out Mahdi’s repo on benchmarking Rust and Go or his work in Estuary’s source code.

About the authors
Emily is a software engineer and technical content creator with an interest in developer education. She has experience across Developer Relations roles from her FinTech background and is always learning something new.
Mahdi is a software engineer at Estuary with a focus on integrations, working on multiple in-house connectors including Oracle, MongoDB and Salesforce. He has a diverse background including data engineering, devops, full-stack, functional programming and ML.