Estuary

7 min read

Last updated: July 6, 2026

It's Never Been Easier to Move Data

Estuary provides agent skills. Are these skills sufficient for an AI agent to set up real-time replication? I decided to find out.

Mark Van de Wiel


Historically, training has been a barrier to the use of technology. How do I use a technology to implement an end-to-end solution? What terminology does a tool use? What are the concepts? How do I monitor, troubleshoot, or tune an implementation?

Technology, especially complex, business-oriented technology, can be intimidating. It may have many options. The terminology may not be intuitive. Ease of use may have received limited attention.

You may choose to avoid technology that is intimidating, or postpone its use until somebody teaches or shows you how to use it.

Enter skills.

What is a SKILL?

A skill is a set of instructions and (references to) resources for an AI agent (e.g. Claude Code, Codex, Claude Cowork, Gemini CLI) to perform a task. Think of the skill as the cheat sheet the agent uses to know how to perform a task. For example, you may use a skill to create a presentation with a prescribed set of sections, your organization's look and feel, stored in the correct location, and following an agreed naming convention.

A skill is documented in a human-readable markdown (.md) file. To promote the use of agents, AI companies (e.g. Anthropic, OpenAI, Google) provide skills free of charge on GitHub. Several skill marketplaces have sprung up. And companies are releasing skills to make it easier for users to adopt their technology. Estuary is one of them.

Moving data

Throughout a career of more than two decades I have used a variety of technologies to move and transform data, in batch and close to real-time, using hand-written scripts (e.g. shell scripts, Python, Oracle PL/SQL, JavaScript) and with the help of tools (e.g. Talend, GoldenGate, HVR, Fivetran). Every approach comes with a learning curve, and every technology introduces new concepts and terminology.

However, the goal is always the same: move data from a source to a destination.

Real-time database replication using Estuary's Agent Skills

I was curious to find out whether I could use Estuary's Agent Skills to keep two databases in sync. This is a scenario I have run countless times throughout my career.

Below are the tools I used, all in a Linux environment:

postgres:latest from hub.docker.com
Docker and docker compose - below is my compose file:

services:
  postgres-source:
    image: postgres:latest
    container_name: pgsrc
    ports:
      - "5433:5432"
    environment:
      - POSTGRES_PASSWORD=<some hard to guess password>
      - POSTGRES_DB=pgsource
      - POSTGRES_HOST_AUTH_METHOD=trust
    command: [ "postgres", "-c", "wal_level=logical" ]
  postgres-target:
    image: postgres:latest
    container_name: pgtgt
    ports:
      - "5434:5432"
    environment:
      - POSTGRES_PASSWORD=<some hard to guess password>
      - POSTGRES_DB=pgtarget
      - POSTGRES_HOST_AUTH_METHOD=trust

ngrok to reach the databases on my system
HammerDB to generate a TPCC schema and workload
Claude Code with the Estuary plugins
flowctl, installed based on the instructions here; with hindsight, I assume installing flowctl up front was redundant, as the Estuary Agent Skill bundle includes flowctl setup.

With this toolset I configured and ran real-time data movement between my PostgreSQL databases in less than an hour, with only a few plain-English instructions for Claude Code.

❯ Configure replication from PostgreSQL all tables in database tpcc with target PostgreSQL into a new database.
                                              
● I'll help you set up PostgreSQL-to-PostgreSQL replication. This needs two pieces: a capture (PostgreSQL CDC from tpcc) and a materialization (to your new PostgreSQL database). Let me start with the capture skill.
 
● Skill(estuary-captures:capture-postgres-create)
 ⎿  Successfully loaded skill
 
● Let me check that flowctl is installed and authenticated before gathering details.
 
● Bash(which flowctl && flowctl --version 2>&1 | head -5)
 ⎿  /usr/local/bin/flowctl
    flowctl v0.6.9
...

Along the way Claude asked a few questions about how I wanted replication to work. Claude Code decided to validate the data based on row counts, and it fixed an issue that caused the data integration (materialization in Estuary terms) to fail.

● Now I understand the exact failure. Crucially, every validation error pointed at flow://inferred-schema#/... — the relaxed-write-schema part passed the NULL document. The relaxed-write-schema already makes all fields nullable/optional; it's only the strict, lagging inferred schema that rejects new shapes.

So the robust fix is to drop the flow://inferred-schema dependency from each read schema, keeping only relaxed-write-schema. That tolerates nulls and new/absent columns permanently, while the write schema still gives properly typed columns in the target. Let me apply it to all 9 collections.

...
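The fix Claude describes amounts to a small edit to each collection's read schema. Below is a minimal Python sketch of that edit. It assumes the read schema is an `allOf` list of `$ref` entries, as the transcript suggests. The `readSchema` layout, the example spec, and the `drop_inferred_schema` helper are illustrative assumptions, not taken from the Estuary skill or from my actual catalog.

```python
import copy

INFERRED = "flow://inferred-schema"
RELAXED = "flow://relaxed-write-schema"


def drop_inferred_schema(collection_spec):
    """Return a copy of the spec whose read schema no longer references the inferred schema."""
    spec = copy.deepcopy(collection_spec)
    read_schema = spec.get("readSchema", {})
    refs = read_schema.get("allOf", [])
    # Keep every entry except the $ref that points at the inferred schema.
    kept = [entry for entry in refs if entry.get("$ref") != INFERRED]
    if kept != refs:
        read_schema["allOf"] = kept
    return spec


# Hypothetical collection spec, shaped after the two references named in the transcript.
example = {
    "writeSchema": {"type": "object"},
    "readSchema": {"allOf": [{"$ref": RELAXED}, {"$ref": INFERRED}]},
    "key": ["/id"],
}

fixed = drop_inferred_schema(example)
print(fixed["readSchema"])
```

The helper works on a copy, so the original spec stays available for comparison before anything is published.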

The end result: data was flowing close to real-time, with source and destination staying in sync.

📊 PostgreSQL → PostgreSQL replication statistics

Volume (last 60 min)

 ┌───────────────────────────────────────────┬───────────┬──────────┐
 │                                           │ Documents │   Data   │
 ├───────────────────────────────────────────┼───────────┼──────────┤
 │ Captured (source → Estuary)               │ 2,344,061 │ 1,059 MB │
 ├───────────────────────────────────────────┼───────────┼──────────┤
 │ Materialized (Estuary → target, last 20m) │ 1,763,045 │ 755 MB   │
 └───────────────────────────────────────────┴───────────┴──────────┘

Top tables by volume captured
 ┌────────────┬───────────┬────────┐
 │   table    │   docs    │  data  │
 ├────────────┼───────────┼────────┤
 │ order_line │ 1,221,577 │ 442 MB │
 ├────────────┼───────────┼────────┤
 │ stock      │ 561,389   │ 376 MB │
 ├────────────┼───────────┼────────┤
 │ customer   │ 121,591   │ 119 MB │
 ├────────────┼───────────┼────────┤
 │ orders     │ 122,125   │ 36 MB  │
 ├────────────┼───────────┼────────┤
 │ new_order  │ 101,125   │ 22 MB  │
 ├────────────┼───────────┼────────┤
 │ item       │ 100,000   │ 26 MB  │
 ├────────────┼───────────┼────────┤
 │ history    │ 76,450    │ 23 MB  │
 └────────────┴───────────┴────────┘
 
Throughput (materialization)

- Peak: 7,687 docs/sec
- Average under load: 4,739 docs/sec
- The connector adaptively scales batch size with load — transactions ramped from 4 docs up to 102,392 docs/txn to drain backlog quickly.
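The row-count validation that Claude chose can be sketched in a few lines. This is a minimal illustration, not the check Claude actually ran. It uses in-memory SQLite databases as stand-ins for the two PostgreSQL instances so that it runs anywhere, and the table names and row counts are made up for the example.

```python
import sqlite3


def row_counts(conn, tables):
    """Count rows per table. Table names are trusted here; do not pass user input."""
    counts = {}
    for table in tables:
        counts[table] = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    return counts


def count_mismatches(source, target):
    """Return {table: (source_count, target_count)} for tables whose counts differ."""
    return {
        table: (count, target.get(table))
        for table, count in source.items()
        if target.get(table) != count
    }


def make_db(rows_per_table):
    """Build a throwaway in-memory database with the requested number of rows per table."""
    conn = sqlite3.connect(":memory:")
    for table, n in rows_per_table.items():
        conn.execute(f'CREATE TABLE "{table}" (id INTEGER PRIMARY KEY)')
        conn.executemany(f'INSERT INTO "{table}" (id) VALUES (?)', [(i,) for i in range(n)])
    return conn


tables = ["orders", "new_order"]
source = make_db({"orders": 5, "new_order": 3})
target = make_db({"orders": 5, "new_order": 2})  # one row has not arrived yet

mismatches = count_mismatches(row_counts(source, tables), row_counts(target, tables))
print(mismatches)
```

Under a live workload the target trails the source slightly, so a mismatch at a single instant is not by itself an error. Matching row counts also say nothing about whether the column values match.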

I had not used Estuary before.

Without a doubt, I achieved my goal of replicating data between two databases in near real-time far more quickly with Estuary than with any of the methods I had used before. The provided Agent Skills accelerate the initial deployment and are particularly useful for troubleshooting. Along the way I started to learn Estuary concepts and terminology, and the experience sparked my interest in learning more about the technology.

Do agent skills solve all challenges?

Estuary provides a comprehensive set of skills. These cover the source (capture) and destination (materialize) sides, as well as operational (data movement health) tasks. As an Estuary novice, I found the logs, task health, and task stats skills particularly useful because, in my experience, they diagnose issues and look for solutions.

Skills drastically lower the barrier to getting started with a technology. Rather than watching YouTube videos and digging into documentation to struggle through an initial use case, you let the agent get you started by following the instructions in the skill. Using skills makes you more productive, so you get more work done. In fact, because an agent processes instructions much faster than we do, you should reach good results sooner with skills.

Estuary's Agent Skills solve the data movement challenge for a straightforward data movement scenario. However, the Agent Skills do not address, and cannot be expected to address, the pesky aspects of the real world that have historically slowed down setting up data movement. For these, additional skills would be required.

Let me illustrate.

My environment was a test environment without any sensitive data. I had full (master) access to the PostgreSQL instances to configure replication and access data.

In a corporate environment you may not have such conveniences. Who is in charge of the system? Can they enable the replication prerequisites? Who creates the account (user), and who grants its access privileges? Likewise on the target: what schema should the data be stored in? What account can you use? Does the account have sufficient privileges?
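To make the source-side questions concrete, here is a hedged sketch of the kind of statements a database owner might be asked to review for a PostgreSQL source. It uses standard PostgreSQL commands only. The role and publication names are hypothetical, and the exact prerequisites for Estuary's connector should be taken from its documentation rather than from this list.

```python
def capture_prerequisites(role, publication):
    """Build the statements a database owner would review and run on the source.

    Identifiers are interpolated as-is, so pass trusted names only.
    """
    return [
        "SHOW wal_level;  -- must report 'logical' for logical replication",
        f"CREATE ROLE {role} WITH LOGIN REPLICATION PASSWORD '<set a password>';",
        f"GRANT pg_read_all_data TO {role};  -- predefined role, PostgreSQL 14+",
        f"CREATE PUBLICATION {publication} FOR ALL TABLES;",
    ]


# Hypothetical names, chosen for illustration only.
statements = capture_prerequisites("replication_user", "replication_pub")
for statement in statements:
    print(statement)
```

Generating the statements as text, rather than executing them, fits the situation described above: the person setting up replication is often not the person allowed to run them.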

Organization-specific skills can address these topics. Will you be privileged to execute them?

For what a production-grade Postgres pipeline actually needs to handle once you move past a test setup (deletes, schema drift, replication-slot safety, and freshness), see Postgres CDC for agents.

Conclusion

Thanks to Estuary’s Agent Skills, the amount of time it takes to set up data movement is approaching the amount of time it takes to move the bytes, with or without an understanding of the technology.

However, data movement is often a subset of a larger data consolidation initiative with downstream use cases such as analytics and AI. Transformations and data cleansing may be required. Estuary supports transformations, but as of this writing no agent skills are available to help configure and deploy them.

Beyond data movement, your use case may require data processing and visualization that Estuary technology was not designed to solve. For this, you'll need more skills, provided by another vendor, your organization, or (how old-fashioned) just by leveraging your own.

Want to try it yourself?

Set up your first real-time pipeline with Estuary's Agent Skills and Claude Code.

Get started with Agent Skills

About the author

Mark Van de Wiel

Mark Van de Wiel was the Field CTO at Fivetran from 2022 through 2025, where he guided enterprise customers in optimizing their data integration strategies. Mark joined Fivetran in 2021 through the acquisition of HVR, where he led US operations and played a key role in scaling the business. His prior experience includes technical leadership roles at Oracle, Actian and GoldenGate Software. Today, Mark is building software using the latest and greatest that AI provides.
