
Postgres CDC for AI Agents: Keeping Agentic Workflows Grounded in Real-Time Data
Postgres CDC grounds AI agents in fresh, trustworthy business data by continuously streaming database changes, reducing stale context, protecting production Postgres, and making agentic workflows safer, more observable, and more reliable.

AI agents can now create complete databases, which is pretty impressive. But can these agents keep them grounded in your business's current state?
Databricks recently published some statistics that should make every data engineer sit up a little straighter: on Neon (the managed Postgres company Databricks acquired last year), AI agents now create 80% of all databases and 97% of database branches.
At first glance, that’s a huge number. But at a second glance, you might ask: how is that even possible?
Well, it means agents are becoming a new class of database users: creating development environments, branching production-like datasets, generating schemas, running queries, writing migrations, and wiring themselves into the data layer that was once touched mostly by application developers, DBAs, and data engineers.
Meanwhile, enterprise software is moving quickly from “AI assistant in the sidebar” toward task-specific agents embedded directly in business workflows. There are various studies (Gartner, Databricks, to name a few) that predict that most enterprise applications will include task-specific AI agents by the end of 2026.
So, yes, agents are coming for the database.
Still, spinning up a database branch is trivial nowadays. What’s harder is making sure the agent has fresh, trustworthy context when it decides what to do next.
If you're a data engineer, platform engineer, or part of an AI infrastructure team or analytics team building agents or RAG systems on top of operational Postgres data, this article is for you.
Without spoiling anything, here's the TL;DR: if agents act on operational business state, the pipeline between Postgres and the systems that feed those agents becomes part of AI correctness. That pipeline needs CDC, freshness visibility, safe replication-slot management, delete propagation, schema evolution, and observability.
The model is not usually the weakest part
When an agent makes a bad decision, people tend to look at the model first. You might ask:
- Was the prompt or retrieval bad?
- Did the LLM hallucinate?
- Did the tool call fail?
- Did we need a better eval?
Sometimes, the answer to these is yes, which will send you on a lengthy debugging journey. But in operational systems, the boring answer is often the correct one: the agent was simply looking at old data. We often see this in practice at Estuary when our users switch from batch to real-time data integration methods.
For example, consider a basic customer-support agent that can issue credits. You can verify that the underlying model is fine, the policy documents are up to date, and the tool permissions are correct. But, what happens if the customer upgraded their plan just 15 minutes ago, and the warehouse table the agent retrieves from is refreshed hourly? The agent sees the old plan, applies the wrong refund policy, and confidently explains the decision in perfect English.
As you can see, this isn't a model failure, but a data freshness failure. The model did exactly what it was asked to do with the context it was given.
That is why the agentic AI conversation keeps circling back to data architecture. By now, most enterprises have experimented with agents, but few have actually scaled them to deliver tangible value. They all run into similar blockers and revelations: agentic AI cannot run on stale data. Most agentic AI use cases are expected to require real-time, contextual, widely accessible data.
This is where Postgres enters the picture.
Why AI agent workflows keep coming back to Postgres
For a huge number of companies, the operational state of the business starts in Postgres.
Customers. Orders. Subscriptions. Payments. Tickets. Permissions. Inventory. Workflows. Product usage. Internal tools. Audit trails. Feature flags. Entitlements. These are the boring tables that actually decide what the business is allowed to do.
That matters because Postgres is not a niche database sitting off to the side of the AI stack. In Stack Overflow’s 2025 Developer Survey, PostgreSQL was the most-used database among all respondents with more than 55%, and among professional developers at 58%. That’s very similar to the numbers we see at Estuary.
Postgres is also being pulled directly into AI application architectures. pgvector gives Postgres open-source vector similarity search, with support for exact and approximate nearest-neighbor search, multiple vector formats, and distance functions such as L2, cosine, and inner product.
To top it all off, Neon’s MCP Server lets agents manage Postgres projects, branches, and databases with natural-language commands, run SQL, and make schema changes.
That's exactly where we're headed: Postgres is becoming a critical part of the agent infrastructure layer in addition to being the OLTP database behind the app.
Your agent’s world model is probably a batch job under the hood
Here's a common architecture hiding underneath a lot of “AI agent” demos:
This architecture made sense when the end user was a human looking at a dashboard. But times have changed.
If a revenue dashboard is 45 minutes late, that may be acceptable. Still not ideal, but acceptable. Someone can ask, “When was this last refreshed?” Someone can sanity-check a number, or they can decide not to act yet.
On the other hand, agents actually compress that timeline. They retrieve context, reason over it, call tools, and trigger actions inside the same workflow. The latency between “observed state” and “business action” becomes a lot more important.
This thought process culminates in the paradigm “operational now plus analytical history”, which is actually the heart of the problem.
As a best practice, an agent (or a person!) usually shouldn't hammer production Postgres for every decision. You don't want a fleet of semi-autonomous workflows issuing unpredictable queries against the database that runs your app. But you also don't want the agent making decisions based on stale warehouse tables or an embedding index that missed the last 10,000 rows of changes.
The missing layer is a reliable, real-time data plane between Postgres and the systems where agents retrieve, analyze, and act.
How Postgres CDC works for AI agent pipelines
The agent stack is full of new abstractions: tools, skills, MCP servers, eval harnesses, multi-agent supervisors, memory stores, vector indexes, graph RAG, workflow engines. Some of these will stick, but a lot of them will already be renamed twice by next quarter.
CDC is older, duller, and fundamentally more important.
For Postgres, CDC usually starts with the write-ahead log, or WAL. Every committed change goes through WAL. Logical replication lets an external consumer decode those changes into row-level events. A publication defines which tables should be published and the replication slots track how far the consumers have read.
At a high level:
A CDC event for an updated subscription looks roughly like this:
json{
"op": "u",
"schema": "public",
"table": "subscriptions",
"primary_key": { "id": "sub_123" },
"before": {
"plan": "trial",
"status": "active"
},
"after": {
"plan": "enterprise",
"status": "active"
},
"source": {
"lsn": "16/B374D848",
"commit_timestamp": "2026-05-21T13:04:32Z"
}
}That event can update a warehouse table, invalidate a cache entry, trigger an embedding refresh, publish into a Kafka-compatible consumer, or update a feature table. The same operational fact can move to several places without each destination polling Postgres independently.
That's actually the exact shape agent infrastructure looks for.
The key is that the agent doesn't need to query the primary DB for every fact; it needs the systems around it to stay synchronized with the source of truth.
pgvector: embeddings are only as fresh as the rows behind them
A lot of teams start with the simplest RAG architecture:
While it works, this diagram leaves out the critical part that often hurts in production: updates.
Suppose your product catalog lives in Postgres and you use embeddings to power an internal product support agent. A product row changes:
sqlUPDATE products
SET description = 'Now supports SOC 2 audit log export 🎉',
updated_at = now()
WHERE id = 42;If the product description changes, the embedding derived from that description becomes stale immediately. If the product was deleted, the vector should probably be deleted too. If the product moved from “available” to “deprecated,” the retriever shouldn't keep handing it to the agent as if nothing changed.
The correct architecture has to consider all of these cases. In our view, it looks something like this:
For some workloads, pgvector itself may be the right place to store embeddings because it lets you keep vectors with relational data and query them with SQL. For other workloads, a specialized vector system or search system may be better, but that choice is less important than the invariant:
If the source row changes, the agent-facing context has to change too.
Without sounding hand-wavy, this is where CDC becomes part of AI correctness. A stale embedding is an agent grounding issue. The same pattern shows up in real-time AI search pipelines, where search quality depends on continuously fresh data.
Common Postgres CDC failure modes for AI Agents
For AI agents, CDC reliability isn't just a pipeline concern. It directly affects whether the agent sees the current state of the business and can take safe, correct actions.
| Failure mode | Why it matters for AI agents | What to monitor |
|---|---|---|
| CDC lag | Agents may act on stale customer, product, subscription, or transaction data. | End-to-end freshness, source commit time, destination lag, and retrieval time. |
| Replication slot bloat | A stalled consumer can force Postgres to retain WAL and create disk pressure on the primary. | Retained WAL size, inactive slots, restart_lsn, confirmed_flush_lsn, and disk usage. |
| Missing deletes | Deleted or revoked records may still appear in downstream systems and agent context. | Delete propagation, tombstones, permission changes, and index cleanup. |
| Schema drift | Column changes can silently break pipelines or leave agents with incomplete context. | Schema changes, field mapping errors, type changes, and materialization failures. |
| Backfill pressure | Large snapshots can slow CDC catch-up and put extra load on Postgres. | Backfill progress, WAL growth, source load, destination throughput, and catch-up time. |
| Stale vector indexes | The source may be current, but the agent may still retrieve old embedded context. | Embedding refresh lag, changed-row processing, deleted-document removal, and retrieval freshness. |
These failure modes are why a production-grade Postgres CDC pipeline for AI agents needs more than basic change capture. It needs freshness visibility, replication slot monitoring, delete handling, schema evolution support, safe backfills, destination observability, and clear lineage from source data to agent context.
Replication slot management: risks every CDC pipeline must handle
There are several reasons why Postgres CDC is way more than “just turn on a connector.” The core mechanism that makes Postgres CDC reliable is also the mechanism that can hurt your primary DB if you operate it without paying attention.
A replication slot is Postgres saying, “I will not remove WAL that this consumer may still need.”
That's a great guarantee. Without it, a CDC consumer could go down, Postgres could recycle old WAL segments, and the consumer would have no way to resume without taking a new snapshot.
But the guarantee has teeth. PostgreSQL’s pg_replication_slots view exposes fields like restart_lsn, the oldest WAL address that may still be required by the slot, and confirmed_flush_lsn, the point up to which a logical consumer has confirmed receiving data. If the consumer stops advancing, restart_lsn stays behind, but Postgres actually keeps the WAL around.
And by default, the safety limit is simply not there. The Postgres docs for max_slot_wal_keep_size say that if the value is -1, the default, replication slots may retain an unlimited number of WAL files.
Read that again. Unlimited. Scary, right?
Actually, that's a reasonable default for preserving data, and avoiding data loss is the #1 priority for any DBA. But it's a dangerous default for an operational database if nobody is watching it.
The first time you see this failure mode, it doesn't look like an AI problem. It doesn't even look like a replication problem. It looks like a storage problem. While pg_wal is growing and the disk is filling up, the connector actually looks “quiet”. Meanwhile, the dashboard is stale and the warehouses are behind. The primary DB (and you) is getting closer to a bad day.
I've seen versions of this where the most important clue wasn't in the pipeline UI. It was an old transaction that shouldn't have been open anymore. In one customer story we like to use internally, a user left a transaction idle in their database for more than 16 days.
Sixteen days is long enough for everyone to forget the deployment, the test, the notebook, the migration session, or even the shell where it began, but Postgres doesn't forget. It keeps the state needed to preserve correctness.
The lesson: a quiet connector doesn't mean a healthy replication slot.
The community also has plenty of scar tissue here. In a Debezium user thread, a team running PostgreSQL with Debezium reported that the replication slot lag grew to more than 2 TB (!) on a database handling roughly 8 million new records per day across replicated tables.
Estuary recommends treating replication slots as production infrastructure, not simple connector bookkeeping: monitor retained WAL and slot lag, alert on restart_lsn or confirmed_flush_lsn that stops advancing, set an explicit max_slot_wal_keep_size, and have a documented recovery path for when a slot falls too far behind.
Learning from these best practices, you can clearly establish that CDC is production infrastructure, so make sure to treat it that way.
The monitoring query every Postgres CDC pipeline deserves
Let's take a look at a straightforward way of being able to check in on Postgres replication slot health. If you run Postgres CDC, keep a query like this close:
sqlSELECT
slot_name,
database,
slot_type,
active,
active_pid,
wal_status,
pg_size_pretty(
pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)
) AS retained_wal,
restart_lsn,
confirmed_flush_lsn,
inactive_since,
invalidation_reason
FROM pg_replication_slots
ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC NULLS LAST;The exact columns available depend on your Postgres version, but the shape of the check is stable and helps you answer a plethora of important questions. You'll need to know if the slot is active, if restart_lsn is advancing properly, how much WAL is being retrained, if wal_status is healthy, and so on.
On top of all that, this data allows you to set up alerting on the things that matter:
- Retained WAL above threshold
- Slot inactive for too long
- Slot lag increasing monotonically
- Disk utilization rising
wal_status= lost or unreserved- Destination lag above threshold
- Capture task failing repeatedly
- Backfill running longer than expected
The threshold is, of course, workload-dependent. A 20 GB retained-WAL alert might be too noisy for one system and far too late for another. Just keep in mind to alert on byte lag and trend, not just if the connector process is up or not. A connector can be up and still not keep up, and a replication slot can be active and still falling behind.
These are different failure modes, so treat them differently if you can.
What production-grade Postgres CDC needs to get right for AI Agents
A Postgres-to-agent pipeline is not production-grade because it can move rows from A to B during a demo. Rather, it is production-grade when it behaves well during all the boring, ugly situations that actually happen. If you’ve dealt with random connector restarts, slow warehouses, destination rate-limits, abrupt schema changes, you’ve felt this pain before.
This is the checklist I would use before trusting a Postgres CDC pipeline underneath agentic workflows. If you are comparing vendors, this broader guide on how to evaluate CDC solutions covers related criteria like capture method, deletes, retries, backfills, schema changes, and operational burden.
Database upgrades are another place where the CDC state can become mismatched. In a customer’s Aurora PostgreSQL, they dropped and recreated the replication slot during an upgrade. The server’s WAL position reset, but the capture still had a stored resume cursor from the old slot, so the capture failed with a “resume LSN is greater than server flush LSN” error. The fix was to trigger an incremental xmin backfill to reset the replication cursor to the current WAL position without doing a full table re-snapshot.
With that established, let's take a look at how your CDC pipeline needs to behave to avoid these issues.
1. It must keep latency visible, not just “low”
Low latency can't just be a product claim; it has to be a number you can inspect.
For agent-facing data, you want to know multiple numbers; including source commit time, capture time, materialization time, and, of course, agent retrieval time.
As a rule of thumb: a pipeline that says “synced” but can't tell you whether the agent is reading data from 5 seconds ago or 50 minutes ago is not observable enough.
2. It must protect the primary
Your CDC pipeline shouldn't be allowed to take down the database it exists to protect.
At a minimum, that means monitoring replication slots and retained WAL with a simple query like this one:
sqlSELECT
slot_name,
slot_type,
database,
active,
wal_status,
pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal,
restart_lsn,
confirmed_flush_lsn
FROM pg_replication_slots
ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC;It also means setting a bound:
sqlALTER SYSTEM SET max_slot_wal_keep_size = '50GB';
SELECT pg_reload_conf();Always keep in mind: initial backfills can burn through WAL faster than expected.
The lesson here is that initial backfills need WAL budgeting, especially on high-write systems.
So how do you come up with this magic number? Well, the right value depends on write volume, disk size, recovery time, and how painful a re-snapshot would be. The wrong value is usually “unlimited by accident.”
The exact threshold depends on write volume, disk size, recovery expectations, and how expensive a re-snapshot would be. Test this in the context of your workload before applying it broadly.
Also, keep in mind that some columns in pg_replication_slots vary by Postgres version.
On newer Postgres versions, idle_replication_slot_timeout can invalidate slots that remain inactive beyond a configured duration; Postgres’ pg_replication_slots view also exposes inactive_since and invalidation reasons, including idle_timeout.
3. It must handle backfills without lying
Backfills are where many pipelines quietly lose correctness.
The takeaway from this story is that backfill chunk size can directly affect WAL retention and replication-slot safety.
This happens more often than you’d think, meaning a good CDC system needs a consistent way to say a few things:
- I snapshotted this chunk of the table as of this source position.
- I also captured all changes that happened during the snapshot.
- I can merge those without gaps or duplicates.
That sounds obvious until you try to backfill a multi-terabyte table while updates continue and downstream systems expect current data.
For agent systems, the subtle failure is worse than the obvious one. If the backfill simply fails, someone notices. If it succeeds but misses deletes or double-applies updates, the agent may retrieve facts that look plausible but are wrong.
4. It must understand deletes
A surprising number of “AI context” systems are append-biased, meaning they're good at adding documents and bad at removing deleted records downstream.
Databases are not append-only. Just to list out a few examples: customers churn, products are deprecated, tickets close, permissions are revoked, or you can have contracts amended. A CDC pipeline feeding an agent must propagate deletes and revocations with the same seriousness as inserts.
For Postgres logical replication, this also means caring about replica identity. If a table doesn't have a primary key, downstream consumers may not have enough identity to apply updates and deletes cleanly unless the table uses an appropriate replica identity setting, such as REPLICA IDENTITY FULL, in cases where that tradeoff is acceptable.
5. It must handle schema changes as a normal event
Agents often sit downstream of fast-moving product databases. A thing to keep in mind is that the source schema will change. A lot.
Columns will be added. Types will change. Tables will be split. JSON blobs will become real columns. Enums will grow. Someone will rename a field that the warehouse model depends on.
The pipeline cannot treat every schema change as a bespoke incident. It needs a schema evolution behavior that is explicit, observable, and recoverable.
6. It must fan out without turning into spaghetti
Agentic systems rarely have one destination. The same Postgres change will need to be updated:
- Snowflake or BigQuery for analytics
- Databricks or Iceberg for lakehouse workloads
- A vector index for retrieval
- A cache for low-latency serving
- A search index for hybrid search
- A reverse ETL destination for operational workflows
If every destination builds its own one-off extractor from Postgres, the primary becomes a public utility for every downstream system. That doesn't age well.
You want one well-operated capture path and many materialization paths.
A better architecture: real-time data plane between Postgres and agents
The architecture I trust looks more like this:
The important design choice here is isolation. You shouldn't make the primary Postgres database serve every analytical query, every vector refresh, every agent lookup, and every workflow automation directly because that will easily lead to incidents.
Instead, use CDC to move operational truth out of Postgres continuously, then let the right systems serve the right access patterns.
- Warehouses and lakehouses can handle large analytical joins and historical data.
- The vector system can serve semantic retrieval.
- Kafka-compatible consumers can feed event-driven services.
- The agent can retrieve the current context without becoming an unbounded workload on the primary.
This is where Estuary is a natural fit. In Estuary, the Postgres capture is the governed read path from the source database. Captured changes land in collections, which can then be materialized into warehouses, lakehouses, search systems, operational stores, or Kafka-compatible consumers through Dekaf. That lets teams operate one capture path from Postgres while serving multiple agent-facing access patterns downstream.
This fits agentic workloads, as your agent needs a trustworthy path from operational Postgres to the systems where it reasons and acts.
That path needs to be real-time enough for the workflow, safe enough for the primary, and observable enough that you know when reality and the agent’s context have drifted apart.
A concrete example: the support agent that should not query production
Let’s build a realistic support agent that can answer billing questions, summarize recent account activity, and recommend a next action to a human support rep. It needs data from a few sources:
- customers
- subscriptions
- invoices
- payments
- refunds
The naive architecture is very tempting:
That will actually work fine in a prototype, but it's also how you end up with unpredictable query patterns on your primary database.
The slightly better architecture is:
Now the primary database is safer, but the agent is stale.
The production architecture should look more like:
Now the agent can “intuitively” ask:
sqlSELECT
c.id,
c.name,
s.plan,
s.status,
i.latest_invoice_status,
u.events_last_24h,
t.open_sev1_tickets
FROM account_context_current c
JOIN subscription_current s ON s.customer_id = c.id
LEFT JOIN invoice_status_current i ON i.customer_id = c.id
LEFT JOIN usage_rollup_24h u ON u.customer_id = c.id
LEFT JOIN ticket_rollup_current t ON t.customer_id = c.id
WHERE c.id = $1;That table can be served from a warehouse, operational replica, low-latency analytical store, or materialized serving layer. The exact destination depends on the latency and concurrency requirements. The important point is that it's fed by CDC rather than a stale snapshot.
Now, imagine the customer upgrades. The subscriptions row changes in Postgres. A CDC event lands downstream, the current account view updates, and finally, the agent sees the new plan.
The LLM didn't get smarter at all, but the system did!
Keeping agent memory honest: why CDC governs more than pipelines
“Memory” sounds super abstract until you trace it back to rows. It’s a worthwhile exercise as it's a critical concept for agentic workflows.
A customer support agent might store summaries of previous tickets. A sales agent might store account research. A product agent might store recent usage anomalies. A compliance agent might store policy interpretations or audit events.
Models will generate some of that memory, and some of it will be retrieved from documents. But a lot of it is anchored to operational entities in Postgres.
If the source entity changes, the memory may need to change too. And with that, this becomes a governance problem as well as a retrieval problem. An agent that retrieves a deleted document, an old entitlement, or a revoked permission is not just “less accurate,” but it may actually be violating the rules of the system around it.
That is why agent observability simply cannot stop at model traces. You need data observability, too.
For any meaningful agent action, you should be able to answer questions like “Which source records contributed to this context?” and “When did they last change in Postgres?”, among other things.
If those questions feel too heavy for your current agent project, that's still a useful signal. It means the agent is still a demo, or the workflow is not important enough to automate yet.
Why agents raise the cost of CDC failures
CDC lag, schema drift, backfills, replica identity, etc. None of these data engineering problems are new.
What agents change is the cost of being wrong. Before agents, a stale table might produce a bad dashboard, which someone would notice in a meeting. Maybe a metric was off or maybe an analyst reran a job. Annoying, but usually recoverable.
With agents, stale context can become an action. You can send an email, issue a refund, or open a ticket.
That's an entirely different risk profile.
This is also why “the model has evals” is not enough. To emphasize this point, Databricks found that organizations using AI governance tools get over 12 times more AI projects into production, and organizations using evaluation tools move nearly 6 times more AI systems to production. That makes sense. But evals and governance need to include the data plane.
A model eval won't tell you that the subscriptions table in the agent context layer is 47 minutes behind. A prompt test won't tell you that a replication slot is retaining 200 GB of WAL. A tool-call trace won't tell you that your vector index missed deletes.
The agent stack needs data-plane SLOs. These can take on many forms, for example:
- 99% of subscription changes are visible to the agent context within 30 seconds.
- No replication slot retains more than 20 GB of WAL without an alert.
- All deletes from source tables are applied downstream within 60 seconds.
- Backfills must not block CDC processing.
- Agent actions must log source freshness metadata.
If that sounds like data engineering, that is because it's actually just data engineering. To make it clearer, here’s what a production-grade Postgres-to-agent pipeline should be able to do:
- Make freshness visible from source commit to agent retrieval.
- Protect the primary database from unbounded replication-slot growth.
- Handle backfills without gaps, duplicates, or long pinned transactions.
- Propagate deletes and permission changes.
- Treat schema changes as normal.
- Fan out to warehouses, vector systems, caches, and event consumers through one governed capture path.
Building a production-grade Postgres CDC pipeline for AI agent
The Databricks statistic is a great hook because it makes the agentic shift visible. In addition to “chatting with your data”, agents are now creating databases and branches, and they're participating in database workflows that used to belong to humans.
The important question to ask becomes: “Can the agent trust the data it is acting on?”
For many companies, that question runs straight through Postgres. In those cases, Postgres holds the operational truth, and the warehouse holds analytical context. A vector index might hold semantic context. Agents sit on top, pulling from all of it.
If the pipeline connecting those systems is hourly, fragile, unobservable, or unsafe for the primary, the agent will eventually expose it, mainly because agents are just simply impatient. They collapse the distance between stale data and business action.
So by all means, tune the prompts. Pick the right model. Add evals. Build the MCP server. Give the agent better tools. Add a human approval into the loop where the risk demands it.
But do not ignore the pipe!
In the agentic stack, Postgres CDC is more than just replication plumbing. It's the layer that keeps your agent connected to reality.
Building agents on Postgres data? Estuary helps you capture Postgres changes once and stream them continuously into the systems your agents rely on, including warehouses, vector indexes, caches, and event consumers.
FAQs
What should teams monitor before trusting Postgres CDC in an agent workflow?
How can replication slots create risk for AI agent pipelines?
Why do deletes and permission changes matter for AI agent context?

About the author
Dani is a data professional with a rich background in data engineering and real-time data platforms. At Estuary, Daniel focuses on promoting cutting-edge streaming solutions, helping to bridge the gap between technical innovation and developer adoption. With deep expertise in cloud-native and streaming technologies, Dani has successfully supported startups and enterprises in building robust data solutions.
















