B
Backfill
A backfill loads historical data from a source into a destination when a new pipeline is created, when a new table is added, or when it is triggered manually.
Bring Your Own Cloud (BYOC)
Bring Your Own Cloud (BYOC) is a deployment option where Estuary's data plane runs entirely in the customer's own infrastructure.
C
Capture
A capture is the Estuary component that ingests data from an external source system into a collection.
Change Data Capture (CDC)
Change Data Capture (CDC) is a technique to capture data changes, such as inserts, updates, and deletes, as they occur.
Collection
A collection is a real-time, append-only log of JSON documents stored in cloud object storage, produced by a capture or derivation and consumed by one or more materializations.
Connector Catalog
Estuary's catalog of connectors includes real-time CDC, streaming, batch, and materialization connectors for warehouses, lakes, and AI infrastructure.
D
Data Flow
A Data Flow is an end-to-end pipeline in Estuary connecting one or more sources to one or more destinations.
Data Freshness
Data freshness is the gap between when a change happens in a source system and when it is visible in a destination.
Data Replication
Data replication is the continuous process of copying changes from a source system to one or more destinations, keeping them in sync.
Database Connector
A database connector is a pre-built component that connects a pipeline to a specific source database.
Delivery Semantics
Delivery semantics describes the guarantee a pipeline makes about how many times each change event reaches the destination.
Derivation
In Estuary, a derivation is a special data collection that applies transformation logic using SQL, TypeScript, or Python.
E
ELT
ELT is a data movement pattern with Extract, Load, and Transform steps: raw data lands in the warehouse, and tools like dbt handle the transformation step downstream.
Exactly-once Delivery
Exactly-once delivery is the guarantee that each record from a source reaches the destination precisely one time, with no duplicates and no dropped events.
I
Idempotent
An idempotent operation produces the same result whether it runs once or multiple times, which makes pipelines safe to retry after failures.
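A minimal Python sketch of the difference, assuming a hypothetical in-memory destination and an event keyed by `id`. The function names and record shape are illustrative only and are not part of Estuary's API. A blind append duplicates data on retry, while an upsert keyed on `id` does not.

```python
def append_event(table: list, event: dict) -> None:
    """Non-idempotent: every retry adds another copy of the event."""
    table.append(event)


def upsert_event(table: dict, event: dict) -> None:
    """Idempotent: applying the same event again leaves the table unchanged."""
    table[event["id"]] = event


event = {"id": 42, "status": "shipped"}  # hypothetical change event

appended = []
upserted = {}
for _ in range(3):  # simulate one delivery plus two retries after failures
    append_event(appended, event)
    upsert_event(upserted, event)

print(len(appended))  # three copies of the same event
print(len(upserted))  # one row, regardless of the number of retries
```

Keying writes on a stable identifier is one common way to make retries safe. It is shown here as an illustration, not as a description of how any particular connector works.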
L
Latency
Latency is the technical measure behind data freshness: the gap, in milliseconds or seconds, between when data is created at a source and when it is available at a destination.
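As a rough illustration, latency can be computed as the difference between two timestamps. The sketch below assumes each record carries a creation time from the source and an availability time from the destination. Both values and the `latency_seconds` helper are hypothetical.

```python
from datetime import datetime, timezone


def latency_seconds(created_at: datetime, available_at: datetime) -> float:
    """Return the gap between creation at the source and availability at the destination."""
    return (available_at - created_at).total_seconds()


# Hypothetical timestamps: the record becomes visible 250 ms after it is created.
created = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
available = datetime(2024, 1, 1, 12, 0, 0, 250_000, tzinfo=timezone.utc)

gap = latency_seconds(created, available)
print(gap)  # 0.25
```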
LLM (Large Language Model)
Large Language Models (LLMs) are a type of AI that works with natural language to power chatbots, copilots, and agentic workflows.
Log-based Change Data Capture
Log-based CDC reads committed changes directly from a database's transaction log and delivers them as a real-time ordered stream of change events.
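The value of an ordered stream is that a consumer can rebuild the source table's current state by applying the events in sequence. The sketch below assumes a hypothetical event shape with `op`, `key`, and `row` fields. Real transaction log formats differ by database.

```python
def apply_changes(events: list) -> dict:
    """Replay ordered change events and return the resulting table state."""
    state = {}
    for ev in events:
        if ev["op"] in ("insert", "update"):
            state[ev["key"]] = ev["row"]
        elif ev["op"] == "delete":
            state.pop(ev["key"], None)  # ignore deletes for keys never seen
    return state


# Hypothetical change events, in commit order.
events = [
    {"op": "insert", "key": 1, "row": {"status": "new"}},
    {"op": "insert", "key": 2, "row": {"status": "new"}},
    {"op": "update", "key": 1, "row": {"status": "shipped"}},
    {"op": "delete", "key": 2, "row": None},
]

state = apply_changes(events)
print(state)  # {1: {'status': 'shipped'}}
```

Applying the same events in a different order could produce a different result, which is why ordering matters.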
M
Materialization
A materialization is the Estuary component that delivers data from a collection into an external destination, keeping it up to date as new change events arrive.
O
Operational Analytics
Operational analytics uses analytical data to drive real-time business actions rather than retrospective reporting.
R
RAG (Retrieval-Augmented Generation)
Retrieval-augmented generation is an AI pattern where a language model retrieves relevant context from an external data store before generating a response.
Real-time Data Pipeline
A real-time data pipeline moves data from source to destination continuously, with end-to-end latency measured in milliseconds to seconds rather than hours.
Reverse ETL
Reverse ETL moves data from a data warehouse back into the operational tools where business teams work, such as CRMs, ad platforms, and support systems.
Right-time Data
Right-time data means delivering data at the cadence each use case actually requires, whether sub-second, near-real-time, or scheduled batch.
S
Schema Evolution
Schema evolution is the handling of changes to data structure (added columns, renamed fields, changed types) as they occur.
Streaming
Streaming processes data continuously as a flow of events the moment it is produced, making it available to downstream consumers within milliseconds.
Streaming ETL
Streaming ETL applies the extract, transform, load pattern continuously rather than on a scheduled batch cadence.
T
Transaction Log
A transaction log is an ordered, append-only record of every committed change a database makes. It is used for crash recovery and serves as the foundation of change data capture.
Trigger-based Change Data Capture (CDC)
Trigger-based CDC fires a database trigger on every insert, update, or delete, writing each change to a shadow table that the pipeline then reads.
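The mechanism can be sketched with Python's built-in `sqlite3` module. The `orders` table, the `orders_changes` shadow table, and the trigger names are hypothetical. Production databases use their own trigger syntax, but the pattern is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT);

-- Shadow table: one row per change, ordered by seq.
CREATE TABLE orders_changes (
    seq    INTEGER PRIMARY KEY AUTOINCREMENT,
    op     TEXT NOT NULL,
    id     INTEGER NOT NULL,
    status TEXT
);

CREATE TRIGGER orders_ins AFTER INSERT ON orders BEGIN
    INSERT INTO orders_changes (op, id, status) VALUES ('insert', NEW.id, NEW.status);
END;
CREATE TRIGGER orders_upd AFTER UPDATE ON orders BEGIN
    INSERT INTO orders_changes (op, id, status) VALUES ('update', NEW.id, NEW.status);
END;
CREATE TRIGGER orders_del AFTER DELETE ON orders BEGIN
    INSERT INTO orders_changes (op, id, status) VALUES ('delete', OLD.id, NULL);
END;
""")

# Ordinary writes to the source table; the triggers record each one.
conn.execute("INSERT INTO orders (id, status) VALUES (1, 'new')")
conn.execute("UPDATE orders SET status = 'shipped' WHERE id = 1")
conn.execute("DELETE FROM orders WHERE id = 1")

# The pipeline side: read the shadow table in order.
changes = conn.execute(
    "SELECT op, id, status FROM orders_changes ORDER BY seq"
).fetchall()
print(changes)
```

Each write to `orders` also performs a write to `orders_changes` inside the same transaction, so the capture work happens in the source database itself.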
V
Vector Database
A vector database stores high-dimensional numerical embeddings of text, images, or other data.
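Stored embeddings are typically queried by similarity rather than by exact match. The sketch below is a brute-force illustration in plain Python, assuming tiny three-dimensional vectors and hypothetical document names. Real vector databases use much larger vectors and approximate indexes.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings keyed by document name.
index = {
    "doc_cats": [0.9, 0.1, 0.0],
    "doc_dogs": [0.8, 0.3, 0.1],
    "doc_invoices": [0.0, 0.2, 0.9],
}


def nearest(query: list, k: int = 1) -> list:
    """Return the names of the k stored vectors most similar to the query."""
    ranked = sorted(
        index,
        key=lambda name: cosine_similarity(query, index[name]),
        reverse=True,
    )
    return ranked[:k]


result = nearest([0.85, 0.15, 0.0], k=2)
print(result)  # ['doc_cats', 'doc_dogs']
```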



