vs
Fivetran
Self-serve streaming data platform for building real-time ETL from databases, SaaS, and filestores. Company behind Gazette and Estuary Flow OSS.
Self-service batch data tool for building ELT from databases, SaaS, and filestores.
n/a
Open source, or predictably priced pipelines at $0.50/GB plus $0.14/hr (~$100/mo) for any capture or materialization.
Pricing based on monthly active rows (MARs) with a utilization curve detailed here. Depending on the data source, MARs convert to GB at roughly 500k to 1M MARs per GB.
On average, Estuary is 70% or more cheaper for the same pipeline. Fivetran bills can increase unpredictably because of how Fivetran normalizes your data. (Pricing breakdown)
Estuary adds less than 100 ms of latency to a data flow. Sources and destinations impose their own latencies, e.g. SaaS sources (1+ hrs), databases (~100 ms), warehouses (10s).
Latency depends on plan and price: Starter 1 hr, Standard 15 min, Enterprise 5 min. The 5-minute latency costs 2x the price of the 1-hour latency.
Faster data leads to 2x better marketing outcomes (Gartner) and enables real-time ML and insights.
100+ connectors. Also HTTP file, webhook, and the ability to spin up some new connectors within a week. Estuary improves and builds on OSS SaaS connectors to make them reliable.
Custom connectors can be built, though the connector builder is less polished than Fivetran's.
340+ connectors. Also file-based ingestion and webhooks.
Custom connector builder available.
Fivetran has more connectors ready today. Estuary is generally faster at adding new ones.
Self-hosted deployment is available now via OSS. In-VPC SaaS is coming in winter 2023.
In-VPC is offered on a different pricing tier.
Fivetran can be a good solution for on-prem data flows.
Exactly-Once
Exactly-Once
Exactly-once semantics ensure that your destination precisely matches your source.
Automated schema evolution is available. Most updates automatically make it downstream; breaking changes re-version your destination.
Automated Schema Evolution to ensure that your destination exactly matches your source.
Automation will ensure that your destination always matches your source.
Yes
ELT Only.
Transforming data before it goes into your destination enables new features and can save money.
Ingested data is stored in a real-time data lake in the customer's cloud storage.
Every new pipeline is made from scratch and requires creating a new ingest.
By storing data in a real-time data lake, you can distribute it to any number of destinations in real time from a single ingest, saving egress fees, money, and stress on source systems.
SQL transforms in your warehouse via a dbt integration.
Plus streaming SQL and JavaScript transforms prior to your destination.
SQL transforms in your warehouse via a dbt integration.
Transformations in your pipeline can unlock new use cases and save money.
Pinecone
None
Teams are quickly demanding support for vector DBs.
Estuary automatically de-duplicates data on your primary keys. Users can configure those keys to accomplish workflows like history mode, though this is currently not as simple as in Fivetran.
Supports history mode for simple self-service use cases.
Power vs. simplicity is a trade-off that teams need to make.
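To make the pricing figures in the comparison concrete, here is a rough sketch of the arithmetic. It assumes a 30-day month and assumes the $0.14/hr charge applies once per capture or materialization; the function names are ours for illustration and are not part of any Estuary or Fivetran API. Fivetran's own price is not computed, because its utilization curve is not given here; only the MAR-to-GB conversion range is shown.

```python
# Rough pricing arithmetic based on the figures quoted in the comparison.
# Assumptions (not from the source): a 30-day month, and one hourly charge
# per capture or materialization. Function names are illustrative only.

HOURS_PER_MONTH = 24 * 30  # assumed 30-day month


def estuary_monthly_cost(gb_moved, connectors, rate_per_gb=0.50, rate_per_hour=0.14):
    """Estimated monthly cost: data volume charge plus hourly connector charge."""
    volume_charge = gb_moved * rate_per_gb
    connector_charge = connectors * rate_per_hour * HOURS_PER_MONTH
    return volume_charge + connector_charge


def mar_to_gb_range(monthly_active_rows, rows_per_gb_low=500_000, rows_per_gb_high=1_000_000):
    """Convert monthly active rows to a (min_gb, max_gb) range.

    Uses the quoted 500k to 1M MARs per GB; denser rows per GB means fewer GB.
    """
    min_gb = monthly_active_rows / rows_per_gb_high
    max_gb = monthly_active_rows / rows_per_gb_low
    return min_gb, max_gb


if __name__ == "__main__":
    # Example: 10 million MARs moved through one capture and one materialization.
    low_gb, high_gb = mar_to_gb_range(10_000_000)
    for gb in (low_gb, high_gb):
        cost = estuary_monthly_cost(gb, connectors=2)
        print(f"{gb:.0f} GB -> ${cost:.2f}/mo")
```

With these assumptions, the hourly connector charge works out to about $100 per connector per month, which matches the "~$100/mo" figure quoted above, and it dominates the bill at small data volumes.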
We're creating a new kind of DataOps platform that empowers data teams to build real-time, data-intensive pipelines and applications, at scale, with minimal friction, in a UI or CLI. We aim to make real-time data accessible to the analyst, while bringing power tooling to the streaming enthusiast. Flow unifies a team's databases, pub/sub systems, and SaaS around their data, without requiring new investments in infrastructure or development.
Estuary develops in the open to produce both the runtime for our managed service and an ecosystem of open-source connectors. You can read more about our story here.