YES. You should look for viable alternatives to Fivetran to save on ETL/ELT in 2024.
Short on time? Here’s the 90-second rundown on why.
- Chiefly, the buyer has the negotiating power. There are many other platform options today, some of which are cheaper and more robust than Fivetran.
- Fivetran is simply too expensive (likely why you are here) for anything but small teams working with small data volumes. There is a litany of reasons (and online commentary) on why this is the case, namely:
- The shift to ‘Monthly Active Rows’ pricing. The pricing formula is complex and difficult to predict, and excessive normalization of source API schemas into extra tables creates unnecessary spend
- Occasionally you are forced to load all data from a source, further driving cost
- Suboptimally configured parameters drive excessive cost when reading from certain sources
- 5-minute refreshes are available only on premium plans, and users often complain of long backfill times and delays
- You lose control of your pipeline and data as it moves through Fivetran architecture
- No ability to transform data before it lands in the warehouse
- Each pipeline must be built from scratch – no ability to re-use a connector to a table. This further drives monthly active rows and egress fees
- There are many suitable alternatives that exist, both paid and open-source. We explore these in more depth below
- Estuary Flow is one such option, available as open-source or fully managed. We describe how we can help you save 70% off Fivetran while getting millisecond refreshes and more control over your data. Pretty… pretty… pretty good.
For those not already Fivetran users, first, a quick review…
What is Fivetran?
Firstly, let’s give credit where credit is due. Fivetran was early to bring to market a solution that helped companies replicate data from their apps and databases to their cloud warehouses. At the time, most traditional ETL vendors were of an older software generation and had less support for cloud destinations.
The initial impetus for the company came in 2015, when founder George Fraser was pitching a very different idea and a prospective customer asked, “Can you help us move our data into Redshift?”
A lightbulb flashed, and since then the company has quickly become a leader in the data integration space and has become synonymous with the ‘modern data stack’ and the ELT movement.
With hundreds of connectors in a UI, no need for scheduling, incremental loading (CDC), and exactly-once semantics, everyone from analysts to data engineers can spin up functional data pipelines to help continuously sync data between applications.
So why did Fivetran get so popular, and why do some users want to leave?
Understanding Fivetran’s Fast Rise
Well, Fivetran’s popularity has historically been driven by:
1. A self-serve distribution model with connector-based pricing, paid as you go. At the time, managed ETL tools were available almost exclusively by enterprise contract.
2. They made it easy for analysts to spin up managed and automated pipelines, saving the data engineering team time in certain instances.
3. Fivetran offered a wide variety of new SaaS-based connectors at a time when the number of SaaS tools, and the data they held inside companies, was exploding. Most traditional ETL tools did not have these newer connectors in abundance. Fivetran is effective for SaaS-based connectors, though users report more trouble when working with transactional DBs. Using the HVR-Fivetran connectors (post-acquisition) may improve performance there.
Why has Fivetran faced online backlash?
While many companies and users continue to be happy with and endorse Fivetran, the mood among some has shifted. This is for both external and self-created reasons.
First, externally, the competitive landscape has shifted meaningfully. At one time Fivetran was one of very few managed ways to automate data pipelines to the cloud. But since 2015, many upstarts in data witnessed Fivetran’s fast growth and jumped into the market. These second-generation competitors (e.g. Estuary, Airbyte, Rivery) were able to learn from Fivetran’s successes and failures to help teams build faster, more reliable data pipelines at lower cost.
Internally, the company has agitated some users with a) technical performance issues, b) questionable choices in prioritizing connector builds (e.g. 2+ years for an S3 connector!), and c) switching the pricing model from connector-based to monthly active rows (see below).
Let’s break each down in more depth.
Top 3 reasons to use a Fivetran alternative
Here are the most common reasons we see users seeking an alternative ELT/ETL data integration platform to Fivetran.
1. Fivetran is too expensive
- The high price is primarily driven by the monthly active rows (MAR) pricing model, which serves as a proxy for data volume. Beyond MAR being difficult to predict for an upcoming month, Fivetran’s approach of over-normalizing tables can artificially inflate the number of rows created for many integrations (e.g. 10 rows created from a single HubSpot row extracted).
- Fivetran is a point-to-point ELT system. By this we mean that every new end-to-end pipeline must be created from scratch. Already captured the ‘Customers’ table from MySQL? Too bad. Be prepared to pay again when you want to add another destination.
- You may even be charged thousands for rows you tell Fivetran to skip. Take this particular user.
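To make the normalization effect concrete, here is a rough sketch of how the row count can balloon. It assumes an "active row" is a distinct primary key touched at least once in the month, and it uses made-up table names (`contacts`, `contacts_part_0` … `contacts_part_9`) to mimic the 10-rows-per-source-row example above. It is an illustration of the arithmetic only, not Fivetran's actual billing logic.

```python
from collections import defaultdict


def monthly_active_rows(change_events):
    """Count distinct (table, primary_key) pairs touched during the month.

    Assumption: a row counts once per month no matter how often it changes.
    """
    active = defaultdict(set)
    for table, primary_key in change_events:
        active[table].add(primary_key)
    return sum(len(keys) for keys in active.values())


def normalize(source_updates, child_tables):
    """Fan each source-record update out to one row per normalized table.

    The child tables are hypothetical stand-ins for an over-normalized schema.
    """
    return [(table, record_id)
            for record_id in source_updates
            for table in child_tables]


# 1,000 source contacts, each updated 3 times during the month.
source_updates = [contact_id for contact_id in range(1000) for _ in range(3)]

# Loaded as one flat table: repeated updates to a contact count once.
flat = monthly_active_rows(("contacts", cid) for cid in source_updates)

# Loaded as 10 normalized tables: every contact is active in each table.
child_tables = [f"contacts_part_{i}" for i in range(10)]
normalized = monthly_active_rows(normalize(source_updates, child_tables))

print(flat, normalized)  # 1000 10000
```

Same source data, same number of changes, ten times the billable rows. The exact multiplier depends on the connector's schema, so treat the factor of 10 as an example rather than a rule.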
Beyond just the direct cost paid to Fivetran, overreliance on the Modern Data Stack ELT setup *can* be wholly inefficient to begin with. Pushing all your source data into DBT, which itself has usage-based pricing now, will drive downstream data costs. Using a platform with more EtLT functionality can help cut costs.
On the whole, while using Fivetran can win in the build vs. buy evaluation, there are simply too many high performance Fivetran alternatives now to justify paying much of a premium. The buyer has power in this era.
Our pricing calculator shows how Fivetran is estimated to be 10x more expensive than Estuary for the same amount of data.
2. Using Fivetran means sacrificing control over your data
Fivetran does have many great data delivery features, like managing schema drift (done by fully copy-pasting tables) and exactly-once semantics. However, the data team sacrifices substantial control. Specifically:
- There is no ability to transform data before loading to DBT/warehouse
- You have no access to the connector code or read/write schemas
- Your data is held by Fivetran, and you have no direct access to the data lake where it is held while in transit
Further, Fivetran does not have auto-discovery of data semantics, data governance, or data quality capabilities. This lack of data management capabilities requires companies to run a separate data management system (DMS).
3. Fivetran is too slow on refreshes, backfills, and support
Lastly, on the lower-tier plans, sync intervals start at 1 hour or more. For many teams this will be suitable. But for teams that can drive business value with instant decisioning, real-time AI, and real-time re-marketing, this is less than ideal. Teams with high-velocity data and a need for streaming pipelines will have to supplement Fivetran with a streaming broker or another tool, effectively being forced into a lambda architecture that maintains two different ETL pipeline infrastructures.
Ok, we’ve covered a lot. Now let’s get to what you really want to know…
6 Of The Best Fivetran Alternatives
We’ve compiled a comparison of 6 of the most common Fivetran alternatives, both open-source and paid.
To be clear, there are *many* (read: far too many) choices to sort through when thinking about building a data pipeline from X to Y. A Google search for any combination of sources and sinks will yield hundreds of results.
It is impossible to review every platform with technical nuance, so here are the most robust options that offer full ELT/ETL automation (i.e. we are not covering Airflow, Kafka Connect, etc.).
This includes Estuary, Matillion, Stitch Data, Oracle GoldenGate, Airbyte, and Qlik.
1. Estuary – Real-time and batch ETL/ELT at >50% less
Ok, so we are giving ourselves (Estuary) a pat on the back first. But let us justify why…
- Estuary is generally <50% the price of Fivetran. Check out our pricing calculator if you missed it before. We price simply and predictably on GB moved per month and number of connectors. Customers often come to us to save on their NetSuite connectors, which represent the majority of their Fivetran spend.
- Estuary is a real-time ELT/ETL platform that can help engineers and business users alike build batch AND streaming pipelines. Our unified batch and streaming platform is just as easy to set up as Fivetran, all while offering the low-latency and cost-savings of Kafka, when applicable for the source.
- We provide access to 300+ connectors, including:
  - Custom-built millisecond connectors for Change Data Capture from DBs
  - The option to ‘bring your own connector’ through our open protocol
  - Access to hundreds more connectors by importing a connector from open-source projects like Airbyte, Meltano, and Stitch
- Take more control of your data to control costs. Or enable more automation with features like:
  - Streaming SQL & TypeScript transforms
  - Automated Schema Drift
  - Schema Validation
  - Ability to time travel your data and go to history mode
- Connect data once, endlessly materialize. Data captured into Estuary is stored for you as a real-time data lake in your cloud storage. This means:
  - Once you connect to a source, you never have to re-capture it or incur more egress/ingress fees for it.
  - Delivering to an additional destination can begin in seconds, with less time waiting on backfills
  - You can transform, join, and audit across both real-time and historical data
  - Backfilling data and ‘replaying the log’ from any point in time is easy
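The 'connect once, endlessly materialize' idea can be sketched with a toy in-memory model. The `Source` and `Collection` classes below are hypothetical stand-ins invented for this illustration; they are not Estuary's API. The point is simply that once captured data sits in a stored log, extra destinations and replays read from the log rather than from the source.

```python
class Source:
    """Toy source system that counts how many times it is read."""

    def __init__(self, rows):
        self._rows = rows
        self.reads = 0

    def read_all(self):
        self.reads += 1
        return list(self._rows)


class Collection:
    """Hypothetical append-only log that keeps whatever was captured."""

    def __init__(self):
        self.log = []

    def capture(self, source):
        # The only place the source system is touched.
        self.log.extend(source.read_all())

    def materialize(self, from_offset=0):
        # Destinations (and replays) are served from the stored log.
        return self.log[from_offset:]


source = Source([{"id": 1}, {"id": 2}, {"id": 3}])
collection = Collection()
collection.capture(source)

warehouse = collection.materialize()              # first destination
search_index = collection.materialize()           # second destination, no re-capture
replay = collection.materialize(from_offset=1)    # replay from a chosen point in the log

print(source.reads)  # 1
```

In a point-to-point setup, each of those three reads would be a separate pull from the source, which is where the repeated egress and row charges come from.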
The platform has turnkey support for your databases, SaaS, cloud providers, and warehouses because of the hundreds of connectors already pre-built into it. You can browse the complete list of sources and destinations here.
Pros And Cons Of Estuary
| Pros | Cons |
|---|---|
| More than 50% less than Fivetran. Free up to 10 GB/mo. | Estuary is a newer solution with fewer searchable reviews |
| Unified batch and streaming pipelines built in a no-code UI | Not as many SaaS integrations as some alternatives |
| Custom connectors for low-latency CDC. Ability to bring your own or import from open-source. | Can’t be deployed on-prem as of Jan 2024 |
| Transform in streaming SQL or TypeScript | |
| Real-time data capture and processing. | |
2. Matillion
Matillion is a data integration platform with cloud-ready use cases that helps in the orchestration and transformation of data. The software allows quick ingestion of data from any source and has connectors and pipelines that are easy to deploy.
Matillion’s Change Data Capture also helps simplify data pipeline management and batch loading from a single control panel. The platform is also low-code for building pipelines.
What’s more, Matillion’s ELT system has data lakes, preparation, and analytics all rolled into one platform, making it the most versatile option on our list.
Pros And Cons Of Matillion
| Pros | Cons |
|---|---|
| No-code setup makes it easy to deploy on the cloud platform of your choice | No free version for Change Data Capture. |
| Can ingest data from multiple and varied sources | No real-time pipelines |
| Sync data from various data sources including production databases, APIs, and Amazon Redshift | Change Data Capture requires deploying an agent |
| Push-down query performance with Snowflake | |
3. Stitch
Stitch is another no-code platform for building batch data pipelines. It is owned by data integration giant Talend, and has a focus on supporting business intelligence sources and destinations. Stitch is the paid and managed version of the open-source ETL project, Singer. All Stitch connectors can be accessed freely at Singer.
Platforms such as IndieGoGo, Postman, and InstaPage all utilize the platform for their data integration and management requirements.
As of late 2023, users appear to be reporting increasing issues in reliability with Stitch. Connectors appear to be maintained less than in the past and pipelines are failing with zero alerting.
Stitch has compatibility with other data platforms too and can present and offload data after acquiring it from some of the most popular data sources such as:
- SQL Server
- Oracle Eloqua
- Microsoft Azure
- Amazon PostgreSQL
Read our detailed Fivetran vs. Stitch comparison
Pros And Cons Of Stitch
| Pros | Cons |
|---|---|
| | Very little to no support |
| Integration with open-source project Singer to bring your own connector/customization. Integrates well with Talend. | Backfilling data is reported to be difficult |
| Cheaper than Fivetran (though also based on rows, so pricing is difficult to forecast) | As of late 2023, users report that core integrations don’t seem to be maintained well and connectors may fail with no reporting |
4. Oracle GoldenGate – Best For Managing Replication Data
Oracle GoldenGate is a real-time mesh platform that allows users to analyze and interpret data within the Oracle Cloud Infrastructure. The software uses replication to keep data highly available and enables real-time analysis.
GoldenGate also lets users integrate data with existing technologies such as:
It provides real-time capturing of CSV, JSON, and Avro files in Kafka and JMS queues too. Users have the option to stream transactional data into Kafka, REST, Autonomous Database, or other data lakes.
The program has extensive API compatibility and even has additional features such as automated scaling which helps companies and analysts achieve optimal workloads without downtimes and delays.
And with automated monitoring on its graphical user interface, the program is the best tool to remain updated with your data processing progress.
Pros And Cons Of Oracle GoldenGate
| Pros | Cons |
|---|---|
| Integration with Oracle Cloud Infrastructure | Documentation is lacking compared to other options |
| Offers data replication without missing items | Pricing is expensive compared to other products |
| Straightforward configuration makes it easy to deploy | |
| Detailed error-reporting, making troubleshooting easier | |
5. Airbyte
Airbyte is a popular open-source ELT project that made a big splash in 2022 by raising $150 million at a $1.5 billion valuation to take on Fivetran. They have a large community of users contributing to their large pool of connectors (350+), and have launched a fully managed offering.
In addition to a large pool of connectors, they also offer a no-code ‘connector builder’. Sufficiently technically savvy users can build net-new connectors using the UI. It may not be simple enough for the average analyst, but it may speed up a data engineer. There isn’t much on the internet in the way of detailed reviews of it.
Pros And Cons Of Airbyte
| Pros | Cons |
|---|---|
| Large pool of open-source connectors (350+) and large community of users | Many users complain it just isn’t ‘production grade software’ |
| Fully managed solution is much cheaper than Fivetran | No ETL functionality… just ELT |
| Despite its flaws, many users continue to use the open-source version as ‘the price is right’ | Users report it as slow with databases and large data |
| Lots of funding to stick around for a while | At-least-once delivery |
6. Qlik Integrate – Best For Comprehensive Data Integration
Qlik Data Integration is a modern platform that delivers real-time analytics-ready data to any environment. The integration platform has support for many environments from Qlik to Tableau, Power BI, and more.
The platform features real-time data streaming using CDC and can extend enterprise data into live streams. This helps businesses power modern analytics and microservices with a simple universal solution.
Qlik Integrate can also quickly design, build, deploy and manage purpose-built cloud data warehouses without manual coding. Automation of data is a breeze too using its managed data lake creation techniques.
Pros And Cons Of Qlik Integrate
| Pros | Cons |
|---|---|
| Can create a sophisticated reporting system | Difficult to implement and get going |
| Integration speed is quick | Qlik is primarily a data analytics platform, so its ETL capabilities aren’t as robust as others on this list |
| Can create visual dashboards easily | |
What To Look For In A Fivetran Alternative?
Every data team has different needs and requirements. It is impossible to find a Fivetran alternative that “suits all”. Here are some rough criteria on which to evaluate ETL/ELT platforms.
A. Pricing
How does the vendor price? On GB moved? On active rows? Annual contract or pay as you go? Any free model?
B. Connector Availability
How many connectors exist? How do you add or request new connectors? Can you bring your own?
C. ELT vs. ETL
Does the company support pre-load transforms?
D. Refresh Frequency
How often is data refreshed? Milliseconds? 5 minutes? Hours? Days?
E. Change Data Capture Method
Log-based? Trigger-based? Need to install an agent? Are hard deletes passed through?
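Why ask whether hard deletes are passed through? A minimal sketch helps. The event shape below (`op`, `key`, `row`) is a simplified format made up for this example, not any vendor's CDC format. It applies a change log to a replica twice: once with delete events, once with them dropped.

```python
def apply_changes(replica, change_log):
    """Apply insert/update/delete events to a dict keyed by primary key."""
    for event in change_log:
        op, key = event["op"], event["key"]
        if op in ("insert", "update"):
            replica[key] = event["row"]
        elif op == "delete":
            # A hard delete removes the row from the replica entirely.
            replica.pop(key, None)
    return replica


change_log = [
    {"op": "insert", "key": 1, "row": {"name": "Ada"}},
    {"op": "insert", "key": 2, "row": {"name": "Bob"}},
    {"op": "update", "key": 1, "row": {"name": "Ada L."}},
    {"op": "delete", "key": 2},
]

# CDC that passes deletes through: the replica matches the source.
with_deletes = apply_changes({}, change_log)

# CDC that drops deletes: the deleted row lingers in the destination.
without_deletes = apply_changes(
    {}, [e for e in change_log if e["op"] != "delete"]
)

print(sorted(with_deletes))     # [1]
print(sorted(without_deletes))  # [1, 2]
```

If a platform cannot see deletes, your warehouse slowly drifts from the source and someone has to reconcile it by hand or by full reload.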
F. Delivery Semantics
Exactly once or at-least once?
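The difference shows up when a batch is redelivered after a retry. The sketch below is a toy illustration with invented helper names (`append_load`, `upsert_load`): an append-only load double-counts the redelivered batch, while a load keyed on a primary key absorbs the duplicate. Keyed upserts are one common way to get effectively-once results on top of at-least-once delivery; they are not the only way, and this is not how any particular vendor implements it.

```python
def append_load(table, batch):
    """Blind append: every delivery adds rows, duplicates included."""
    table.extend(batch)


def upsert_load(table, batch):
    """Idempotent load: rows are keyed by id, so redelivery overwrites."""
    for row in batch:
        table[row["id"]] = row


batch = [{"id": 1, "amount": 50}, {"id": 2, "amount": 75}]

# At-least-once delivery: a retry sends the same batch a second time.
deliveries = [batch, batch]

appended = []
upserted = {}
for delivered in deliveries:
    append_load(appended, delivered)
    upsert_load(upserted, delivered)

appended_total = sum(row["amount"] for row in appended)
upserted_total = sum(row["amount"] for row in upserted.values())

print(appended_total)  # 250 (double-counted)
print(upserted_total)  # 125 (correct)
```

With an at-least-once tool, ask who is responsible for deduplication: the platform, the destination, or you.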
A data integration platform must support operating environments both on-premises and in the cloud. Since ETL tools are still present and in use, having the option to support cloud-based and server-based solutions is key for any data integration platform.
While the world is transitioning away from in-warehouse data storage and management, an entirely cloud-based option may not be the best yet, since the transition isn’t complete. Having a tool that can do double duty is essential.
We did our best to provide you with a summary of Fivetran’s pros and cons, why and how it is so expensive, and what the best Fivetran alternatives are.
With some due diligence and effort, you can certainly migrate to save a pretty penny of budget on your ETL costs, all while accessing faster pipelines and more in-flight or pre-load transformation features.
Naturally, we are biased towards Estuary Flow (our platform), and have helped many Fivetran converts upgrade to streaming pipelines at a price reduction of more than 50%.