
From OLTP to OLAP: Syncing MySQL to ClickHouse for Real-Time Analytics
MySQL is a workhorse of transactional systems. It's simple, stable, and widely used, powering everything from e-commerce checkouts to SaaS user activity. But while it's great for reads and writes at scale, it's not exactly built for deep analytics.
As soon as teams start asking complex questions—“What are our top-selling products by region over the past year?” or “How is user behavior trending in real time?”—MySQL starts showing its limits. Long-running queries clash with day-to-day operations. Read replicas help for a while, but they come with lag and cost. Eventually, you're left with one option: move your analytical workloads elsewhere.
Enter ClickHouse.
ClickHouse is a high-performance, columnar database designed for lightning-fast analytics. It’s optimized for big scans, aggregations, and time-series analysis—everything MySQL wasn’t designed to do.
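To make that concrete, here is the kind of query ClickHouse is built for: a minimal sketch of the "top-selling products by region" question above, assuming a hypothetical orders table with region, product, amount, and created_at columns.
-- Top-selling products by region over the past year (hypothetical schema)
SELECT
    region,
    product,
    sum(amount) AS revenue
FROM orders
WHERE created_at >= now() - INTERVAL 1 YEAR
GROUP BY region, product
ORDER BY region, revenue DESC
LIMIT 10 BY region;
On ClickHouse's columnar storage, a scan-and-aggregate query like this reads only the columns it needs, so it can stay fast even as row counts grow into the hundreds of millions.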
The real challenge? Getting your MySQL data into ClickHouse continuously, reliably, and without duct-taped ETL scripts.
That’s where Estuary Flow comes in. With built-in support for MySQL change data capture (CDC) via the binary log, Flow lets you replicate data from MySQL to ClickHouse in real time—no code, no Kafka, no hassle.
In this guide, we’ll walk you through:
- Why MySQL isn't ideal for large-scale analytics
- How ClickHouse delivers speed and flexibility at analytical scale
- How to set up real-time MySQL to ClickHouse replication using Estuary Flow
If you're struggling to get fresh insights without slowing down your transactional database, this is the sync you've been looking for.
Why MySQL Isn’t Built for Modern Analytics
MySQL excels at transactional workloads—fast inserts, updates, and simple lookups. It’s a great choice for powering applications, but once you start running complex analytical queries, cracks begin to show.
As data grows, so do the pain points:
- Aggregations and joins become slow
- Dashboards lag or time out
- Read replicas struggle to keep up
- Analytical queries interfere with production traffic
You might try exporting data on a schedule or standing up read-only replicas, but those are short-term fixes. Batch ETL adds lag, and scaling vertically only buys time.
MySQL’s row-based storage, lack of vectorized execution, and tight coupling of reads and writes make it fundamentally limited for analytics. You end up choosing between up-to-date insights and application performance.
The truth is, MySQL wasn’t designed for large-scale, real-time analysis. If you want fast, flexible insights without slowing down production, you need a dedicated analytics engine.
That’s where ClickHouse comes in.
Why ClickHouse Complements MySQL
You don’t need to replace MySQL to improve analytics—you just need to offload the parts it wasn’t built for.
MySQL is excellent at handling operational data: orders, logins, transactions, and user updates. It ensures consistency, supports ACID properties, and powers mission-critical applications. That role shouldn't change.
ClickHouse, on the other hand, is built for speed and scale, perfect for analytical queries, dashboards, and real-time monitoring. It doesn’t compete with MySQL; it enhances it.
By syncing data from MySQL to ClickHouse, you get the best of both worlds:
- MySQL remains the source of truth, optimized for high-volume transactions
- ClickHouse becomes your analytics layer, tuned for exploration, aggregation, and speed
This separation of concerns reduces stress on production systems and opens the door to rich, real-time insights, without slowing down your apps or waiting on batch jobs.
How to Replicate MySQL to ClickHouse Using CDC and Estuary Flow
Traditionally, syncing data from MySQL to an analytics system like ClickHouse meant building custom ETL jobs, running batch exports, or deploying Kafka. These approaches often involve high latency, fragile scripts, and significant operational overhead.
ClickHouse even offers its own MySQL integration for simple syncs between the two systems—but it currently only supports SELECTs and INSERTs of your MySQL data from within ClickHouse. This isn’t the automated replication we’re looking for. After all, we’re trying to move our complex SELECT queries out of MySQL entirely.
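For reference, that integration looks something like this from the ClickHouse side: a query-time bridge to MySQL rather than replication (the host, database, table, and credentials below are placeholders).
-- Read a MySQL table on demand via ClickHouse's mysql() table function
SELECT id, status, total
FROM mysql('db.example.com:3306', 'shop', 'orders', 'flow_capture', 'your_password')
WHERE status = 'paid';
Every such query still hits MySQL directly, so it does nothing to take analytical load off the transactional database.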
Thankfully, there’s a solution.
Estuary Flow lets you stream data from MySQL to ClickHouse in real time—no code, no Kafka, and no manual sync logic. It does this by leveraging MySQL’s binary log (binlog), which captures every row-level change as it happens. This process is known as Change Data Capture (CDC).
Here’s how it works with Flow:
- Flow connects to your MySQL database and reads directly from the binlog, capturing inserts, updates, and deletes in real time
- It transforms those changes into structured events and stores them in versioned, schema-aware collections
- Using Flow’s Dekaf module, those collections are exposed as Kafka-compatible streams, which ClickHouse can consume directly via ClickPipes in real time
This architecture gives you a continuous pipeline that’s:
- Low-latency — Events reach ClickHouse seconds after they occur in MySQL
- Resilient — Built-in backfill, exactly-once or at-least-once delivery
- Schema-aware — Handles changes gracefully and enforces structure
- Fully managed — Flow handles orchestration, monitoring, and fault tolerance
Whether you’re using self-hosted MySQL, Amazon RDS, Aurora, or Cloud SQL, Estuary Flow supports it—and makes syncing to ClickHouse seamless.
Step-by-Step: Sync MySQL to ClickHouse with Estuary Flow
You can set up a fully managed, real-time replication pipeline from MySQL to ClickHouse in just a few steps using Estuary Flow—no Kafka, no scripts, and no infrastructure to manage.
Here’s how to do it:
Step 1: Prepare Your MySQL Environment for CDC
Before connecting to Flow, make sure your MySQL instance is configured for Change Data Capture using the binary log (binlog).
✅ Enable binlog with ROW format
SET GLOBAL binlog_format = 'ROW';
✅ Set binlog retention (recommended: 7+ days)
SET PERSIST binlog_expire_logs_seconds = 604800;
✅ Create a capture user with replication access
CREATE USER IF NOT EXISTS 'flow_capture'@'%' IDENTIFIED BY 'your_password';
GRANT REPLICATION CLIENT, REPLICATION SLAVE, SELECT ON *.* TO 'flow_capture';
✅ Set the time zone, if using DATETIME fields, to avoid conversion issues:
SET PERSIST time_zone = 'America/New_York'; -- Or your region
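Before moving on, it’s worth a quick sanity check from a MySQL session to confirm the settings and grants took effect (optional, but it can save a failed connection test later):
SHOW VARIABLES WHERE Variable_name IN ('binlog_format', 'binlog_expire_logs_seconds', 'time_zone');
SHOW GRANTS FOR 'flow_capture';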
If you're using Amazon RDS, Aurora, Google Cloud SQL, or Azure MySQL, follow the platform-specific setup steps to enable binlog access and networking.
Step 2: Capture Data from MySQL in Estuary Flow
- Log into Estuary Flow
- Go to Sources > + New Source
- Select MySQL from the connector catalog and click Capture
Now configure your connection:
- Name: A unique name like mysql_orders_capture
- Data Plane: Choose your preferred processing region
- Server Address: Your MySQL host (e.g., db.example.com:3306)
- Username / Password: The flow_capture credentials
If your database is behind a VPC or firewall, configure SSH forwarding via the Network Tunnel section.
Click Next, test the connection, and proceed.
Step 3: Select Tables and Define Collections
Once connected, Estuary Flow auto-detects CDC-enabled tables.
- Select one or more tables you want to replicate
- For any table without a primary key, manually assign a collection key (or add a key in MySQL, as sketched after this list)
- Review and customize the schema (optional)
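If you would rather give a key-less table a real primary key in MySQL instead of assigning a collection key in Flow, one option is to promote an existing natural key (table and column names here are hypothetical):
-- Give a key-less table a stable identity before capturing it
ALTER TABLE events ADD PRIMARY KEY (user_id, occurred_at);
Either approach works; what matters is that every captured table has a stable, unique key for Flow to use.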
Flow will generate versioned collections for each table, schema-enforced and ready for downstream streaming.
Click Publish to deploy your MySQL capture.
Step 4: Configure ClickHouse as the Destination
- Go to Destinations > + New Destination
- Search for ClickHouse and choose Materialization
- Enter your destination settings:
  - Name: e.g., clickhouse_orders_sync
  - Data Plane: Should match your capture’s region
  - Auth Token: Set a secure token — used by ClickHouse to authenticate
- Under Source Collections, click Modify and link your MySQL collections
Click Publish to finalize the materialization.
Behind the scenes, Flow uses Dekaf to expose collections as Kafka-style topics, ready for ClickHouse to consume—no Kafka cluster needed.
Step 5: Connect ClickHouse to Estuary via ClickPipes
In your ClickHouse Cloud UI:
- Go to Integrations → ClickPipes
- Add a new Kafka pipe with the following:
  - Broker: dekaf.estuary-data.com:9092
  - Protocol: SASL_SSL
  - SASL Mechanism: PLAIN
  - Username: The full Flow materialization name (e.g., your-org/clickhouse_orders_sync)
  - Password: The auth token you set in Step 4
  - Schema Registry URL: https://dekaf.estuary-data.com
- Select the topics (1 per table)
- Map the fields to your ClickHouse schema
- Save and activate the pipe
Within seconds, your MySQL data will begin streaming into ClickHouse in real time.
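Once the pipe is active, a quick query against the destination table confirms rows are landing and staying fresh (table and column names follow the hypothetical orders example used earlier):
-- Row count and the most recent change ingested from MySQL
SELECT count() AS row_count, max(created_at) AS latest_change
FROM orders;
Re-running it after inserting or updating a row in MySQL should show the change within a few seconds.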
3 Reasons to Stream MySQL to ClickHouse
Setting up a real-time sync from MySQL to ClickHouse isn’t just about modernizing your stack—it directly improves performance, visibility, and engineering velocity.
Here’s why teams are making the switch:
1. Offload Analytics Without Harming Transactions
Running analytical queries on your primary MySQL instance adds risk: slowdowns, replication lag, and blocked transactions. By replicating data to ClickHouse, you isolate your production database from reporting workloads, keeping apps fast and users happy.
ClickHouse becomes your dedicated analytics layer, optimized for heavy reads and complex queries. MySQL stays lean and focused on serving application traffic.
2. Real-Time Insights Without ETL Overhead
Batch pipelines and scheduled exports always lag behind reality. With Estuary Flow and MySQL CDC, changes stream into ClickHouse seconds after they happen. No more waiting for hourly jobs or stale dashboards—your analytics are always current.
Whether you're tracking user events, product metrics, or operational KPIs, ClickHouse gives you real-time visibility at scale.
3. No Kafka, No Scripts, No Headaches
Most change data pipelines require stitching together Kafka, custom consumers, and monitoring glue. With Flow, the entire process is fully managed: from initial backfill to streaming sync.
You configure it once, and it just works. No brokers to maintain, no retries to debug, no custom code to fix when schemas evolve.
Final Thoughts: Replicate MySQL to ClickHouse the Right Way
MySQL was built for transactions. ClickHouse was built for analytics. When you connect them using real-time replication, your architecture becomes both reliable and insightful.
With Estuary Flow, you can stream changes from MySQL to ClickHouse in minutes, without writing code, deploying Kafka, or managing brittle pipelines. The result? Real-time insights, clean separation of workloads, and a data stack that actually scales.
Next Steps
- Create your Estuary Flow account — Build your first MySQL-to-ClickHouse pipeline in minutes
- Explore tutorials — Learn how Flow works with MySQL, ClickHouse, and other systems
- Join our Slack community — Get support from engineers and real-time data pros
- Talk to us — We’ll help you design the right pipeline for your team, data volume, and latency needs
Related Guides
- MySQL to BigQuery — Sync MySQL to BigQuery for warehouse-scale reporting
- MySQL to Snowflake — Build real-time CDC pipelines to Snowflake
- MySQL CDC Explained — Understand binlog-based change capture in depth
FAQs
1. What is the best way to replicate MySQL to ClickHouse?
For continuously fresh analytics, use change data capture: read row-level changes from MySQL’s binlog and stream them into ClickHouse. A managed pipeline such as Estuary Flow handles backfill, schema enforcement, and delivery through ClickPipes without custom scripts or a Kafka cluster.
2. What tools can I use to sync MySQL to ClickHouse?
Options include ClickHouse’s built-in MySQL integration (query-time SELECTs and INSERTs only), hand-built batch ETL or Kafka pipelines, and managed CDC platforms like Estuary Flow, which this guide uses.
3. Can I replicate data from MySQL to ClickHouse manually?
Yes, with scheduled exports or ClickHouse’s MySQL integration, but those approaches add latency, put extra load on production, and require ongoing maintenance compared with streaming CDC.
4. Is CDC a better alternative to batch ETL for syncing MySQL to ClickHouse?
For most analytics use cases, yes. CDC delivers changes within seconds of the commit and avoids repeated full-table exports, while batch ETL introduces lag and periodic heavy reads on MySQL.

About the author
Team Estuary is a group of engineers, product experts, and data strategists building the future of real-time and batch data integration. We write to share technical insights, industry trends, and practical guides.
