
How to Sync MySQL to Elasticsearch in Real-Time (No Code + Logstash Options)

Learn how to sync MySQL to Elasticsearch in real time using Estuary Flow or Logstash. Capture updates, deletes, and schema changes with zero code.


To power fast search, analytics, or observability on top of your MySQL data, you need a way to move that data into Elasticsearch efficiently and in real time. But building a reliable pipeline between the two can be complex, especially if you’re relying on custom scripts or tools that don’t support schema changes or deletions.

In this guide, you’ll learn how to sync MySQL to Elasticsearch using two proven methods:

  • A no-code, real-time pipeline using Estuary Flow, powered by Change Data Capture (CDC)
  • A manual approach using Logstash with JDBC, based on Elastic’s documented method

Whether you're building a full-text search feature, real-time dashboards, or log analytics, this tutorial will help you set up a robust MySQL to Elasticsearch integration in just a few minutes.

Skip the complexity. Set up your MySQL to Elasticsearch pipeline in real time with Estuary Flow →

What Is MySQL?

[Image: MySQL popularity statistics. Data source: db-engines.com]

MySQL is one of the world’s most widely used open-source relational database management systems (RDBMS). It’s the backbone of countless web and enterprise applications — from eCommerce platforms to SaaS tools and content management systems.

MySQL is built on SQL (Structured Query Language) and offers a highly stable, performant, and well-supported database layer. It’s backed by Oracle and maintained through regular updates.

MySQL is ideal for use cases that require:

  • ACID-compliant transactions (see the sketch after this list)
  • Structured data relationships
  • CRUD operations at scale
  • High availability through replication or clustering
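
For example, here is a minimal sketch of a transactional write using the official mysql-connector-python client. The orders and inventory tables, database name, and credentials are hypothetical placeholders:

python
import mysql.connector

# mysql-connector-python leaves autocommit off, so these statements form one transaction.
conn = mysql.connector.connect(
    host="localhost", user="app_user", password="secret", database="my_db"
)
try:
    cur = conn.cursor()
    # Both writes succeed or fail together: the ACID guarantee.
    cur.execute("INSERT INTO orders (product_id, qty) VALUES (%s, %s)", (42, 1))
    cur.execute("UPDATE inventory SET stock = stock - 1 WHERE product_id = %s", (42,))
    conn.commit()
except mysql.connector.Error:
    conn.rollback()  # undo any partial writes
    raise
finally:
    conn.close()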

It's a reliable primary data store — but not built for fast, flexible search at scale. That’s where Elasticsearch comes in.

What Is Elasticsearch?


Elasticsearch is a distributed, open-source search and analytics engine designed for fast, scalable querying across all types of data — structured, semi-structured, unstructured, and geospatial. It’s built on Apache Lucene and forms the core of the Elastic Stack (formerly ELK Stack), which also includes Logstash, Kibana, and Beats.

Elasticsearch excels at full-text search, real-time indexing, and log analytics. It’s used by companies like GitHub, Cisco, and Citigroup to power search bars, monitor infrastructure, and surface insights from large volumes of streaming data.

Common Use Cases

  • Search bars and autocomplete for websites and apps
  • Real-time log and metrics analysis (e.g., DevOps, observability)
  • Security analytics and threat detection
  • Business intelligence dashboards
  • Application performance monitoring (APM)
  • E-commerce product search and personalization
  • Geospatial queries and location-based filtering

Elasticsearch supports official clients in popular languages like Python, Java, JavaScript (Node.js), Go, .NET, and PHP, making it a versatile choice for developers building search-driven applications.
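
As a quick illustration, here is a minimal sketch using the official Python client. The endpoint, API key, and products index are placeholder assumptions, not values from this tutorial:

python
from elasticsearch import Elasticsearch

# Connect to a cluster (endpoint and API key are placeholders).
es = Elasticsearch("https://localhost:9200", api_key="<your-api-key>")

# Index a document; Elasticsearch infers the field mappings on the fly.
es.index(index="products", id="1", document={"name": "running shoes", "price": 89.99})

# Full-text search with fuzziness, so the typo "runing" still matches "running".
resp = es.search(
    index="products",
    query={"match": {"name": {"query": "runing shoes", "fuzziness": "AUTO"}}},
)
for hit in resp["hits"]["hits"]:
    print(hit["_source"]["name"])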

Should You Use Elasticsearch Over MySQL?


While both MySQL and Elasticsearch store and query data, they’re built for fundamentally different purposes — and they work best when used together, not as replacements for each other.

  • MySQL is your source of truth — a transactional RDBMS designed for structured data, relationships, and ACID-compliant operations.
  • Elasticsearch is an analytics and search engine, optimized for full-text search, real-time indexing, and lightning-fast querying over massive datasets.

Why You Should Combine Them

Use MySQL to:

  • Manage core application data
  • Handle structured relationships (foreign keys, joins)
  • Perform inserts, updates, deletes (CRUD operations)

Use Elasticsearch to:

  • Enable fast, fuzzy, and complex searches
  • Power real-time dashboards and analytics
  • Index and query logs, metrics, and unstructured data

By syncing data from MySQL to Elasticsearch, you can:

  • Retain MySQL as your reliable operational database
  • Offload complex search and aggregation to Elasticsearch
  • Maintain resilience — even if Elasticsearch goes down, your source data is intact

Why Replacing MySQL with Elasticsearch Doesn’t Work

  • Elasticsearch isn’t built for transactional data or relational integrity
  • Mapping complex schemas from MySQL to JSON docs can be messy
  • Handling updates and deletes requires custom logic (see the sketch after this list)
  • It’s difficult to enforce constraints like foreign keys or uniqueness
  • Elasticsearch has limited support for multi-table relationships
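
To make the update and delete point concrete: with a hand-rolled sync, nothing removes an Elasticsearch document when its source row disappears from MySQL; your pipeline has to detect the deletion and issue the delete itself. A minimal sketch, where the index name and ID scheme are assumptions:

python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="<your-api-key>")

def on_row_deleted(row_id: int) -> None:
    """Called by your sync logic whenever it detects a deleted MySQL row."""
    # ignore_status=404 avoids an error if the document is already gone.
    es.options(ignore_status=404).delete(index="products", id=str(row_id))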

The best approach: Use MySQL as your primary database and Elasticsearch as a high-performance, secondary search layer.

In the next section, we’ll explore two ways to sync data between the two platforms, including a real-time, no-code method using Estuary Flow.

Method 1: Integrate MySQL and Elasticsearch with Estuary Flow

If you're looking for a fast, reliable, and maintenance-free way to sync MySQL with Elasticsearch — without writing code — Estuary Flow is the best solution.

Flow uses log-based Change Data Capture (CDC) to stream inserts, updates, and deletes from MySQL to Elasticsearch in real time. It handles schema changes automatically and can be configured in just a few clicks through its web app.

"Flow is 1000x times better than LogStash or Elastic Enterprise Data Ingestion Tool (for moving data from MySQL to Elastic)." - Pompato

Before You Start

Make sure you have:

  • An Estuary Flow account (you can register for free)
  • A MySQL database with binary logging enabled and credentials for a user with replication permissions
  • An Elasticsearch cluster, along with its endpoint URL and access credentials

Step 1: Capture Data from MySQL

  1. Log in to Estuary Flow
  2. Click Sources > + New Capture
  3. Select the MySQL connector
    Various MySQL capture connectors you can use as Estuary sources

  4. Enter your MySQL credentials (host, database, user, password)
  5. Click Next — Flow will list all available tables
  6. Select the tables you want to sync and click Save & Publish

Flow will now create CDC-based data collections for each selected MySQL table.
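
Log-based CDC readers like Flow’s MySQL connector depend on the server’s binary log; Estuary’s documentation lists the exact prerequisites. As a rough sanity check (this is a sketch, not Estuary tooling), you can confirm binary logging is on and row-based before connecting:

python
import mysql.connector

# Credentials are placeholders; use an account that can read server variables.
conn = mysql.connector.connect(host="localhost", user="root", password="secret")
cur = conn.cursor()
cur.execute("SHOW VARIABLES WHERE Variable_name IN ('log_bin', 'binlog_format')")
settings = dict(cur.fetchall())
assert settings.get("log_bin") == "ON", "binary logging must be enabled"
assert settings.get("binlog_format") == "ROW", "log-based CDC needs row-based binlog"
conn.close()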

Step 2: Materialize to Elasticsearch

  1. Click Materialize Collections on the capture confirmation screen
  2. Choose the Elasticsearch connector
    The Elasticsearch materialization connector page in Estuary Flow

  3. Enter your Elastic cluster URL, username, and password
  4. Map each MySQL data collection to a target Elasticsearch index
    • Optionally apply field overrides for type mapping
  5. Click Next > Save & Publish

All historical data from MySQL will be copied into Elasticsearch, and any new data changes will sync automatically within milliseconds.
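
One way to spot-check the initial backfill is to compare counts on both sides. A minimal sketch, assuming placeholder table and index names and credentials:

python
import mysql.connector
from elasticsearch import Elasticsearch

# Count rows in the source table (placeholder connection details).
mysql_conn = mysql.connector.connect(
    host="localhost", user="app_user", password="secret", database="my_db"
)
cur = mysql_conn.cursor()
cur.execute("SELECT COUNT(*) FROM my_table")
mysql_rows = cur.fetchone()[0]
mysql_conn.close()

# Count documents in the target index (placeholder endpoint and index).
es = Elasticsearch("https://localhost:9200", api_key="<your-api-key>")
es_docs = es.count(index="my_index")["count"]

print(f"MySQL rows: {mysql_rows}, Elasticsearch docs: {es_docs}")

Note that the two counts can differ briefly while new changes are still streaming through the pipeline.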

Enterprise-Ready Features

  • Secure Deployment Options: Estuary supports fully managed, private cloud, or BYOC deployments
  • Automatic retries & checkpointing: Ensures data consistency and resilience
  • Schema evolution: Adjusts to changes in MySQL schema without pipeline breakage
  • Low-latency streaming: Ideal for powering dashboards, APIs, and search experiences

🔗 Try Estuary Flow for free or reach out to our team for help with production pipelines and enterprise deployments.

For more help with this method, see the Estuary Flow documentation on the MySQL capture connector and the Elasticsearch materialization connector.

Method 2: Integrate MySQL and Elasticsearch with Elastic

Alternatively, you can use Elastic’s documented process using Logstash, which we’ve adapted below.

Note that this method is more challenging to implement, and won’t capture MySQL DELETE events. Though it’s an event-based integration, it’s not log-based change data capture; it instead relies on timestamps. Learn more about types of MySQL CDC here.

Gather Credentials

To specify the details of the connection, you’ll need your Cloud ID. Go to the Kibana main menu and select “Management” > “Integrations” > “View Deployment Details”.

To authenticate, you’ll use your Elasticsearch API key.

Get Logstash and the MySQL JDBC driver

  1. Download and install Logstash.
  2. Download and unpack the JDBC driver and take note of the driver’s location for later use. 

Add Timestamps to MySQL Tables

For each table you plan to move to Elasticsearch, you’ll need a timestamp column with the last updated time. 

This example adds a column called modification_time to the table my_table in the database my_db.

plaintext
USE my_db;
ALTER TABLE my_table ADD modification_time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP;

Set up a Logstash pipeline with the JDBC input plugin

Next, you’ll need to set up a pipeline to ingest data from MySQL via the JDBC input plugin.

  1. Create a new file called jdbc.conf in <localpath>/logstash-7.12.0.

    Paste the code below into the file to generate the Logstash pipeline, being sure to substitute your credentials and configuration details.
plaintext
input {
  jdbc {
    jdbc_driver_library => "<driverpath>/mysql-connector-java-<versionNumber>.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://<MySQL host>:3306/es_db"
    jdbc_user => "<myusername>"
    jdbc_password => "<mypassword>"
    jdbc_paging_enabled => true
    # Track the highest modification time seen so far; each run resumes from :sql_last_value
    tracking_column => "unix_ts_in_secs"
    use_column_value => true
    tracking_column_type => "numeric"
    # Poll MySQL every five seconds
    schedule => "*/5 * * * * *"
    statement => "SELECT *, UNIX_TIMESTAMP(modification_time) AS unix_ts_in_secs FROM es_table WHERE (UNIX_TIMESTAMP(modification_time) > :sql_last_value AND modification_time < NOW()) ORDER BY modification_time ASC"
  }
}

filter {
  mutate {
    # Reuse the MySQL primary key as the Elasticsearch document _id so updates overwrite
    copy => { "id" => "[@metadata][_id]" }
    remove_field => ["id", "@version", "unix_ts_in_secs"]
  }
}

output {
  elasticsearch {
    index => "rdbms_idx"
    ilm_enabled => false
    cloud_id => "<DeploymentName>:<ID>"
    cloud_auth => "elastic:<Password>"
    ssl => true
    api_key => "<myAPIid:myAPIkey>"
  }
}

With the new configuration saved, launch Logstash: 

plaintext
bin/logstash -f jdbc.conf

This method works for lightweight ingestion tasks, but doesn’t scale well for complex pipelines or production-grade applications.

✅ For log-based CDC, schema-aware syncing, and automated maintenance, Estuary Flow is a faster and more reliable option.

Conclusion

While MySQL excels at storing structured, transactional data, Elasticsearch is purpose-built for lightning-fast search and real-time analytics. Connecting the two allows you to power advanced use cases — from full-text search to observability dashboards — without compromising your core database.

You have two main options:

  • Estuary Flow: A no-code, real-time pipeline with built-in support for CDC, schema changes, retries, and high-volume scalability. Ideal for production systems that need to stay in sync with low latency.
  • Logstash with JDBC: A DIY option that works for simpler use cases, but lacks deletion tracking, real-time sync, and automation, and requires significant ongoing maintenance.

Our recommendation: use MySQL as your system of record, and Elasticsearch as a secondary index for search and analytics. Let a streaming tool like Estuary Flow keep them perfectly in sync, so you can focus on building.


FAQs

What is the best way to sync MySQL to Elasticsearch?

    The most reliable way to sync MySQL with Elasticsearch is by using a real-time, CDC-based pipeline like Estuary Flow. It captures inserts, updates, and deletes with minimal configuration and ensures your Elastic index stays in sync with your MySQL database at all times.

Can Logstash load MySQL data into Elasticsearch?

    Yes, Logstash can ingest MySQL data into Elasticsearch using the JDBC input plugin, but it has limitations. It doesn’t support deletes, relies on scheduled polling (not CDC), and requires manual setup. It's more suitable for basic or one-time ETL jobs, not real-time sync.

How do I keep MySQL and Elasticsearch in sync in real time?

    To keep MySQL and Elasticsearch in sync continuously, use a tool like Estuary Flow. It uses log-based Change Data Capture (CDC) to stream all database events — including inserts, updates, and deletes — to Elasticsearch within milliseconds.

About the author

Jeffrey Richman

With over 15 years in data engineering, Jeffrey Richman is a seasoned expert in driving growth for early-stage data companies, focusing on strategies that attract customers and users. His writing offers insights to help companies scale efficiently and effectively in an evolving data landscape.
