How to Connect MongoDB to Elasticsearch: Step-by-Step Guide

Seamlessly connect MongoDB to Elasticsearch with three easy methods. Unlock real-time integration and streamline data management.

Keeping business operations running smoothly and maintaining high data quality often requires moving data across platforms, which can be a complex process. One popular integration involves connecting MongoDB to Elasticsearch. When you load data from MongoDB, a popular NoSQL database, into Elasticsearch, a powerful search and analytics engine, you gain richer querying and real-time analytics for more insightful data analysis and decision-making.

In this article, we will look at three ways you can connect MongoDB to Elasticsearch to optimize your database systems.

MongoDB Overview

MongoDB is a document-oriented NoSQL database. It stores data in BSON (Binary JSON) format, which lets you keep nested, complex data in a single document.

MongoDB is renowned for its horizontal scalability, which makes it a good fit for applications with rapidly growing datasets. It handles increasing workloads and maintains high performance by distributing data across multiple servers or nodes. This is particularly useful for applications whose data growth is unpredictable, because you can add more machines to increase your database storage.

Apart from its impressive scalability, MongoDB is also known for its automatic sharding capabilities. This feature facilitates the distribution of data across clusters, enhancing read and write operations. This distributed approach ensures efficient data storage and improved fault tolerance.

Some of the key features of MongoDB include:

  • Aggregation Framework: MongoDB includes a robust aggregation framework that allows complex data transformations and analysis directly within the database. This framework supports filtering, grouping, sorting, and projecting data.
  • No-Schema Design: The no-schema design of MongoDB allows developers to update their data models without updating the entire database schema. This flexibility is beneficial in environments where requirements change frequently.
  • Ad-hoc Queries: Ad-hoc queries are short-lived queries whose values adapt to changing variables. MongoDB supports ad-hoc queries by field, by range, and by regular expression, which helps with real-time analytics.
  • Load Balancing: MongoDB can successfully handle multiple concurrent read and write requests due to its horizontal scaling features.
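
As a rough illustration of the aggregation stages and ad-hoc query types listed above, the sketch below writes them as plain Python dictionaries in the shape that MongoDB drivers such as PyMongo accept. The field names (status, city, amount, age, name) are hypothetical, and nothing here connects to a database.

```python
import json

# Ad-hoc query filters (hypothetical field names).
by_field = {"city": "Berlin"}                             # match on a field value
by_range = {"age": {"$gte": 18, "$lt": 65}}               # range query
by_regex = {"name": {"$regex": "^Jo", "$options": "i"}}   # regular expression

# Aggregation pipeline: filter, group, sort, then project.
pipeline = [
    {"$match": {"status": "active"}},
    {"$group": {"_id": "$city", "total": {"$sum": "$amount"}}},
    {"$sort": {"total": -1}},
    {"$project": {"_id": 0, "city": "$_id", "total": 1}},
]

# Print the definitions as JSON; a driver would receive them as-is.
for name, spec in [("by_field", by_field), ("by_range", by_range),
                   ("by_regex", by_regex), ("pipeline", pipeline)]:
    print(name, json.dumps(spec))
```

With a driver installed, a filter would typically be passed to a find call and the pipeline to an aggregate call on a collection.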

Elasticsearch Overview

Elasticsearch is a distributed, open-source search and analytics engine designed for horizontally scalable full-text search, data storage, and real-time analytics. It is a critical component of the Elastic Stack (formerly known as ELK Stack), which includes Logstash, Kibana, and Beats. Elasticsearch is used for various applications, such as log and event data analysis, web search engines, and business intelligence.

One of the notable features of Elasticsearch is that it facilitates rapid and accurate search operations, making it a preferred choice for applications that demand real-time insights. Its robust querying capabilities allow organizations to extract meaningful information quickly from vast amounts of structured or unstructured data. 

Some key features of Elasticsearch include:

  • Distributed and Scalable Architecture: Elasticsearch supports horizontal scaling of clusters by adding more nodes. It provides high availability and fault tolerance, making it suitable for large-scale applications.
  • Full-Text Search: It uses inverted indices to efficiently index and query large volumes of text data. This makes it useful for applications that require fast and accurate search across large datasets.
  • Searchable Snapshots: Elasticsearch lets you query your snapshots directly, which takes far less time than first completing a typical restore from the snapshot.
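
To make the inverted-index idea behind full-text search more concrete, here is a minimal sketch in plain Python. It is a toy model only, not how Elasticsearch is implemented: it lowercases text, splits it into terms, and maps each term to the set of document IDs containing it. The sample documents and function names are invented for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return IDs of documents containing every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


docs = {
    1: "MongoDB stores documents",
    2: "Elasticsearch indexes documents for search",
    3: "Search engines use inverted indices",
}
inverted = build_inverted_index(docs)
print(sorted(search(inverted, "documents")))         # [1, 2]
print(sorted(search(inverted, "search documents")))  # [2]
```

A lookup touches only the entries for the query terms rather than scanning every document, which is why this structure suits search over large text collections.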

Methods to Connect MongoDB to Elasticsearch in Real-Time

You can connect MongoDB to Elasticsearch using one of the three methods mentioned below.

Method 1: Using the MongoDB River Plugin to Connect MongoDB to Elasticsearch

You can use the Elasticsearch-River-MongoDB plugin to synchronize data from MongoDB to Elasticsearch. The River plugin monitors MongoDB's operation log (the oplog collection) and automatically applies the recorded changes to Elasticsearch based on its configuration. Once a change has been synchronized, the corresponding Elasticsearch index reflects the update.

How to use the Mongo River plugin:

Step 1: Install the Mongo River Plugin

  • Run the following command from the Elasticsearch installation directory to install the plugin:
plaintext
bin/plugin --install com.github.richardwilly98.elasticsearch/elasticsearch-river-mongodb/2.0.9
  • Make sure the plugin version is compatible with your version of Elasticsearch; the plugin's README lists the supported combinations. Rivers were deprecated in Elasticsearch 1.5 and removed in 2.0, so this method only works on older clusters.

Step 2: Create the Indexing River

  • Use the following command to create the indexing River:
plaintext
curl -XPUT 'http://localhost:9200/_river/mongodb/_meta' -d '{  "type": "mongodb",  "mongodb": {    "db": "DATABASE_NAME",    "collection": "COLLECTION",    "gridfs": true  },  "index": {    "name": "ES_INDEX_NAME",    "type": "ES_TYPE_NAME"  } }'
  • Here’s an example with specific values:
plaintext
curl -XPUT 'http://localhost:9200/_river/mongodb/_meta' -d '{  "type": "mongodb",  "mongodb": {    "db": "testmongo",    "collection": "person"  },  "index": {    "name": "mongoindex",    "type": "person"  } }'
  • Check the indexed data in Elasticsearch by searching the index created in the example above:
plaintext
curl -XGET 'http://localhost:9200/mongoindex/_search?pretty=true'
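
The river definition can also be assembled programmatically. The sketch below builds the same PUT request as the curl example using only Python's standard library. The helper name build_river_request is hypothetical, the request is constructed but not sent, and the endpoint only exists on old Elasticsearch versions that still support rivers.

```python
import json
import urllib.request


def build_river_request(db, collection, index_name, type_name,
                        es_url="http://localhost:9200", river_name="mongodb"):
    """Build (but do not send) the PUT request that registers a MongoDB river."""
    meta = {
        "type": "mongodb",
        "mongodb": {"db": db, "collection": collection},
        "index": {"name": index_name, "type": type_name},
    }
    return urllib.request.Request(
        url=f"{es_url}/_river/{river_name}/_meta",
        data=json.dumps(meta).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_river_request("testmongo", "person", "mongoindex", "person")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To send it against a running cluster (assumption: one is listening locally):
# urllib.request.urlopen(request)
```

Building the payload with json.dumps avoids the quoting mistakes that are easy to make when editing a long single-line curl command by hand.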

Method 2: Using Mongo Connector to Connect MongoDB to Elasticsearch

Mongo-Connector is an open-source, Python-based tool originally developed by MongoDB. It is designed for real-time synchronization and replicates documents from MongoDB to various target systems, including Elasticsearch.

It creates a pipeline from a MongoDB cluster to target systems like Elasticsearch. Mongo-Connector first copies the existing data from MongoDB to the target system, and then it tails the MongoDB oplog to keep the target in sync with subsequent changes.
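
The two phases described above, an initial copy followed by continuous synchronization, can be sketched with in-memory data. This is a simplified model, not Mongo-Connector's actual code: the change entries are loosely modeled on oplog records ("i" for insert, "u" for update, "d" for delete), real entries carry more fields and other update formats, and the target is a plain dictionary standing in for an Elasticsearch index.

```python
def without_id(doc):
    """Return the document body without its _id field."""
    return {key: value for key, value in doc.items() if key != "_id"}


def initial_copy(source_docs):
    """Phase 1: copy every existing document into the target, keyed by _id."""
    return {str(doc["_id"]): without_id(doc) for doc in source_docs}


def apply_change(target, entry):
    """Phase 2: apply one simplified change entry to the target."""
    op = entry["op"]
    if op == "i":    # insert: "o" holds the full document
        target[str(entry["o"]["_id"])] = without_id(entry["o"])
    elif op == "u":  # update: "o2" identifies the document, "o" holds $set fields
        doc_id = str(entry["o2"]["_id"])
        target.setdefault(doc_id, {}).update(entry["o"].get("$set", {}))
    elif op == "d":  # delete: "o" identifies the document
        target.pop(str(entry["o"]["_id"]), None)
    return target


source = [{"_id": 1, "name": "Ada"}, {"_id": 2, "name": "Grace"}]
target = initial_copy(source)

changes = [
    {"op": "i", "o": {"_id": 3, "name": "Edsger"}},
    {"op": "u", "o2": {"_id": 1}, "o": {"$set": {"name": "Ada Lovelace"}}},
    {"op": "d", "o": {"_id": 2}},
]
for change in changes:
    apply_change(target, change)

print(target)
```

Replaying changes in order is what keeps the target consistent, which is why tools of this kind depend on the oplog and therefore on a replica set.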

How to use Mongo Connector:

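Step 1: Install Elasticsearch DocManager and Mongo Connector

  • Install both Python packages with pip: mongo-connector itself and the Elasticsearch DocManager that matches your Elasticsearch version (for example, elastic2-doc-manager).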

Step 2: Create a MongoDB Replica Set

  • Mongo Connector reads changes from the MongoDB oplog, so MongoDB must run as a replica set. Start mongod with a replica set name using the following command:
plaintext
mongod --replSet myDevReplSet
  • To initialize your server as a replica set, run the following command in the mongo shell:
plaintext
rs.initiate()
  • Once the replica set is running, start the connector with the following command:
plaintext
mongo-connector -m <mongodb server hostname>:<replica set port> -t <replication endpoint URL, e.g. localhost:9200> -d <name of doc manager, e.g., elastic2_doc_manager>
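
If you launch the connector from a script, it can help to assemble the command from its parts. The sketch below only builds and prints the command line; it does not run it. The values are assumptions for a local setup: MongoDB on its default port 27017, Elasticsearch on its default port 9200, and a doc manager named elastic2_doc_manager, which depends on the DocManager package you installed.

```python
import shlex


def mongo_connector_command(mongo_address, target_url, doc_manager):
    """Return the mongo-connector invocation as an argument list."""
    return [
        "mongo-connector",
        "-m", mongo_address,  # MongoDB replica set member, host:port
        "-t", target_url,     # target system endpoint
        "-d", doc_manager,    # doc manager that writes to the target
    ]


command = mongo_connector_command(
    "localhost:27017", "localhost:9200", "elastic2_doc_manager"
)
print(shlex.join(command))

# To start the process (assumption: mongo-connector is installed and on PATH):
# import subprocess; subprocess.run(command, check=True)
```

Keeping the arguments in a list avoids shell-quoting problems if a hostname or URL ever contains special characters.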

Method 3: Using Estuary Flow to Connect MongoDB to Elasticsearch

If you prefer an easier, code-free approach without the need for extensive libraries, you can use Estuary Flow. It’s a fully managed tool with a user-friendly interface so you can easily build data pipelines and connect MongoDB to Elasticsearch.

Estuary Flow provides built-in connectors to capture data from sources, including MongoDB. These connectors capture data into Flow collections and materialize these collections into databases. You can use the Elasticsearch connector to materialize data from Flow collections.

Follow this step-by-step guide to connect MongoDB to Elasticsearch:

Step 1: Configure MongoDB as the Source

  • Create a free Estuary Flow account.
  • Navigate to Sources on the Estuary Flow dashboard and click + NEW CAPTURE.
  • Search for the MongoDB connector using the Search connectors field. When it appears in the search results, click its Capture button.
  • On the MongoDB connector page, enter the capture details like server Address, User, and Password. Then, click NEXT > SAVE AND PUBLISH.

Step 2: Configure Elasticsearch as the Destination

Once you successfully configure MongoDB as the source connector, a pop-up window with the capture details appears. In this pop-up window, click MATERIALIZE COLLECTIONS to start configuring the pipeline’s destination.

You can also navigate to the Destinations option on the left pane of the Estuary dashboard to begin configuring Elasticsearch as the destination.

  • Click the + NEW MATERIALIZATION button on the Destinations page.
  • Search for Elastic using the Search connectors field. When you see the connector in the search results, click on its Materialization button.
  • Enter the required configuration details like Name, Endpoint, Username, and Password. Click NEXT, and then SAVE AND PUBLISH.

Benefits of Using Estuary Flow

  • Pre-Built Connectors: Flow offers 200+ pre-built connectors for different databases. It helps avoid data migration issues by automating the entire process and lets you establish connections across various databases quickly.
  • Minimal Technical Expertise: Estuary Flow lets you migrate data from MongoDB to Elasticsearch in just a few clicks, making the process accessible to users with minimal technical expertise.
  • Data Cleansing: Flow allows you to clean, filter, and validate your data during the migration process, ensuring data quality and integrity.

Conclusion

Establishing a real-time connection from MongoDB to Elasticsearch enhances data management and analysis. This integration allows for seamless collaboration between MongoDB’s document-oriented flexibility and Elasticsearch’s powerful search and analytics capabilities. 

By adopting one of the three methods (the MongoDB River plugin, Mongo Connector, or Estuary Flow), you can connect MongoDB to Elasticsearch. However, the first two methods require manual setup and maintenance, which can be time-consuming and labor-intensive.

Estuary Flow can help overcome these challenges with an intuitive interface and real-time streaming capabilities.

Revolutionize your data integration with Estuary Flow! Explore the potential of our user-friendly, no-code platform to connect MongoDB to Elasticsearch seamlessly in real time. Sign up for a free account today to access 150+ native and 500+ third-party connectors and transform how you manage and analyze data.

FAQs

What is MongoDB used for?

MongoDB is a document-based database primarily used for developing scalable applications with evolving data schemas. It stores data in a JSON-like format and supports both structured and unstructured data.

How do you move data from MongoDB to Elasticsearch?

To move data from MongoDB to Elasticsearch, you can use custom scripts for manual migration or low-code ETL tools like Estuary Flow to automate the migration process. Logstash is another option for a direct pipeline, using an input plugin for MongoDB and an output plugin for Elasticsearch.
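
A custom script of the kind mentioned above typically has to turn MongoDB documents into the newline-delimited body that Elasticsearch's _bulk endpoint expects. The sketch below shows only that conversion step, using the standard library. The function name to_bulk_body, the index name, and the sample documents are hypothetical, and real ObjectId values are reduced to strings with str().

```python
import json


def to_bulk_body(docs, index_name):
    """Convert MongoDB-style documents into an Elasticsearch _bulk request body.

    Each document becomes two lines: an action line naming the index and ID,
    then the document itself. The MongoDB _id is moved into the action line
    because Elasticsearch treats _id as metadata, not as a document field.
    """
    lines = []
    for doc in docs:
        source = dict(doc)                 # copy so the input is not mutated
        doc_id = str(source.pop("_id"))
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # The bulk format requires a newline after every line, including the last.
    return "".join(line + "\n" for line in lines)


people = [{"_id": 1, "name": "Ada"}, {"_id": 2, "name": "Grace"}]
body = to_bulk_body(people, "mongoindex")
print(body, end="")
```

The resulting string would be sent in a POST request to the _bulk endpoint with the Content-Type header set to application/x-ndjson.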

Which is faster, MongoDB or Elasticsearch?

It depends on the workload. Elasticsearch is faster for search-related operations such as full-text queries, while MongoDB offers better scalability for storing and handling large, complex datasets.


About the author

Jeffrey Richman

Jeffrey Richman has over 15 years of experience in data engineering and is a seasoned expert in driving growth for early-stage data companies, focusing on strategies that attract customers and users. Jeffrey's extensive writing provides insights to help companies scale efficiently and effectively in an evolving data landscape.
