
Most modern applications need more than simple key-value lookups. They need full-text search, filtering across multiple fields, faceted search, autocomplete, or near real-time analytics over operational data. DynamoDB is excellent for low-latency transactional workloads, but it is not designed to be a full-text search engine.
A DynamoDB to Elasticsearch pipeline solves this by keeping DynamoDB as the primary application database and using Elasticsearch as the dedicated search and analytics index. As items are inserted, updated, or deleted in DynamoDB, those changes can be streamed into Elasticsearch so users can search fresh application data quickly.
In this post, we’ll look at two reliable methods to stream data from DynamoDB to Elasticsearch: using Estuary and using AWS Lambda with DynamoDB Streams.
If you are indexing more than DynamoDB, see the ways to get data into Elasticsearch, with a method-by-method comparison and a decision table.
How to Stream Data From DynamoDB to Elasticsearch
There are two methods you can use to stream data from DynamoDB to Elasticsearch:
- Method 1: Using Estuary for Streaming DynamoDB to Elasticsearch
- Method 2: Using AWS Lambda for DynamoDB Stream to Elasticsearch
| Method | Best for | Freshness | Complexity |
|---|---|---|---|
| Estuary | Managed DynamoDB to Elasticsearch pipelines using DynamoDB Streams | Real-time or near real-time | Low |
| AWS Lambda | Teams building custom AWS-native stream processing | Real-time or near real-time | Medium to high |
Method 1: Using Estuary for Streaming DynamoDB to Elasticsearch
Estuary can stream DynamoDB changes into Elasticsearch using DynamoDB Streams. Once streams are enabled on the DynamoDB tables you want to capture, Estuary continuously captures inserts, updates, and deletes into Estuary collections and then materializes those collections into Elasticsearch indices.
Prerequisites
- An Estuary account.
- One or more DynamoDB tables with DynamoDB Streams enabled.
- AWS credentials with permission to discover and read the relevant DynamoDB tables and streams.
- An Elasticsearch cluster with a known endpoint.
- An Elasticsearch role with the required privileges for the target indices.
- Network access between Estuary, DynamoDB, and Elasticsearch.
Estuary’s DynamoDB connector requires access to list tables in the AWS region. If you see an AccessDeniedException, check whether the IAM policy allows dynamodb:ListTables using the required table resource pattern.
Step 1: Configure DynamoDB as the Source
- Log in to your Estuary account.
- Click on the Sources tab on the left navigation pane.
- Click on the + NEW CAPTURE button.
- Next, search for DynamoDB using the Search connectors field and click the connector’s Capture button to begin configuring it as the data source.
- On the Create Capture page, enter the required details, such as Name, Access Key ID, Secret Access Key, and Region.
- After filling in the required fields, click on NEXT > SAVE AND PUBLISH. This will capture data from DynamoDB into Estuary collections.
Step 2: Configure Elasticsearch as the Destination
- Once the source is set, click MATERIALIZE COLLECTIONS in the pop-up window or the Destinations option on the dashboard.
- Click on the + NEW MATERIALIZATION button on the Destinations page.
- Type Elastic in the Search connectors box and click on the Materialization button of the connector when you see it in the search results.
- On the Create Materialization page, enter the details like Name, Endpoint, Username, Password, and Index Replicas.
- If your collection of data from DynamoDB isn’t filled automatically, you can add it manually using the Link Capture button in the Source Collections section.
- Finally, click on NEXT > SAVE AND PUBLISH to materialize data from your Flow collections to Elasticsearch.
- With the source and destination configured, Estuary will begin loading data from the Flow collections to Elasticsearch.
The Elasticsearch user or API key used by Estuary should have the monitor cluster privilege and read, write, view_index_metadata, and create_index privileges for the target indices.
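As a rough illustration, the privileges listed above could be expressed as a role body for Elasticsearch's security role API (`PUT /_security/role/<role-name>`). This is a sketch under assumptions: the index pattern `orders*` and the helper name `build_estuary_role` are hypothetical, and your cluster may manage roles through Kibana or another tool instead.

```python
import json


def build_estuary_role(index_patterns):
    """Return a role body granting the cluster and index privileges listed above."""
    return {
        "cluster": ["monitor"],
        "indices": [
            {
                # Hypothetical index patterns; use the indices Estuary will write to.
                "names": list(index_patterns),
                "privileges": [
                    "read",
                    "write",
                    "view_index_metadata",
                    "create_index",
                ],
            }
        ],
    }


role_body = build_estuary_role(["orders*"])
print(json.dumps(role_body, indent=2))
```

Scoping `names` to the target indices, rather than `*`, keeps the credentials limited to what the materialization needs.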
If you need deleted DynamoDB items to be removed from Elasticsearch search results, review the Elasticsearch connector’s delete behavior. Estuary tracks delete events with _meta/op set to d; depending on the destination configuration, you may want hard deletes instead of soft-delete markers.
Benefits of Using Estuary
Here are some of the benefits of Estuary.
- No-code Configuration: Estuary is designed to be user-friendly and does not require extensive technical expertise to configure the source and destination. Its library of over 200 connectors lets you set up a pipeline in just a few clicks.
- Real-time Data Processing With CDC: Estuary leverages Change Data Capture (CDC) for real-time data processing. This helps maintain data integrity and reduces latency.
- Scalability: Estuary is designed to handle large data flows and supports up to 7 GB/s, so the pipeline can keep up as data volumes in DynamoDB and Elasticsearch grow.
- Efficient Data Transformations: Estuary supports TypeScript and SQL transformations. TypeScript transformations are fully type-checked, which helps prevent common pipeline failures and protects data integrity. The platform’s native SQL transformations provide an easy-to-use alternative for reshaping, filtering, and joining data in real time.
Method 2: Using AWS Lambda for DynamoDB Stream to Elasticsearch
Streaming data from DynamoDB to Elasticsearch can significantly enhance your application’s search capabilities. Here are the detailed steps involved in this method that uses AWS Lambda for the integration.
Step 1: Create Your DynamoDB Table With Streams Enabled
- Create a DynamoDB table in the AWS Management Console.
- Enable DynamoDB Streams on the table and set the stream view type to New Image.
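If you prefer to script this step rather than use the console, it might look roughly like the following with boto3. This is a minimal sketch, not part of the original walkthrough: the table name `products`, the string partition key `id`, on-demand billing, and both helper names are assumptions chosen to match the sample Lambda code later in this post.

```python
def build_create_table_request(table_name, key_name="id"):
    """Build CreateTable parameters with a stream that emits the new item image."""
    return {
        "TableName": table_name,
        "AttributeDefinitions": [{"AttributeName": key_name, "AttributeType": "S"}],
        "KeySchema": [{"AttributeName": key_name, "KeyType": "HASH"}],
        "BillingMode": "PAY_PER_REQUEST",  # assumption: on-demand capacity
        "StreamSpecification": {
            "StreamEnabled": True,
            "StreamViewType": "NEW_IMAGE",  # the "New Image" view type from the step above
        },
    }


def create_table(table_name, region="us-east-1"):
    """Create the table in AWS. Requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the request builder works without AWS access

    client = boto3.client("dynamodb", region_name=region)
    return client.create_table(**build_create_table_request(table_name))


request = build_create_table_request("products")
print(request["StreamSpecification"])
```

Calling `create_table("products")` would send the request; the builder is kept separate so the parameters can be reviewed first.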
Step 2: Create an IAM Role for Lambda Execution
- Your Lambda function needs permission to read from DynamoDB and write to your Elasticsearch domain.
- Create an IAM role whose policies grant permissions for Amazon Elasticsearch Service (ES), DynamoDB Streams, and Lambda execution logging.
Here’s an example:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "es:ESHttpPost",
        "es:ESHttpPut",
        "es:ESHttpDelete",
        "dynamodb:DescribeStream",
        "dynamodb:GetRecords",
        "dynamodb:GetShardIterator",
        "dynamodb:ListStreams",
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "*"
    }
  ]
}
```
Step 3: Create an Elasticsearch Domain
- In the AWS Management Console, create an Amazon OpenSearch Service domain.
Note: AWS renamed Amazon Elasticsearch Service to Amazon OpenSearch Service, though older AWS examples and existing domains may still use Elasticsearch terminology.
- Configure the domain settings as needed, including access policies to allow the Lambda function to post data.
Step 4: Create a Lambda Function
- Create a Lambda function and choose a runtime (e.g., Python or Node.js).
- Write the function code to process records from DynamoDB streams and post them to Elasticsearch.
Here is sample code in Python:
```python
import boto3
import requests
from requests_aws4auth import AWS4Auth

region = 'us-east-1'
service = 'es'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)

host = 'https://search-ddb-to-es-r7dcdoy4caeoklst3yseumqmre.us-east-1.es.amazonaws.com'  # the Amazon ES domain, with https://
index = 'lambda-index'
url = host + '/' + index + '/_doc/'
headers = {"Content-Type": "application/json"}


def handler(event, context):
    count = 0
    for record in event['Records']:
        # Get the primary key for use as the Elasticsearch ID
        doc_id = record['dynamodb']['Keys']['id']['S']
        if record['eventName'] == 'REMOVE':
            # The item was deleted in DynamoDB, so delete its document as well
            r = requests.delete(url + doc_id, auth=awsauth)
        else:
            # INSERT or MODIFY: index the new image under the same ID
            document = record['dynamodb']['NewImage']
            r = requests.put(url + doc_id, auth=awsauth, json=document, headers=headers)
        count += 1
    return str(count) + ' records processed.'
```
Note: In production, convert DynamoDB AttributeValue objects into normal JSON before indexing, choose a stable document ID from the table key, and use the bulk API for batches instead of one request per record.
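The production note above can be sketched as follows. This is an illustrative example, not the original sample: `from_attribute_value` and `build_bulk_body` are hypothetical helper names, the table is assumed to have a single string key named `id`, and binary attribute types are deliberately left unsupported. The resulting body is newline-delimited JSON intended for the `_bulk` endpoint with an `application/x-ndjson` content type.

```python
import json
from decimal import Decimal


def from_attribute_value(av):
    """Convert one DynamoDB AttributeValue, e.g. {"S": "x"}, into a plain value."""
    (tag, value), = av.items()
    if tag in ("S", "BOOL"):
        return value
    if tag == "N":
        num = Decimal(value)
        return int(num) if num == num.to_integral_value() else float(num)
    if tag == "NULL":
        return None
    if tag == "L":
        return [from_attribute_value(v) for v in value]
    if tag == "M":
        return {k: from_attribute_value(v) for k, v in value.items()}
    if tag == "SS":
        return sorted(value)
    if tag == "NS":
        return [from_attribute_value({"N": n}) for n in value]
    raise ValueError("unsupported AttributeValue type: " + tag)


def build_bulk_body(records, index):
    """Turn stream records into one NDJSON body for the _bulk endpoint."""
    lines = []
    for record in records:
        # Stable document ID taken from the table key (assumed to be "id").
        doc_id = str(from_attribute_value(record["dynamodb"]["Keys"]["id"]))
        if record["eventName"] == "REMOVE":
            lines.append(json.dumps({"delete": {"_index": index, "_id": doc_id}}))
        else:
            image = record["dynamodb"]["NewImage"]
            doc = {k: from_attribute_value(v) for k, v in image.items()}
            lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
            lines.append(json.dumps(doc))
    # Every line, including the last, must end with a newline.
    return "".join(line + "\n" for line in lines)


sample_records = [
    {"eventName": "INSERT", "dynamodb": {
        "Keys": {"id": {"S": "p1"}},
        "NewImage": {"id": {"S": "p1"}, "price": {"N": "19.99"}, "tags": {"L": [{"S": "new"}]}},
    }},
    {"eventName": "REMOVE", "dynamodb": {"Keys": {"id": {"S": "p2"}}}},
]
body = build_bulk_body(sample_records, "lambda-index")
print(body)
```

Sending one bulk request per Lambda batch replaces many small HTTP calls, and the bulk response reports success or failure per item, which is worth checking before treating the batch as processed.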
Step 5: Configure The DynamoDB Stream Trigger
- Open the Lambda function in the console and add a new trigger.
- Select DynamoDB as the trigger type and choose the DynamoDB table created in Step 1.
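The same trigger can also be created programmatically as a Lambda event source mapping. The sketch below assumes boto3; the ARNs, the function name `ddb-to-es`, the batch size, the retry count, and the SQS queue used as an on-failure destination are all placeholder assumptions to adapt, not values from this walkthrough.

```python
def build_event_source_mapping(stream_arn, function_name, on_failure_arn):
    """Build CreateEventSourceMapping parameters for a DynamoDB stream trigger."""
    return {
        "EventSourceArn": stream_arn,
        "FunctionName": function_name,
        "StartingPosition": "TRIM_HORIZON",  # start from the oldest retained record
        "BatchSize": 100,                    # assumption: tune for your write volume
        "MaximumRetryAttempts": 3,           # stop retrying a failing batch after 3 attempts
        "BisectBatchOnFunctionError": True,  # split failing batches to isolate bad records
        "DestinationConfig": {"OnFailure": {"Destination": on_failure_arn}},
    }


def create_trigger(stream_arn, function_name, on_failure_arn, region="us-east-1"):
    """Create the mapping in AWS. Requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the parameter builder works without AWS access

    client = boto3.client("lambda", region_name=region)
    params = build_event_source_mapping(stream_arn, function_name, on_failure_arn)
    return client.create_event_source_mapping(**params)


mapping = build_event_source_mapping(
    "arn:aws:dynamodb:us-east-1:123456789012:table/products/stream/2024-01-01T00:00:00.000",
    "ddb-to-es",
    "arn:aws:sqs:us-east-1:123456789012:ddb-to-es-dlq",
)
print(mapping["StartingPosition"], mapping["BatchSize"])
```

With an on-failure destination configured, the function's execution role would also need permission to send messages to that queue.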
Step 6: Test the Setup
- After the setup, make changes to your Amazon DynamoDB table and verify that the changes are reflected in your Elasticsearch domain.
- You can use Kibana or the Elasticsearch API to query or visualize the data and confirm that it matches the changes made in DynamoDB.
These steps complete a DynamoDB Streams to Elasticsearch pipeline using AWS Lambda. However, this method has several limitations:
- DynamoDB Streams 24-hour retention limit: DynamoDB Streams retains records for only 24 hours. If the Lambda function fails to process records within this window, those records are lost permanently.
- Lambda Function Code and Dependencies: As your data streaming requirements evolve, you’ll need to update your Lambda function to handle schema changes, add error handling, etc., which can add extra operational overhead.
- Technical Expertise: Building custom Lambda functions requires solid knowledge of both programming and the AWS ecosystem, which can be a barrier for non-technical users.
- No automatic historical backfill: DynamoDB Streams starts capturing changes after streams are enabled. If you need existing table data in Elasticsearch, you must run a separate backfill.
- Batch failure handling: Lambda retries failed batches, so one bad record can block progress unless you configure partial batch response, retries, and a DLQ.
- Mapping conflicts: DynamoDB’s flexible item structure can create Elasticsearch mapping conflicts if the same attribute appears with different types.
- Indexing throughput: High-write DynamoDB tables may require batching, bulk indexing, concurrency tuning, and backpressure handling.
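To illustrate the batch failure handling point above, here is a minimal sketch of a handler that uses Lambda's partial batch response format. It assumes the event source mapping has `ReportBatchItemFailures` enabled in its function response types; `index_record` is a hypothetical stand-in for the real indexing call and only simulates a failure.

```python
def index_record(record):
    """Hypothetical stand-in for writing one stream record to Elasticsearch."""
    if record["eventName"] != "REMOVE" and "NewImage" not in record["dynamodb"]:
        raise ValueError("record has no NewImage to index")


def handler(event, context):
    failures = []
    for record in event["Records"]:
        try:
            index_record(record)
        except Exception:
            # Report the first failed record and stop, so Lambda retries from
            # this sequence number and per-shard ordering is preserved.
            failures.append({"itemIdentifier": record["dynamodb"]["SequenceNumber"]})
            break
    return {"batchItemFailures": failures}


sample_event = {"Records": [
    {"eventName": "INSERT", "dynamodb": {"SequenceNumber": "100", "NewImage": {"id": {"S": "a"}}}},
    {"eventName": "MODIFY", "dynamodb": {"SequenceNumber": "200"}},
    {"eventName": "INSERT", "dynamodb": {"SequenceNumber": "300", "NewImage": {"id": {"S": "c"}}}},
]}
result = handler(sample_event, None)
print(result)
```

Records before the reported sequence number are treated as processed, so only the failed record and those after it are retried. Combined with a retry limit and an on-failure destination, this keeps one bad record from blocking the shard until the 24-hour retention window expires.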
Key Takeaways
Streaming DynamoDB changes to Elasticsearch gives your application fast full-text search and analytics over fresh operational data while DynamoDB remains the primary database. A custom AWS Lambda function can implement this, but it is time-consuming to build, requires extensive technical expertise, and is prone to errors.
Estuary is an excellent solution for those who want an easy and automated way to stream data from DynamoDB to Elasticsearch without the need for extensive technical knowledge. The method you choose depends on your needs and level of expertise.
If your Elasticsearch project also includes document databases, see our guide to syncing MongoDB to Elasticsearch.
Estuary provides an extensive and growing list of connectors, robust functionality, and a user-friendly interface. Sign up today to simplify and automate your DynamoDB to Elasticsearch pipeline.
FAQs
Is DynamoDB Streams the same as Kinesis Data Streams?

About the author
Jeffrey is a data engineering professional with over 15 years of experience, helping early-stage data companies scale by combining technical expertise with growth-focused strategies. His writing shares practical insights on data systems and efficient scaling.












