Estuary

How to Transfer Data From Redshift to Elasticsearch (2 Methods)

Want to move data from Amazon Redshift to Elasticsearch? This blog comprehensively discusses two popular methods for a seamless workflow.

redshift to elasticsearch - blog hero image
Share this article

Amazon Redshift is built for large-scale warehouse analytics, but it is not designed to serve low-latency full-text search or application search directly. Elasticsearch fills that gap by indexing Redshift data into searchable documents for fast filtering, search, dashboards, and operational analytics.

A Redshift to Elasticsearch pipeline is useful when curated warehouse data needs to power customer-facing search, internal lookup tools, reporting dashboards, or fast exploration outside the warehouse.

In this guide, we’ll compare two ways to move Redshift data into Elasticsearch: using Estuary’s Amazon Redshift Batch Query connector and using custom scripts with S3 and Elasticsearch bulk loading.

Redshift is one of many sources you can sync; our overview of ingesting data into Elasticsearch covers the available methods and when each one fits.

Redshift to Elasticsearch Data Mapping Considerations

Before loading Redshift data into Elasticsearch, decide how warehouse rows should become Elasticsearch documents. Redshift stores relational tables optimized for analytical SQL queries, while Elasticsearch stores denormalized JSON documents optimized for search, filtering, and aggregations.

Key decisions include:

  • Document IDs: Use a stable primary key or business key from Redshift as the Elasticsearch document ID so repeated syncs update existing documents instead of creating duplicates.
  • Text vs keyword fields: Use text for fields that need full-text search, such as names, descriptions, notes, or comments. Use keyword for filters, exact matches, IDs, status values, categories, and aggregations.
  • Dates and timestamps: Format Redshift date, timestamp, and timestamptz values consistently before indexing.
  • Numeric fields: Map Redshift integers, decimals, floats, and doubles to appropriate Elasticsearch numeric types. Be careful with high-precision decimal values.
  • Denormalization: If Elasticsearch queries need fields from multiple Redshift tables, join or transform that data before indexing so each document contains the fields needed for search.
  • Deletes: Decide how records removed from Redshift should be handled in Elasticsearch. For batch query pipelines, you may need a soft-delete field, a complete index rebuild, or a query that emits deletion state.
  • Schema changes: Plan how new columns, renamed fields, or changed data types in Redshift should be handled downstream.

For production search applications, define Elasticsearch mappings intentionally instead of relying only on dynamic mapping. This helps avoid mapping conflicts and improves filtering, sorting, aggregations, and relevance tuning.

How to Connect Redshift to Elasticsearch: 2 Methods

There are two practical ways to move Redshift data into Elasticsearch. Estuary is better for managed recurring sync from Redshift queries, while custom scripts are better suited for one-time exports, prototypes, or teams that want to fully own the pipeline.

MethodBest forFreshnessComplexity
Estuary Amazon Redshift Batch QueryManaged recurring sync from Redshift query resultsScheduled or recurring, depending on capture configurationLow to medium
Custom scripts with S3 and Elasticsearch bulk loadingOne-time loads or fully custom batch pipelinesManual or scheduled batchHigh
  • Method 1: Using Estuary to sync Redshift query results to Elasticsearch
  • Method 2: Using custom scripts to export Redshift data to S3 and load it into Elasticsearch

Redshift to Elasticsearch - CTA Banner

Method 1: Using Estuary to Achieve Redshift to Elasticsearch Integration

Estuary is a managed data integration platform that can capture Redshift query results into Estuary collections and materialize those collections into Elasticsearch indices. For Redshift, Estuary uses the Amazon Redshift Batch Query connector, which periodically executes configured queries and converts the results into JSON documents.

Prerequisites

  • An Estuary account.
  • An Amazon Redshift cluster or Redshift Serverless workgroup accessible from Estuary.
  • Redshift connection details, including host, port, database, username, and password.
  • SQL queries that return the Redshift data you want to capture.
  • An Elasticsearch cluster with a known endpoint.
  • An Elasticsearch user or API key with the required destination privileges.
  • Network access between Estuary, Redshift, and Elasticsearch, including allowlisting where required.

Step 1: Configure Redshift as the Source

Redshift to elasticsearch - Estuary Main Dashboard
  • Sign in to your Estuary account and select Sources from the main dashboard.
Redshift to elasticsearch - New Capture Page
  • On the Sources page, click on + NEW CAPTURE.
  • In the Search connectors field, type Amazon Redshift. Once it appears, click on the Capture button of Amazon Redshift Batch. 
Redshift to elasticsearch - Redshift Create Capture Page
  • On the Create Capture page, provide the Redshift connection details and query configuration. At minimum, include a unique capture name, host or server address, port, database, user, password, and the query or queries Estuary should run to capture data into collections.

The Amazon Redshift Batch Query connector will periodically execute the configured query and capture the result rows into Estuary collections as JSON documents.

Step 2: Configure Elasticsearch as the Destination

  • After setting up Redshift as the source end of the ETL pipeline, you must configure Elasticsearch as the destination. To do this, go to the main dashboard and click on Destinations from the left-side panel.
Redshift to elasticsearch - New Materialization
  • Click on + NEW MATERIALIZATION on the Destinations page.
Redshift to elasticsearch - Elastic search connector page
  • In the Search connectors box, type Elastic. When you see the connector in the search results, click on its Materialization button.
Redshift to elasticsearch - elasticsearch materialization page
  • On the Create Materialization page, enter your NameEndpoint, Username, and Password.
  • While collections added to your capture will automatically get added to your materialization, you can use the Source Collections section to link a capture to your materialization.
  • Click on NEXT > SAVE AND PUBLISH.

The Elasticsearch role used by Estuary should have the monitor cluster privilege and read, write, view_index_metadata, and create_index privileges for the target indices.

Estuary’s Elasticsearch connector materializes Estuary collections into Elasticsearch indices. It can automatically create indices for materialization bindings and lets you customize the target index name when needed.

Why Use Estuary for Redshift to Elasticsearch

  • Managed Redshift capture: Estuary’s Amazon Redshift Batch Query connector captures query results from Redshift into Flow collections.
  • Recurring sync: Configure Redshift queries to run on a schedule so Elasticsearch can be refreshed without manually exporting files.
  • Elasticsearch materialization: Estuary materializes captured collections into Elasticsearch indices without requiring you to build and maintain custom bulk loading scripts.
  • Schema-aware pipelines: Estuary collections have schemas, which helps validate data before it reaches Elasticsearch.
  • Transformations: You can reshape, filter, or enrich captured collections before materializing them into Elasticsearch.
  • Operational simplicity: Estuary reduces the need to manage S3 export jobs, Python loaders, retries, and indexing scripts yourself.

Method 2: Using Custom Scripts to Load Data From Redshift to Elasticsearch

You can use custom scripts to extract data from Redshift and load it into an S3 bucket and then into Elasticsearch. Let’s look into the steps to do this.

Step 1: Load Data from Amazon Redshift to S3

In this step, you can use the SQL ‘UNLOAD’ command to export data from Redshift to S3.

sql
UNLOAD ('SELECT * FROM your_schema.your_table') TO 's3://your-s3-bucket/your-folder/' IAM_ROLE 'arn:aws:iam::123456789012:role/YourRedshiftUnloadRole' DELIMITER ',' ALLOWOVERWRITE HEADER;

For production exports, prefer an IAM role over embedding AWS access keys in SQL. Also consider exporting as Parquet or JSON depending on how your Elasticsearch loading script will parse and index the data.

For this, you must:

  • Replace table with your Redshift table name.
  • Provide the details of the S3 bucket where you will be loading the data.
  • Provide your AWS credentials to access the S3 bucket. 

Step 2: Load Data from S3 to Elasticsearch 

In this step, you use a Python script to load data from S3 to Elasticsearch. 

Redshift to elasticsearch - Manual code
Image Source
Redshift to elasticsearch - Export data from S3 to elasticsearch

Here, the boto3 and Lambda functions can be used to export data from the S3 bucket to Elasticsearch. This completes the custom method of loading data from Amazon Redshift to Elasticsearch. 

Limitations of Using Custom Scripts to Load Data from Redshift to Elasticsearch

  • No built-in incremental sync: You must design your own logic for detecting new, updated, and deleted Redshift records.
  • Delete handling: Records removed from Redshift will not automatically disappear from Elasticsearch unless your script explicitly handles deletes.
  • Mapping conflicts: Redshift column types and Elasticsearch field mappings may not align cleanly, especially for text, timestamps, and high-precision decimals.
  • Operational overhead: You must own S3 exports, Lambda or worker execution, retries, error handling, monitoring, credentials, and alerting.
  • Performance tuning: Large exports may require chunking, compression, bulk API tuning, index refresh tuning, and backpressure handling.
  • Cost: Redshift query execution, S3 storage, data transfer, compute, and Elasticsearch indexing all add operational cost.

The Takeaway

Connecting Redshift to Elasticsearch lets teams keep Redshift as the warehouse for analytical SQL while using Elasticsearch for fast search, filtering, and operational analytics.

Custom scripts can work for one-time exports or highly customized batch pipelines, but they require you to manage Redshift UNLOAD jobs, S3 files, parsing, Elasticsearch bulk indexing, retries, monitoring, and delete handling.

Estuary is a better fit when you want a managed way to capture Redshift query results and materialize them into Elasticsearch indices without maintaining custom scripts.

Redshift is one of many sources you can sync; our overview of ingesting data into Elasticsearch covers the available methods and when each one fits.

If your Elasticsearch project also includes other warehouse sources, see our guides to syncing Snowflake to Elasticsearch and moving BigQuery data into Elasticsearch.

Sign up for Estuary to start building your Redshift to Elasticsearch pipeline.

FAQs

  1. How is data stored in Redshift?

Amazon Redshift stores data in columnar format. Each data block holds values from a single column of the table. This allows Redshift to read only those columns that are required for query execution, improving performance significantly. 

  1. What data type does Elasticsearch support?

Elasticsearch supports different data types, such as the core data type, which includes string, date, numeric, boolean, and binary. It also supports complex data types, such as arrays, objects, nested, and geo data types.

  1. What are some of the prominent use cases of Elasticsearch? 

Elasticsearch is a popular analytical and search engine for storing, searching, and analyzing structured and unstructured data. It is commonly used for full-text search, log analytics, business analytics, security, and operational intelligence.

Start streaming your data for free

Build a Pipeline

About the author

Picture of Rob Meyer
Rob MeyerTechnical Product Marketing (Data & Integration)

Rob is a technical product marketing leader with expertise in data engineering, databases, and integration technologies. He has previously worked with WSO2, Firebolt, Imply, GridGain, Axway, Informatica, and TIBCO, focusing on data platforms, APIs, and real-world data movement solutions.

Streaming Pipelines.
Simple to Deploy.
Simply Priced.
$0.50/GB of data moved + $.14/connector/hour;
50% less than competing ETL/ELT solutions;
<100ms latency on streaming sinks/sources.