
How to Stream Snowflake Data to Kafka – A Complete Guide

Stream Snowflake CDC data to Kafka in real time! Learn how to set up an automated, no-code pipeline or a custom manual integration for seamless event streaming.

Real-time data movement between a cloud data warehouse and an event-driven platform is critical for modern data-driven applications. By streaming Snowflake data to Kafka, businesses can instantly:

  • Trigger alerts for security and operational insights.
  • Update real-time dashboards for better decision-making.
  • Synchronize Snowflake data with microservices, databases, and AI models.

For example, a fraud detection system monitoring financial transactions in Snowflake can instantly publish suspicious activity to Kafka, alerting security teams in real time.

How Does Snowflake CDC to Kafka Work?

To stream data efficiently, we use Change Data Capture (CDC)—a method that tracks inserts, updates, and deletes in Snowflake and immediately pushes changes into Kafka topics. No more batch delays—your data flows in real time!

Two Ways to Move Data from Snowflake to Kafka

This guide explores two powerful methods to set up a Snowflake-to-Kafka pipeline:

  1. Estuary Flow (No-Code, Real-Time CDC) – A fully automated, low-latency pipeline that connects Snowflake to Kafka without any coding.
  2. Manual Integration (Snowflake Streams + Custom Scripts) – A DIY approach using Snowflake Streams, AWS Lambda, and Kafka Connect to stream data.

While both methods can move data, Estuary Flow offers a faster, more reliable, and low-maintenance approach.

Let’s dive in! 

Method 1: Streaming Snowflake to Kafka with Estuary Flow (No-Code Approach)

When it comes to real-time data streaming from Snowflake to Kafka, a no-code solution like Estuary Flow offers the most efficient, scalable, and low-latency approach. Unlike traditional manual methods that require custom scripts, Snowflake Streams, and batch processing, Estuary Flow simplifies the process with an automated Change Data Capture (CDC) pipeline—ensuring real-time synchronization without the complexity.

Why Choose Estuary Flow for Snowflake CDC to Kafka?

  • Fully Automated CDC (Change Data Capture): No need to manually track inserts, updates, and deletes.
  • Real-Time Streaming: Moves Snowflake data to Kafka with sub-100ms latency.
  • No-Code Setup: Eliminates the need for writing complex SQL, Python, or Kafka Connect scripts.
  • Scalable and Reliable: Handles large-scale data transfers without performance bottlenecks.
  • Schema Evolution Support: Automatically adjusts for new columns and data structure changes.

Now, let’s walk through the step-by-step process of setting up a real-time Snowflake to Kafka pipeline using Estuary Flow.

Prerequisites

Before getting started, ensure you have:

  • An Estuary Flow account (Sign up for free at Estuary)
  • Snowflake database with read access and necessary privileges
    • Tip: To quickly create a Snowflake user with the right privileges for Estuary, check out our example setup script (a rough, generic sketch of this kind of setup also follows this list)
  • Kafka cluster (self-hosted, AWS MSK, or Confluent Cloud)
  • Kafka authentication credentials (if using SASL or TLS security)
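
If you just want a sense of what that setup involves, here is a purely illustrative sketch — not Estuary's official script, so prefer the setup script linked above — that provisions a dedicated role and user with read access plus a schema the connector can manage. All names, the warehouse, and the password are placeholders:

python
# Purely illustrative: Estuary's example setup script is the authoritative
# reference. Role, user, warehouse, database, and schema names below are
# placeholders -- substitute your own.
import snowflake.connector

SETUP_STATEMENTS = [
    "CREATE ROLE IF NOT EXISTS ESTUARY_ROLE",
    "CREATE USER IF NOT EXISTS ESTUARY_USER PASSWORD = '<strong-password>' DEFAULT_ROLE = ESTUARY_ROLE",
    "GRANT ROLE ESTUARY_ROLE TO USER ESTUARY_USER",
    # Read access to the data you want to capture
    "GRANT USAGE ON WAREHOUSE COMPUTE_WH TO ROLE ESTUARY_ROLE",
    "GRANT USAGE ON DATABASE MY_DB TO ROLE ESTUARY_ROLE",
    "GRANT USAGE ON SCHEMA MY_DB.PUBLIC TO ROLE ESTUARY_ROLE",
    "GRANT SELECT ON ALL TABLES IN SCHEMA MY_DB.PUBLIC TO ROLE ESTUARY_ROLE",
    # A schema the connector can use for the streams and tables it manages
    # (see the optional Schema Name setting in Step 1 below)
    "CREATE SCHEMA IF NOT EXISTS MY_DB.ESTUARY_STAGING",
    "GRANT ALL ON SCHEMA MY_DB.ESTUARY_STAGING TO ROLE ESTUARY_ROLE",
]

# Run the statements as an administrator
conn = snowflake.connector.connect(
    user="admin_user",
    password="admin_password",
    account="your_account",
    role="ACCOUNTADMIN",
)
cur = conn.cursor()
for statement in SETUP_STATEMENTS:
    cur.execute(statement)
cur.close()
conn.close()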

Or, if you’d rather send your data directly to a Kafka consumer, like Tinybird or ClickHouse, check out Estuary’s Dekaf connector instead. With Dekaf, Estuary acts as the Kafka broker and hosts the associated schema registry for you.

Step 1: Configure Snowflake as a Source in Estuary Flow

1. Log in to Estuary Flow

  • Visit Estuary Flow and sign in to your account.
  • Navigate to the Dashboard.

2. Add Snowflake as a Source

Search for the Snowflake capture connector
  • On the left panel, click Sources.
  • Click the + NEW CAPTURE button.
  • In the Search Connectors box, type Snowflake.
  • Select Snowflake Data Cloud and click Capture.

3. Configure the Snowflake Connector

On the connector configuration page, enter:

Snowflake capture configuration
  • Name: Give a unique name to your data capture pipeline.
  • Snowflake Host URL: The unique Snowflake instance URL.
  • Account: The Snowflake account identifier.
  • Database Name: Specify the Snowflake database containing the data.
  • Warehouse Name: Choose the Snowflake compute warehouse that will handle queries.
  • User & Password: Provide authentication credentials.
  • Schema Name: (Advanced/Optional) The schema in which Flow will manage its streams and tables.

Once done, click NEXT > SAVE AND PUBLISH.

🔹 The Snowflake CDC connector will now track all inserts, updates, and deletes in real time and store Snowflake tables as collections in Estuary Flow.

Step 2: Configure Kafka as a Destination in Estuary Flow

1. Navigate to the Destinations Tab

Search for the Kafka materialization connector
  • After setting up the Snowflake capture, you’ll see a summary popup.
  • Click MATERIALIZE COLLECTIONS to begin configuring Kafka as the destination.
  • Alternatively, go to Destinations in the left menu and click + NEW MATERIALIZATION.

2. Add Kafka as a Destination

  • In the Search Connectors field, type Kafka.
  • Select the Kafka Connector and click Materialization.

3. Configure the Kafka Connector

On the Create Materialization page, enter:

Kafka materialization configuration
  • Name: A unique identifier for the Kafka materialization.
  • Bootstrap Servers: The Kafka broker endpoints (e.g., broker1:9092,broker2:9092).
  • Topic Name: The Kafka topic where Snowflake data will be streamed.
  • Serialization Format: Choose JSON or Avro for message encoding.
    • Note that the Avro format will also require a schema registry.
  • Authentication Details (if required):
    • Auth Type: SASL, TLS, or other supported mechanisms.
    • Username & Password (if using SASL authentication).
  • Estuary Flow will automatically map Snowflake data collections to Kafka topics.
  • If needed, manually link the source collections by selecting SOURCE FROM CAPTURE.

Click NEXT > SAVE AND PUBLISH to start streaming Snowflake CDC to Kafka in real time.

How the Kafka Connector Works in Estuary Flow

  • The Kafka materialization connector streams new, updated, and deleted Snowflake records to Kafka topics.
  • Delta updates ensure only changed data is transferred, reducing network and processing costs.
  • The connector automatically handles retries and error recovery, ensuring data consistency.
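
Once the materialization is live, a quick way to confirm messages are arriving is to read the topic from the consumer side. Here's a minimal sketch using the kafka-python library; the topic name, broker addresses, and JSON encoding are assumptions — match them to whatever you configured above:

python
# Minimal spot-check consumer using kafka-python. The topic name, broker
# addresses, and JSON encoding are assumptions -- match them to the
# materialization you configured above.
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "transactions_topic",
    bootstrap_servers=["broker1:9092", "broker2:9092"],
    auto_offset_reset="earliest",  # start from the beginning of the topic
    value_deserializer=lambda m: json.loads(m.decode("utf-8")),
)

for message in consumer:
    # Each value is one change record materialized from Snowflake
    print(message.topic, message.offset, message.value)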

Streaming Snowflake data to Kafka in real time no longer needs complex engineering work. With Estuary Flow, you get a fully automated, no-code CDC solution that moves live Snowflake data to Kafka topics effortlessly.

🚀 Ready to simplify your Snowflake-to-Kafka pipeline?
Try Estuary Flow for a faster, smarter, and more scalable way to stream Snowflake CDC to Kafka.

Want to see a manual approach instead? Keep reading for Method 2: DIY Snowflake to Kafka Integration.

Method 2: Manually Streaming Snowflake to Kafka (DIY Approach)

If you prefer a custom-built pipeline without third-party ETL tools, you can manually integrate Snowflake CDC with Kafka. This approach involves:

  1. Capturing Snowflake data changes using Snowflake Streams.
  2. Extracting new records using scheduled queries.
  3. Pushing data to Kafka via Python, AWS Lambda, or Kafka Connect.

While this method provides more control, it also requires significant engineering effort for setup, maintenance, and scalability.

Prerequisites

Before proceeding, ensure you have:

  • A Snowflake account with admin privileges
  • A Kafka cluster (self-hosted, AWS MSK, or Confluent Cloud)
  • Kafka authentication credentials (if using SASL, TLS, etc.)
  • A compute environment (AWS Lambda, Python, or Java)

Step 1: Enable Snowflake CDC Using Snowflake Streams

1. Create a Table in Snowflake

Or use an existing table.

sql
CREATE TABLE transactions (
    id INT PRIMARY KEY,
    user_id INT,
    amount DECIMAL(10,2),
    transaction_date TIMESTAMP
);

2. Create a Snowflake Stream

A stream in Snowflake tracks INSERTs, UPDATEs, and DELETEs on a table.

sql
CREATE STREAM transactions_stream ON TABLE transactions APPEND_ONLY = FALSE;

This stream captures every change in the transactions table.

3. Verify Streamed Changes

Check the Snowflake CDC queue:

sql
SELECT * FROM transactions_stream;
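
Before wiring up a producer, it helps to see what the stream actually returns: each row contains the table's columns plus change-metadata columns such as METADATA$ACTION and METADATA$ISUPDATE. Here's a small sketch using snowflake-connector-python; credentials and object names are placeholders:

python
# Peek at pending changes on the stream, including Snowflake's change
# metadata columns. Credentials and object names are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    user="your_username",
    password="your_password",
    account="your_account",
    warehouse="your_warehouse",
    database="your_database",
    schema="PUBLIC",
)
cur = conn.cursor()

# A plain SELECT does not consume the stream; its offset only advances when
# the stream is read by a DML statement (like the INSERT used in Step 2).
cur.execute(
    "SELECT id, amount, METADATA$ACTION, METADATA$ISUPDATE "
    "FROM transactions_stream"
)
for row in cur.fetchall():
    print(row)

cur.close()
conn.close()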

Step 2: Extract Data from Snowflake & Format for Kafka

1. Set Up a Scheduled Task to Export Data

Snowflake doesn’t push data automatically, so we need a scheduled task to extract new records.

sql
CREATE OR REPLACE TASK export_to_kafka
  SCHEDULE = '1 MINUTE'
AS
  INSERT INTO staging_table
  SELECT * FROM transactions_stream;

This task moves CDC changes to a staging table every minute. Note that Snowflake creates tasks in a suspended state, so you'll also need to run ALTER TASK export_to_kafka RESUME; before the schedule starts firing.

Step 3: Load Data from Snowflake to Kafka

At this stage, we need a Kafka Producer to pick up staged data and push it into Kafka. This can be done via:

  • Python with Kafka-Python
  • AWS Lambda function
  • Kafka Connect (JDBC Source Connector)

Option 1: Using Python to Push Data to Kafka

python
from kafka import KafkaProducer
import snowflake.connector
import json

# Connect to Snowflake
conn = snowflake.connector.connect(
    user='your_username',
    password='your_password',
    account='your_account'
)

# Fetch new transactions from the stream
cur = conn.cursor()
cur.execute("SELECT * FROM transactions_stream")
rows = cur.fetchall()

# Kafka Producer
producer = KafkaProducer(
    bootstrap_servers=['broker1:9092', 'broker2:9092'],
    # default=str handles non-JSON types such as DECIMAL and TIMESTAMP values
    value_serializer=lambda v: json.dumps(v, default=str).encode('utf-8')
)

# Publish to Kafka
for row in rows:
    producer.send('transactions_topic', row)

# Make sure all buffered messages are delivered before exiting
producer.flush()
print("Data sent to Kafka!")

# Close connections
cur.close()
conn.close()

✅ This script:

  • Pulls new data from Snowflake Streams
  • Converts it into JSON format
  • Publishes records to Kafka topics
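
If you'd rather run this serverlessly, the same logic can be wrapped in the AWS Lambda option mentioned above. The sketch below is illustrative only: it assumes credentials arrive via environment variables, that kafka-python and snowflake-connector-python are bundled with the function (for example in a Lambda layer), and that the function is triggered on a schedule (e.g. by an Amazon EventBridge rule) to mirror the task from Step 2:

python
# Illustrative AWS Lambda variant of the producer script. Environment
# variable names, topic, and brokers are placeholders, and kafka-python plus
# snowflake-connector-python must be packaged with the function.
import json
import os

import snowflake.connector
from kafka import KafkaProducer


def lambda_handler(event, context):
    # Pull pending changes from the Snowflake stream
    conn = snowflake.connector.connect(
        user=os.environ["SNOWFLAKE_USER"],
        password=os.environ["SNOWFLAKE_PASSWORD"],
        account=os.environ["SNOWFLAKE_ACCOUNT"],
    )
    cur = conn.cursor()
    cur.execute("SELECT * FROM transactions_stream")
    rows = cur.fetchall()

    # Publish each row to Kafka as JSON (default=str handles DECIMAL/TIMESTAMP)
    producer = KafkaProducer(
        bootstrap_servers=os.environ["KAFKA_BROKERS"].split(","),
        value_serializer=lambda v: json.dumps(v, default=str).encode("utf-8"),
    )
    for row in rows:
        producer.send("transactions_topic", row)
    producer.flush()

    cur.close()
    conn.close()
    return {"records_published": len(rows)}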

Option 2: Using Kafka Connect JDBC Source Connector

Alternatively, you can use Kafka Connect to sync data:

json
{
  "name": "snowflake-source-connector",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:snowflake://your_snowflake_url",
    "connection.user": "your_username",
    "connection.password": "your_password",
    "table.whitelist": "transactions_stream",
    "mode": "incrementing",
    "incrementing.column.name": "id",
    "topic.prefix": "snowflake-"
  }
}

Kafka Connect automates ingestion but still requires manual configuration.
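
For completeness, connectors are registered with a running Kafka Connect worker through its REST API. Here's a minimal sketch, assuming the worker listens on localhost:8083 and the JSON above is saved as snowflake-source.json (both are assumptions):

python
# Register the JDBC source connector with a Kafka Connect worker via its
# REST API. The worker URL and the config file name are assumptions.
import json

import requests

with open("snowflake-source.json") as f:
    connector_config = json.load(f)

resp = requests.post(
    "http://localhost:8083/connectors",
    json=connector_config,
)
resp.raise_for_status()
print("Connector created:", resp.json()["name"])

Keep in mind the JDBC source connector also needs the Snowflake JDBC driver available on the Connect worker's classpath before it can reach your account.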

Step 4: Validate Kafka Messages

Once data is pushed to Kafka, you can validate it using the Kafka CLI:

plaintext
kafka-console-consumer --bootstrap-server broker1:9092 --topic transactions_topic --from-beginning

✅ This confirms data is streaming from Snowflake to Kafka successfully.

Key Challenges with the Manual Approach

  • Not Truly Real-Time – Data updates depend on scheduled queries instead of live event streaming.
  • High Engineering Effort – Requires SQL, Python, Kafka Connect, and DevOps knowledge.
  • Complex Debugging – Harder to troubleshoot failures, duplicates, and missing data.
  • No Built-In Schema Handling – Changes to Snowflake require manual adjustments in Kafka.

Snowflake to Kafka: Manual Approach vs. Automated (Estuary Flow)

| Feature | Manual Setup (Snowflake Streams + Scripts) | Estuary Flow (No-Code CDC) |
| --- | --- | --- |
| Ease of Use | Complex setup, requires SQL + Python | No-code UI, set up in minutes |
| Real-Time CDC | Delayed (scheduled queries) | Sub-100ms latency |
| Schema Evolution | Needs manual adjustments | Auto-detects schema changes |
| Scalability | Hard to scale | Built for large-scale pipelines |
| Maintenance | Requires monitoring & debugging | Fully managed & automated |
| Reliability | Risk of data loss | Built-in error handling |

 

Final Thoughts: The Best Way to Stream Snowflake Data to Kafka

When choosing a Snowflake to Kafka integration, your decision comes down to efficiency, scalability, and ease of use.

🔹 Manual integration (Snowflake Streams + Kafka Connect) offers customization, but it’s complex, time-consuming, and requires ongoing maintenance. You’ll need to write and maintain scripts, troubleshoot issues, and manually handle schema changes—which slows down real-time analytics.

🔹 Estuary Flow, on the other hand, provides a fully automated, no-code CDC solution that moves Snowflake data to Kafka in real time. It eliminates batch processing delays, handles schema evolution seamlessly, and ensures sub-100ms data latency—all without writing a single line of code.

Why struggle with manual setups? Get started with Estuary Flow today and unlock effortless, real-time data streaming from Snowflake to Kafka!


