
How to Stream Snowflake Data to Kafka – A Complete Guide

Stream Snowflake CDC data to Kafka in real time! Learn how to set up an automated, no-code pipeline or a custom manual integration for seamless event streaming.

Real-time data movement between a cloud data warehouse and an event-driven platform is critical for modern data-driven applications. By streaming Snowflake data to Kafka, businesses can instantly:

  • Trigger alerts for security and operational insights.
  • Update real-time dashboards for better decision-making.
  • Synchronize Snowflake data with microservices, databases, and AI models.

For example, a fraud detection system monitoring financial transactions in Snowflake can instantly publish suspicious activity to Kafka, alerting security teams in real time.

How Does Snowflake CDC to Kafka Work?

To stream data efficiently, we use Change Data Capture (CDC)—a method that tracks inserts, updates, and deletes in Snowflake and immediately pushes changes into Kafka topics. No more batch delays—your data flows in real time!

Two Ways to Move Data from Snowflake to Kafka

This guide explores two powerful methods to set up a Snowflake-to-Kafka pipeline:

  1. Estuary Flow (No-Code, Real-Time CDC) – A fully automated, low-latency pipeline that connects Snowflake to Kafka without any coding.
  2. Manual Integration (Snowflake Streams + Custom Scripts) – A DIY approach using Snowflake Streams, AWS Lambda, and Kafka Connect to stream data.

While both methods can move data, Estuary Flow offers a faster, more reliable, and low-maintenance approach.

Let’s dive in! 

Method 1: Streaming Snowflake to Kafka with Estuary Flow (No-Code Approach)

When it comes to real-time data streaming from Snowflake to Kafka, a no-code solution like Estuary Flow offers the most efficient, scalable, and low-latency approach. Unlike traditional manual methods that require custom scripts, Snowflake Streams, and batch processing, Estuary Flow simplifies the process with an automated Change Data Capture (CDC) pipeline—ensuring real-time synchronization without the complexity.

Why Choose Estuary Flow for Snowflake CDC to Kafka?

  • Fully Automated CDC (Change Data Capture): No need to manually track inserts, updates, and deletes.
  • Real-Time Streaming: Moves Snowflake data to Kafka with sub-100ms latency.
  • No-Code Setup: Eliminates the need for writing complex SQL, Python, or Kafka Connect scripts.
  • Scalable and Reliable: Handles large-scale data transfers without performance bottlenecks.
  • Schema Evolution Support: Automatically adjusts for new columns and data structure changes.

Now, let’s walk through the step-by-step process of setting up a real-time Snowflake to Kafka pipeline using Estuary Flow.

Prerequisites

Before getting started, ensure you have:

  • An Estuary Flow account (Sign up for free at Estuary)
  • Snowflake database with read access and necessary privileges
    • Tip: To quickly create a Snowflake user with the right privileges for Estuary, check out our example setup script (a rough, generic sketch of this kind of setup also follows this list)
  • Kafka cluster (self-hosted, AWS MSK, or Confluent Cloud)
  • Kafka authentication credentials (if using SASL or TLS security)
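
If you just want a sense of what that setup involves, here is a purely illustrative sketch — not Estuary's official script, so prefer the setup script linked above — that provisions a dedicated role and user with read access plus a schema the connector can manage. All names, the warehouse, and the password are placeholders:

python
# Purely illustrative: Estuary's example setup script is the authoritative
# reference. Role, user, warehouse, database, and schema names below are
# placeholders -- substitute your own.
import snowflake.connector

SETUP_STATEMENTS = [
    "CREATE ROLE IF NOT EXISTS ESTUARY_ROLE",
    "CREATE USER IF NOT EXISTS ESTUARY_USER PASSWORD = '<strong-password>' DEFAULT_ROLE = ESTUARY_ROLE",
    "GRANT ROLE ESTUARY_ROLE TO USER ESTUARY_USER",
    # Read access to the data you want to capture
    "GRANT USAGE ON WAREHOUSE COMPUTE_WH TO ROLE ESTUARY_ROLE",
    "GRANT USAGE ON DATABASE MY_DB TO ROLE ESTUARY_ROLE",
    "GRANT USAGE ON SCHEMA MY_DB.PUBLIC TO ROLE ESTUARY_ROLE",
    "GRANT SELECT ON ALL TABLES IN SCHEMA MY_DB.PUBLIC TO ROLE ESTUARY_ROLE",
    # A schema the connector can use for the streams and tables it manages
    # (see the optional Schema Name setting in Step 1 below)
    "CREATE SCHEMA IF NOT EXISTS MY_DB.ESTUARY_STAGING",
    "GRANT ALL ON SCHEMA MY_DB.ESTUARY_STAGING TO ROLE ESTUARY_ROLE",
]

# Run the statements as an administrator
conn = snowflake.connector.connect(
    user="admin_user",
    password="admin_password",
    account="your_account",
    role="ACCOUNTADMIN",
)
cur = conn.cursor()
for statement in SETUP_STATEMENTS:
    cur.execute(statement)
cur.close()
conn.close()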

Or, if you’d rather send your data directly to a Kafka consumer, like Tinybird or ClickHouse, check out Estuary’s Dekaf connector instead. With Dekaf, Estuary acts as the Kafka broker and hosts the associated schema registry for you.

Step 1: Configure Snowflake as a Source in Estuary Flow

1. Log in to Estuary Flow

  • Visit Estuary Flow and sign in to your account.
  • Navigate to the Dashboard.

2. Add Snowflake as a Source

Search for the Snowflake capture connector
  • On the left panel, click Sources.
  • Click the + NEW CAPTURE button.
  • In the Search Connectors box, type Snowflake.
  • Select Snowflake Data Cloud and click Capture.

3. Configure the Snowflake Connector

On the connector configuration page, enter:

Snowflake capture configuration
  • Name: Give a unique name to your data capture pipeline.
  • Snowflake Host URL: The unique Snowflake instance URL.
  • Account: The Snowflake account identifier.
  • Database Name: Specify the Snowflake database containing the data.
  • Warehouse Name: Choose the Snowflake compute warehouse that will handle queries.
  • User & Password: Provide authentication credentials.
  • Schema Name: (Advanced/Optional) The schema in which Flow will manage its streams and tables.

Once done, click NEXT > SAVE AND PUBLISH.

🔹 The Snowflake CDC connector will now track all inserts, updates, and deletes in real time and store Snowflake tables as collections in Estuary Flow.

Step 2: Configure Kafka as a Destination in Estuary Flow

1. Navigate to the Destinations Tab

Search for the Kafka materialization connector
  • After setting up the Snowflake capture, you’ll see a summary popup.
  • Click MATERIALIZE COLLECTIONS to begin configuring Kafka as the destination.
  • Alternatively, go to Destinations in the left menu and click + NEW MATERIALIZATION.

2. Add Kafka as a Destination

  • In the Search Connectors field, type Kafka.
  • Select the Kafka Connector and click Materialization.

3. Configure the Kafka Connector

On the Create Materialization page, enter:

Kafka materialization configuration
  • Name: A unique identifier for the Kafka materialization.
  • Bootstrap Servers: The Kafka broker endpoints (e.g., broker1:9092,broker2:9092).
  • Topic Name: The Kafka topic where Snowflake data will be streamed.
  • Serialization Format: Choose JSON or Avro for message encoding.
    • Note that the Avro format will also require a schema registry.
  • Authentication Details (if required):
    • Auth Type: SASL, TLS, or other supported mechanisms.
    • Username & Password (if using SASL authentication).
  • Estuary Flow will automatically map Snowflake data collections to Kafka topics.
  • If needed, manually link the source collections by selecting SOURCE FROM CAPTURE.

Click NEXT > SAVE AND PUBLISH to start streaming Snowflake CDC to Kafka in real time.

How the Kafka Connector Works in Estuary Flow

  • The Kafka materialization connector streams new, updated, and deleted Snowflake records to Kafka topics.
  • Delta updates ensure only changed data is transferred, reducing network and processing costs.
  • The connector automatically handles retries and error recovery, ensuring data consistency.
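
Once the materialization is live, a quick way to confirm messages are arriving is to read the topic from the consumer side. Here's a minimal sketch using the kafka-python library; the topic name, broker addresses, and JSON encoding are assumptions — match them to whatever you configured above:

python
# Minimal spot-check consumer using kafka-python. The topic name, broker
# addresses, and JSON encoding are assumptions -- match them to the
# materialization you configured above.
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "transactions_topic",
    bootstrap_servers=["broker1:9092", "broker2:9092"],
    auto_offset_reset="earliest",  # start from the beginning of the topic
    value_deserializer=lambda m: json.loads(m.decode("utf-8")),
)

for message in consumer:
    # Each value is one change record materialized from Snowflake
    print(message.topic, message.offset, message.value)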

Streaming Snowflake data to Kafka in real time no longer needs complex engineering work. With Estuary Flow, you get a fully automated, no-code CDC solution that moves live Snowflake data to Kafka topics effortlessly.

🚀 Ready to simplify your Snowflake-to-Kafka pipeline?
Try Estuary Flow for a faster, smarter, and more scalable way to stream Snowflake CDC to Kafka.

Want to see a manual approach instead? Keep reading for Method 2: DIY Snowflake to Kafka Integration.

Method 2: Manually Streaming Snowflake to Kafka (DIY Approach)

If you prefer a custom-built pipeline without third-party ETL tools, you can manually integrate Snowflake CDC with Kafka. This approach involves:

  1. Capturing Snowflake data changes using Snowflake Streams.
  2. Extracting new records using scheduled queries.
  3. Pushing data to Kafka via Python, AWS Lambda, or Kafka Connect.

While this method provides more control, it also requires significant engineering effort for setup, maintenance, and scalability.

Prerequisites

Before proceeding, ensure you have:

  • A Snowflake account with admin privileges
  • A Kafka cluster (self-hosted, AWS MSK, or Confluent Cloud)
  • Kafka authentication credentials (if using SASL, TLS, etc.)
  • A compute environment (AWS Lambda, Python, or Java)

Step 1: Enable Snowflake CDC Using Snowflake Streams

1. Create a Table in Snowflake

Or use an existing table.

sql
CREATE TABLE transactions (
    id INT PRIMARY KEY,
    user_id INT,
    amount DECIMAL(10,2),
    transaction_date TIMESTAMP
);

2. Create a Snowflake Stream

A stream in Snowflake tracks INSERTs, UPDATEs, and DELETEs on a table.

sql
CREATE STREAM transactions_stream ON TABLE transactions APPEND_ONLY = FALSE;

This stream captures every change in the transactions table.

3. Verify Streamed Changes

Check the Snowflake CDC queue:

sql
SELECT * FROM transactions_stream;
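
Before wiring up a producer, it helps to see what the stream actually returns: each row contains the table's columns plus change-metadata columns such as METADATA$ACTION and METADATA$ISUPDATE. Here's a small sketch using snowflake-connector-python; credentials and object names are placeholders:

python
# Peek at pending changes on the stream, including Snowflake's change
# metadata columns. Credentials and object names are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    user="your_username",
    password="your_password",
    account="your_account",
    warehouse="your_warehouse",
    database="your_database",
    schema="PUBLIC",
)
cur = conn.cursor()

# A plain SELECT does not consume the stream; its offset only advances when
# the stream is read by a DML statement (like the INSERT used in Step 2).
cur.execute(
    "SELECT id, amount, METADATA$ACTION, METADATA$ISUPDATE "
    "FROM transactions_stream"
)
for row in cur.fetchall():
    print(row)

cur.close()
conn.close()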

Step 2: Extract Data from Snowflake & Format for Kafka

1. Set Up a Scheduled Task to Export Data

Snowflake doesn’t push data automatically, so we need a scheduled task to extract new records.

sql
CREATE OR REPLACE TASK export_to_kafka
  SCHEDULE = '1 MINUTE'
AS
  INSERT INTO staging_table
  SELECT * FROM transactions_stream;

This task moves CDC changes to a staging table every minute. Note that Snowflake creates tasks in a suspended state, so you'll also need to run ALTER TASK export_to_kafka RESUME; before the schedule starts firing.

Step 3: Load Data from Snowflake to Kafka

At this stage, we need a Kafka Producer to pick up staged data and push it into Kafka. This can be done via:

  • Python with Kafka-Python
  • AWS Lambda function
  • Kafka Connect (JDBC Source Connector)

Option 1: Using Python to Push Data to Kafka

python
from kafka import KafkaProducer
import snowflake.connector
import json

# Connect to Snowflake
conn = snowflake.connector.connect(
    user='your_username',
    password='your_password',
    account='your_account'
)

# Fetch new transactions from the stream
cur = conn.cursor()
cur.execute("SELECT * FROM transactions_stream")
rows = cur.fetchall()

# Kafka Producer
producer = KafkaProducer(
    bootstrap_servers=['broker1:9092', 'broker2:9092'],
    # default=str handles non-JSON types such as DECIMAL and TIMESTAMP values
    value_serializer=lambda v: json.dumps(v, default=str).encode('utf-8')
)

# Publish to Kafka
for row in rows:
    producer.send('transactions_topic', row)

# Make sure all buffered messages are delivered before exiting
producer.flush()
print("Data sent to Kafka!")

# Close connections
cur.close()
conn.close()

✅ This script:

  • Pulls new data from Snowflake Streams
  • Converts it into JSON format
  • Publishes records to Kafka topics
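
If you'd rather run this serverlessly, the same logic can be wrapped in the AWS Lambda option mentioned above. The sketch below is illustrative only: it assumes credentials arrive via environment variables, that kafka-python and snowflake-connector-python are bundled with the function (for example in a Lambda layer), and that the function is triggered on a schedule (e.g. by an Amazon EventBridge rule) to mirror the task from Step 2:

python
# Illustrative AWS Lambda variant of the producer script. Environment
# variable names, topic, and brokers are placeholders, and kafka-python plus
# snowflake-connector-python must be packaged with the function.
import json
import os

import snowflake.connector
from kafka import KafkaProducer


def lambda_handler(event, context):
    # Pull pending changes from the Snowflake stream
    conn = snowflake.connector.connect(
        user=os.environ["SNOWFLAKE_USER"],
        password=os.environ["SNOWFLAKE_PASSWORD"],
        account=os.environ["SNOWFLAKE_ACCOUNT"],
    )
    cur = conn.cursor()
    cur.execute("SELECT * FROM transactions_stream")
    rows = cur.fetchall()

    # Publish each row to Kafka as JSON (default=str handles DECIMAL/TIMESTAMP)
    producer = KafkaProducer(
        bootstrap_servers=os.environ["KAFKA_BROKERS"].split(","),
        value_serializer=lambda v: json.dumps(v, default=str).encode("utf-8"),
    )
    for row in rows:
        producer.send("transactions_topic", row)
    producer.flush()

    cur.close()
    conn.close()
    return {"records_published": len(rows)}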

Option 2: Using Kafka Connect JDBC Source Connector

Alternatively, you can use Kafka Connect to sync data:

json
{
  "name": "snowflake-source-connector",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:snowflake://your_snowflake_url",
    "connection.user": "your_username",
    "connection.password": "your_password",
    "table.whitelist": "transactions_stream",
    "mode": "incrementing",
    "incrementing.column.name": "id",
    "topic.prefix": "snowflake-"
  }
}

Kafka Connect automates ingestion but still requires manual configuration.
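
For completeness, connectors are registered with a running Kafka Connect worker through its REST API. Here's a minimal sketch, assuming the worker listens on localhost:8083 and the JSON above is saved as snowflake-source.json (both are assumptions):

python
# Register the JDBC source connector with a Kafka Connect worker via its
# REST API. The worker URL and the config file name are assumptions.
import json

import requests

with open("snowflake-source.json") as f:
    connector_config = json.load(f)

resp = requests.post(
    "http://localhost:8083/connectors",
    json=connector_config,
)
resp.raise_for_status()
print("Connector created:", resp.json()["name"])

Keep in mind the JDBC source connector also needs the Snowflake JDBC driver available on the Connect worker's classpath before it can reach your account.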

Step 4: Validate Kafka Messages

Once data is pushed to Kafka, you can validate it using the Kafka CLI:

plaintext
kafka-console-consumer --bootstrap-server broker1:9092 --topic transactions_topic --from-beginning

✅ This confirms data is streaming from Snowflake to Kafka successfully.

Key Challenges with the Manual Approach

  • Not Truly Real-Time – Data updates depend on scheduled queries instead of live event streaming.
  • High Engineering Effort – Requires SQL, Python, Kafka Connect, and DevOps knowledge.
  • Complex Debugging – Harder to troubleshoot failures, duplicates, and missing data.
  • No Built-In Schema Handling – Changes to Snowflake require manual adjustments in Kafka.

Snowflake to Kafka: Manual Approach vs. Automated (Estuary Flow)

| Feature | Manual Setup (Snowflake Streams + Scripts) | Estuary Flow (No-Code CDC) |
| --- | --- | --- |
| Ease of Use | Complex setup, requires SQL + Python | No-code UI, set up in minutes |
| Real-Time CDC | Delayed (scheduled queries) | Sub-100ms latency |
| Schema Evolution | Needs manual adjustments | Auto-detects schema changes |
| Scalability | Hard to scale | Built for large-scale pipelines |
| Maintenance | Requires monitoring & debugging | Fully managed & automated |
| Reliability | Risk of data loss | Built-in error handling |

 

Final Thoughts: The Best Way to Stream Snowflake Data to Kafka

When choosing a Snowflake to Kafka integration, your decision comes down to efficiency, scalability, and ease of use.

🔹 Manual integration (Snowflake Streams + Kafka Connect) offers customization, but it’s complex, time-consuming, and requires ongoing maintenance. You’ll need to write and maintain scripts, troubleshoot issues, and manually handle schema changes—which slows down real-time analytics.

🔹 Estuary Flow, on the other hand, provides a fully automated, no-code CDC solution that moves Snowflake data to Kafka in real time. It eliminates batch processing delays, handles schema evolution seamlessly, and ensures sub-100ms data latency—all without writing a single line of code.

Why struggle with manual setups? Get started with Estuary Flow today and unlock effortless, real-time data streaming from Snowflake to Kafka!


