
Snowflake and Databricks are both leaders in the modern data stack, but for different reasons. Snowflake is trusted for scalable, SQL-based analytics. Databricks excels at machine learning, streaming workloads, and open-format flexibility through Delta Lake.
Many teams now use both platforms together. But here's the challenge: how do you move or sync data from Snowflake to Databricks efficiently, without relying on batch pipelines or clunky export workflows?
Whether you're looking to offload compute, train ML models on fresh data, or shift toward a lakehouse architecture, real-time data movement between Snowflake and Databricks unlocks serious performance and cost advantages.
In this guide, you'll learn how to stream Snowflake data to Databricks using Estuary Flow—a fully managed platform that lets you build no-code, real-time pipelines with minimal overhead. No Kafka. No Airflow. No batch jobs. Just fast, reliable sync that scales with your data.
If you're ready to move beyond manual ETL and start connecting your warehouse to your lakehouse in real time, this guide will show you exactly how to do it.
Snowflake and Databricks: Better Together, Not One vs the Other
At first glance, Snowflake and Databricks might seem like competing platforms. But in reality, they’re built for different strengths—and when used together, they can unlock far more value than either one alone.
Snowflake is a fully managed cloud data warehouse known for its ease of use, native SQL support, and high performance for structured analytics. It’s ideal for dashboards, BI tools, and operational reporting.
Databricks, on the other hand, is a powerful lakehouse platform built on Apache Spark. It shines in machine learning, streaming, and large-scale data processing—especially when working with open file formats like Delta Lake or Parquet. It's also where many teams are running AI workflows, from real-time inference to model training and experimentation.
In a modern data stack, it’s common to see both tools working side by side:
- Product analytics in Snowflake, feature engineering in Databricks
- SQL dashboards in Snowflake, AI model pipelines in Databricks
- Raw event data stored in Snowflake, real-time enrichment and ML ops in Databricks
But here’s the catch: they don’t sync out of the box. And AI pipelines are only as good as the freshness and completeness of the data they’re trained on.
That’s why moving data from Snowflake to Databricks—continuously, in real time—has become a key part of modern architectures. Instead of treating these systems as silos, the smart move is to connect them and let each do what it does best.
Suggested Read: Databricks vs Snowflake
Why Moving Data from Snowflake to Databricks Is Harder Than It Looks
While both Snowflake and Databricks are powerful in their own right, connecting them is surprisingly difficult. Most teams start with manual or batch-based methods, only to find themselves stuck with brittle workflows that don’t scale.
Here are the common roadblocks:
- No native sync: Snowflake doesn’t offer built-in connectors to stream data directly into Databricks. You’ll need custom pipelines or third-party tools to bridge the gap.
- ETL is complex and slow: Traditional extract-transform-load (ETL) pipelines are often batch-oriented. They introduce hours of latency, which kills real-time use cases like AI-powered recommendations or live dashboards.
- Maintenance overhead: Managing scripts, orchestrators, and schema changes across two evolving platforms becomes a full-time job. One change in Snowflake can break your entire Databricks workflow.
- Data duplication or loss risks: Without exactly-once delivery and schema enforcement, syncing can result in duplicates, partial updates, or broken AI inputs.
- Limited flexibility: Most off-the-shelf ETL tools don’t support custom transformations, streaming updates, or hybrid cloud environments well enough to keep up.
If your team relies on fresh Snowflake data for AI pipelines, ML features, or real-time metrics in Databricks, you can’t afford to wait hours—or rebuild pipelines every time something changes.
This is where Estuary Flow makes a meaningful difference.
How to Move Data from Snowflake to Databricks Using Estuary (Step-by-Step)
Estuary Flow makes it easy to stream data from Snowflake to Databricks — no pipelines to maintain, no custom Spark jobs, no batch scripts.
In this step-by-step guide, you’ll connect Snowflake as your source using CDC and deliver data continuously into Databricks in Delta Lake format. This gives you sub-second latency, automatic schema handling, and a fully managed pipeline.
Let’s walk through the setup:
Step 1: Connect Snowflake as Your Source
- Log in to the Estuary Dashboard. If you don’t have an account yet, create one for free — no credit card required.
- In the left sidebar, click on Sources, then hit the + New Source button.
- From the list of connectors, select Snowflake and click Capture.
- Enter your Snowflake credentials:
  - Host: Your Snowflake account URL (e.g., xy12345.us-east-1.snowflakecomputing.com)
  - Database and Warehouse: Where your source data lives.
  - User and Password: A Snowflake user with appropriate roles (we recommend creating a dedicated ESTUARY_USER; see the SQL sketch after this list).
- Estuary auto-discovers your Snowflake schema. Select one or more tables to sync, and Estuary will capture all inserts, updates, and deletes in real time using CDC.
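If you want to set up that dedicated ESTUARY_USER, a minimal Snowflake SQL sketch might look like the following. The warehouse, database, schema, and role names are illustrative, and the exact privileges Estuary's CDC connector requires can vary, so check the connector documentation before granting anything in production.

```sql
-- Illustrative only: object names (MY_WH, MY_DB, MY_SCHEMA, ESTUARY_ROLE) are placeholders.
CREATE ROLE IF NOT EXISTS ESTUARY_ROLE;

-- Read access to the source objects
GRANT USAGE ON WAREHOUSE MY_WH TO ROLE ESTUARY_ROLE;
GRANT USAGE ON DATABASE MY_DB TO ROLE ESTUARY_ROLE;
GRANT USAGE ON SCHEMA MY_DB.MY_SCHEMA TO ROLE ESTUARY_ROLE;
GRANT SELECT ON ALL TABLES IN SCHEMA MY_DB.MY_SCHEMA TO ROLE ESTUARY_ROLE;
GRANT SELECT ON FUTURE TABLES IN SCHEMA MY_DB.MY_SCHEMA TO ROLE ESTUARY_ROLE;

-- Dedicated user for the Estuary capture
CREATE USER IF NOT EXISTS ESTUARY_USER
  PASSWORD = '<strong-password>'
  DEFAULT_ROLE = ESTUARY_ROLE
  DEFAULT_WAREHOUSE = MY_WH;
GRANT ROLE ESTUARY_ROLE TO USER ESTUARY_USER;
```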
Step 2: Set Up Databricks as the Destination
- From the dashboard sidebar, go to Destinations and click + New Materialization.
- Select Databricks from the list and click Materialize.
- Fill in your Databricks configuration details:
  - Address: Host and port of your Databricks SQL warehouse.
  - HTTP Path: The HTTP path from your SQL warehouse's connection details.
  - Catalog Name: The Unity Catalog you want to materialize into.
  - Personal Access Token: A personal access token generated in Databricks for authentication.
- Link the collections from your Snowflake source to this Databricks materialization. Estuary will ensure schema compatibility.
Step 3: Save and Activate the Pipeline
- Click Save & Publish to activate your pipeline. Estuary begins streaming data from Snowflake to Databricks immediately.
- From the dashboard, you can:
- Monitor sync status and latency in real time
- View row counts and throughput
- Edit schemas and transformations
- Enable logging and error alerts
- Want to transform data in-flight? Use Estuary’s UI for field mappings, or go deeper with SQL and TypeScript derivations.
Need to scale across more tables or use cases? Repeat the same flow. Estuary supports multiple pipelines and horizontal scaling.
You’ve now built a production-ready, real-time data pipeline from Snowflake to Databricks in minutes, with zero code and full observability.
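Once data starts flowing, you can sanity-check the landed tables directly from a Databricks SQL warehouse. A minimal check might look like this, where the catalog, schema, and table names are placeholders for wherever your materialization writes:

```sql
-- Placeholder names: replace main.estuary.orders with your own catalog.schema.table.
SELECT COUNT(*) AS row_count
FROM main.estuary.orders;

-- Shows the table's format and storage details; materialized tables land as Delta tables.
DESCRIBE DETAIL main.estuary.orders;
```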
Why Use Estuary to Move Data from Snowflake to Databricks?
Moving data from Snowflake to Databricks might sound simple in theory, but maintaining reliability, low latency, and scalability in practice is another story. Here’s why Estuary is the smart choice for teams looking to bridge the two platforms:
Real-Time Streaming with Sub-Second Latency
Estuary supports Snowflake CDC out-of-the-box. That means you’re not relying on batch jobs or time-consuming DIY implementations — your data flows continuously with latency low enough to power real-time dashboards and analytics in Databricks.
You can also ensure data makes it into Snowflake in real time in the first place with Estuary’s Snowpipe Streaming integration. Any latency upstream cascades along the pipeline, so a truly real-time solution needs the lowest possible latency at each step of the journey.
No-Code Setup, End-to-End
Traditional approaches require writing custom Spark jobs, managing orchestration tools, or configuring middleware like Kafka. Estuary eliminates that complexity. You configure your pipeline once through the UI, and Estuary handles the rest — from data capture to delivery.
Automatic Schema Management
When your schema changes in Snowflake — a new column, a renamed field, or a changed data type — Estuary Flow can automatically adapt. No broken pipelines, no manual intervention, and no downstream data loss.
Delta Lake Compatibility
Data lands in Databricks in Delta Lake format, which means it’s immediately queryable and ACID-compliant. Whether you're building ML pipelines or interactive dashboards, you can trust your data is fresh and reliable.
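Because the landed tables are standard Delta tables, the usual Delta features are available on them, such as inspecting the commit history and querying with time travel. The table name below is a placeholder:

```sql
-- Placeholder table name. Each write transaction appears in the Delta transaction log.
DESCRIBE HISTORY main.estuary.orders;

-- Time travel: query the table as of an earlier version listed in that history
-- (replace 3 with a version number that actually exists for your table).
SELECT COUNT(*) AS rows_at_version_3
FROM main.estuary.orders VERSION AS OF 3;
```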
Built-In Transformations
Need to reshape or clean your data before it hits Databricks? Estuary supports field mappings, filtering, and derived collections using SQL or TypeScript, right in the pipeline.
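As a rough illustration of the kind of reshaping a derivation can express, here is conceptual SQL showing a typical cleanup step. This is not the exact Estuary derivation syntax (derivations are defined as part of a Flow collection spec; see Estuary's docs for the real format), and the table and column names are made up:

```sql
-- Conceptual example only: clean and reshape order events before they land in Databricks.
SELECT
  id,
  LOWER(email)                  AS email,         -- normalize casing
  CAST(amount_usd * 100 AS INT) AS amount_cents,  -- standardize units
  updated_at
FROM orders
WHERE status <> 'cancelled';                      -- filter out rows you don't need downstream
```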
Unified Monitoring and Observability
With Estuary’s dashboard, you get full visibility into every pipeline: throughput, latency, sync health, error tracking, and more. No more jumping between tools or building your own observability stack.
Flexible Enough to Fit Any Workflow
Want to mix real-time and batch? Build pipelines across cloud regions? Maintain strict compliance with private deployments? Estuary’s flexible architecture — including support for BYOC — gives you full control without locking you into a rigid model.
Conclusion
Moving data from Snowflake to Databricks doesn’t have to mean maintaining Spark jobs, stitching together batch scripts, or sacrificing freshness for simplicity. With Estuary Flow, you get a real-time, production-ready pipeline that combines the best of Snowflake’s SQL savvy and Databricks’ scalable analytics on Delta Lake — all with zero manual maintenance.
Whether you're powering live dashboards, feeding machine learning models, or unifying data across teams, Estuary ensures low-latency syncs, automatic schema handling, and exactly-once guarantees — out of the box.
Next Steps: Start Streaming from Snowflake to Databricks
If you're ready to move Snowflake data into Databricks for faster analytics and AI-powered insights, Estuary Flow makes it effortless.
- Create your Estuary Flow account: Set up your first Snowflake to Databricks pipeline in minutes, with no Spark jobs and no batch scripts. Get started with Flow.
- Explore step-by-step tutorials: Learn how Flow works, how to configure delta updates, and how to optimize syncs for Delta Lake. View documentation.
- Join the Estuary Slack community: Connect with other data engineers, ask questions, and get real-time support from the Estuary team. Join our Slack.
- Talk to us about your data architecture: Need help with secure deployment, private networking, or choosing the right ingestion method? We’re here to help. Contact Estuary.
FAQs
1. What does Estuary do that tools like Fivetran or Airbyte don’t?
2. Do I need a Snowflake or Databricks enterprise plan to use Estuary?
3. Will real-time ingestion increase Snowflake or Databricks costs?

About the author
Team Estuary is a group of engineers, product experts, and data strategists building the future of real-time and batch data integration. We write to share technical insights, industry trends, and practical guides.
