
How to Move Data from Snowflake to Databricks in Real Time
Learn how to move data from Snowflake to Databricks in real time using Estuary. Build dependable right-time pipelines for analytics, AI, and machine learning with no code and continuous data sync between your warehouse and lakehouse.

You can stream data from Snowflake to Databricks in real time using Estuary, a right-time data platform that lets you move data continuously without managing batch pipelines or manual exports. This integration keeps Databricks always up to date with your latest Snowflake tables, powering faster analytics, AI workflows, and machine learning pipelines.
Snowflake is built for scalable SQL analytics, while Databricks excels at advanced processing, streaming, and open data formats like Delta Lake. When connected through Estuary, the two platforms complement each other perfectly, enabling low-latency data sharing, reduced compute costs, and unified access to your most current data.
In this guide, you’ll learn how to set up a reliable Snowflake to Databricks connection in minutes using Estuary, along with key reasons this real-time approach outperforms traditional ETL pipelines.
Key Takeaways
- You can connect Snowflake and Databricks to combine scalable SQL analytics with advanced AI and machine learning workloads.
- Traditional ETL or batch pipelines introduce latency and maintenance overhead, slowing down data-driven workflows.
- Estuary enables right-time data movement, letting you stream Snowflake data to Databricks with sub-second latency.
- No coding or orchestration tools are required — Estuary automatically handles schema evolution, monitoring, and recovery.
- The result: faster insights, real-time model training, and lower compute costs across your warehouse and lakehouse environments.
Snowflake and Databricks: Better Together, Not One vs the Other
At first glance, Snowflake and Databricks might seem like competing platforms. But in reality, they’re built for different strengths—and when used together, they can unlock far more value than either one alone.
Snowflake is a fully managed cloud data warehouse known for its ease of use, native SQL support, and high performance for structured analytics. It’s ideal for dashboards, BI tools, and operational reporting.
Databricks, on the other hand, is a powerful lakehouse platform built on Apache Spark. It shines in machine learning, streaming, and large-scale data processing—especially when working with open file formats like Delta Lake or Parquet. It's also where many teams are running AI workflows, from real-time inference to model training and experimentation.
In a modern data stack, it’s common to see both tools working side by side:
- Product analytics in Snowflake, feature engineering in Databricks
- SQL dashboards in Snowflake, AI model pipelines in Databricks
- Raw event data stored in Snowflake; real-time enrichment and MLOps in Databricks
But here’s the catch: they don’t sync out of the box. And AI pipelines are only as good as the freshness and completeness of the data they’re trained on.
That’s why moving data from Snowflake to Databricks—continuously, in real time—has become a key part of modern architectures. Instead of treating these systems as silos, the smart move is to connect them and let each do what it does best.
Suggested Read: Databricks vs Snowflake
Why Moving Data from Snowflake to Databricks Is Harder Than It Looks
While both Snowflake and Databricks are powerful in their own right, connecting them is surprisingly difficult. Most teams start with manual or batch-based methods, only to find themselves stuck with brittle workflows that don’t scale.
Here are the common roadblocks:
- No native sync: Snowflake doesn’t offer built-in connectors to stream data directly into Databricks. You’ll need custom pipelines or third-party tools to bridge the gap.
- ETL is complex and slow: Traditional extract-transform-load (ETL) pipelines are often batch-oriented. They introduce hours of latency, which kills real-time use cases like AI-powered recommendations or live dashboards.
- Maintenance overhead: Managing scripts, orchestrators, and schema changes across two evolving platforms becomes a full-time job. One change in Snowflake can break your entire Databricks workflow.
- Data duplication or loss risks: Without exactly-once delivery and schema enforcement, syncing can result in duplicates, partial updates, or broken AI inputs.
- Limited flexibility: Most off-the-shelf ETL tools don’t support custom transformations, streaming updates, or hybrid cloud environments well enough to keep up.
If your team relies on fresh Snowflake data for AI pipelines, ML features, or real-time metrics in Databricks, you can’t afford to wait hours—or rebuild pipelines every time something changes.
This is where Estuary Flow makes a meaningful difference.
How to Move Data from Snowflake to Databricks Using Estuary (Step-by-Step)
Estuary makes it easy to stream data from Snowflake to Databricks — no pipelines to maintain, no custom Spark jobs, no batch scripts.
In this step-by-step guide, you’ll connect Snowflake as your source using CDC and deliver data continuously into Databricks with Delta Lake format. This gives you sub-second latency, automatic schema handling, and a fully managed pipeline.
Let’s walk through the setup:
Step 1: Connect Snowflake as Your Source
- Log in to the Estuary Dashboard. If you don’t have an account yet, create one for free — no credit card required.
- In the left sidebar, click on Sources, then hit the + New Source button.
- From the list of connectors, select Snowflake and click Capture.
- Enter your Snowflake credentials:
  - Host: Your Snowflake account URL (e.g. xy12345.us-east-1.snowflakecomputing.com)
  - Database and Warehouse: Where your source data lives.
  - User and Password: A Snowflake user with appropriate roles (we recommend creating a dedicated ESTUARY_USER; a sketch of how you might create one follows this list).
- Estuary will auto-discover your Snowflake schema. Select one or more tables to sync. Estuary will now capture all inserts, updates, and deletes in real time using CDC.
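If you prefer to script the dedicated user rather than create it by hand, here is a minimal sketch using the snowflake-connector-python package. The admin login, warehouse (COMPUTE_WH), database (ANALYTICS), and schema (PUBLIC) names are placeholders, and the exact privileges Estuary's Snowflake capture needs should be confirmed against the connector documentation.

```python
# Sketch: create a dedicated ESTUARY_USER and grant it read access.
# All object names and credentials below are placeholders; check Estuary's
# Snowflake capture docs for the exact privileges the connector requires.
import snowflake.connector

conn = snowflake.connector.connect(
    account="xy12345.us-east-1",  # your Snowflake account identifier
    user="ADMIN_USER",            # an admin login (placeholder)
    password="********",
    role="ACCOUNTADMIN",
)

statements = [
    "CREATE ROLE IF NOT EXISTS ESTUARY_ROLE",
    "CREATE USER IF NOT EXISTS ESTUARY_USER PASSWORD = '<strong-password>' DEFAULT_ROLE = ESTUARY_ROLE",
    "GRANT ROLE ESTUARY_ROLE TO USER ESTUARY_USER",
    "GRANT USAGE ON WAREHOUSE COMPUTE_WH TO ROLE ESTUARY_ROLE",
    "GRANT USAGE ON DATABASE ANALYTICS TO ROLE ESTUARY_ROLE",
    "GRANT USAGE ON SCHEMA ANALYTICS.PUBLIC TO ROLE ESTUARY_ROLE",
    "GRANT SELECT ON ALL TABLES IN SCHEMA ANALYTICS.PUBLIC TO ROLE ESTUARY_ROLE",
]

cur = conn.cursor()
try:
    for stmt in statements:
        cur.execute(stmt)  # apply each statement; Snowflake runs them immediately
finally:
    cur.close()
    conn.close()
```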
Step 2: Set Up Databricks as the Destination
- From the dashboard sidebar, go to Destinations and click + New Materialization.
- Select Databricks from the list and click Materialize.
- Fill in your Databricks configuration details (you can sanity-check them with the optional snippet after this list):
  - Address: Host and port of your SQL warehouse.
  - HTTP Path: The HTTP path listed in your SQL warehouse's connection details.
  - Catalog Name: Name of your Unity Catalog.
  - Personal Access Token: A personal access token generated in Databricks for Estuary to authenticate with.
- Link the collections from your Snowflake source to this Databricks materialization. Estuary will ensure schema compatibility.
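Before saving, you can optionally verify the SQL warehouse details outside of Estuary. Here is a small sketch using the databricks-sql-connector package; the hostname, HTTP path, and token are placeholders to be replaced with the values from your workspace.

```python
# Optional sanity check of the SQL warehouse details you will give Estuary.
# Hostname, HTTP path, and token below are placeholders from your workspace.
from databricks import sql

with sql.connect(
    server_hostname="dbc-a1b2c3d4-e5f6.cloud.databricks.com",
    http_path="/sql/1.0/warehouses/abcd1234efgh5678",
    access_token="dapiXXXXXXXXXXXXXXXX",
) as conn:
    with conn.cursor() as cursor:
        # Confirm the warehouse is reachable and the default catalog is the
        # Unity Catalog you plan to materialize into.
        cursor.execute("SELECT current_catalog(), current_user()")
        print(cursor.fetchone())
```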
Step 3: Save and Activate the Pipeline
- Click Save & Publish to activate your pipeline. Estuary begins streaming data from Snowflake to Databricks immediately.
- From the dashboard, you can:
  - Monitor sync status and latency in real time
  - View row counts and throughput
  - Edit schemas and transformations
  - Enable logging and error alerts
- Want to transform data in-flight? Use Estuary’s UI for field mappings, or go deeper with SQL and TypeScript derivations.
Need to scale across more tables or use cases? Repeat the same flow. Estuary supports multiple pipelines and horizontal scaling.
You’ve now built a production-ready, real-time data pipeline from Snowflake to Databricks in minutes, with zero code and full observability.
Why Use Estuary to Move Data from Snowflake to Databricks?
Moving data from Snowflake to Databricks might sound simple in theory, but maintaining reliability, low latency, and scalability in practice is another story. Here’s why Estuary is the smart choice for teams looking to bridge the two platforms:
Real-Time Streaming with Sub-Second Latency
Estuary supports Snowflake CDC out of the box, so you're not relying on batch jobs or time-consuming DIY implementations: your data flows continuously, with latency low enough to power real-time dashboards and analytics in Databricks.
You can also make sure data lands in Snowflake in real time in the first place with Estuary's Snowpipe Streaming integration. Any upstream latency cascades down the pipeline, so a truly real-time solution needs the lowest possible latency at every step of the journey.
No-Code Setup, End-to-End
Traditional approaches require writing custom Spark jobs, managing orchestration tools, or configuring middleware like Kafka. Estuary eliminates that complexity. You configure your pipeline once through the UI, and Estuary handles the rest — from data capture to delivery.
Automatic Schema Management
When your schema changes in Snowflake — a new column, a renamed field, or a changed data type — Estuary Flow can automatically adapt. No broken pipelines, no manual intervention, and no downstream data loss.
Delta Lake Compatibility
Data lands in Databricks in Delta Lake format, which means it’s immediately queryable and ACID-compliant. Whether you're building ML pipelines or interactive dashboards, you can trust your data is fresh and reliable.
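As a quick illustration, once a materialization is running you can inspect the resulting table from a Databricks notebook. The three-level name main.estuary.orders below is a placeholder for whichever catalog, schema, and table your materialization writes to, and spark is the SparkSession that Databricks notebooks provide automatically.

```python
# Spot-check a table materialized by Estuary from a Databricks notebook.
# "main.estuary.orders" is a placeholder Unity Catalog name.
df = spark.read.table("main.estuary.orders")

print(df.count())   # row count should keep growing as changes stream in
df.limit(5).show()  # eyeball a few rows

# DESCRIBE DETAIL reports the table's format (Delta) and storage location.
spark.sql("DESCRIBE DETAIL main.estuary.orders").show(truncate=False)
```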
Built-In Transformations
Need to reshape or clean your data before it hits Databricks? Estuary supports field mappings, filtering, and derived collections using SQL or TypeScript, right in the pipeline.
Unified Monitoring and Observability
With Estuary’s dashboard, you get full visibility into every pipeline: throughput, latency, sync health, error tracking, and more. No more jumping between tools or building your own observability stack.
Flexible Enough to Fit Any Workflow
Want to mix real-time and batch? Build pipelines across cloud regions? Maintain strict compliance with private deployments? Estuary’s flexible architecture — including support for BYOC — gives you full control without locking you into a rigid model.
Conclusion
Syncing data between Snowflake and Databricks is no longer just about migration. It is about enabling analytics and AI systems to work from the same, freshest version of data. Batch jobs and manual exports cannot meet that standard anymore.
Using a right-time data platform like Estuary, teams can continuously move data from Snowflake to Databricks with sub-second latency, automatic schema handling, and exactly-once reliability. The result is a more efficient architecture where Snowflake remains the foundation for analytics and Databricks becomes the engine for machine learning and large-scale processing.
Whether you are building real-time dashboards, feature pipelines, or unified data models, the key is to control when and how your data moves while balancing performance, cost, and reliability. Estuary makes that balance possible through unified right-time data movement.
Next Steps
- Explore the Estuary Demo: See how right-time data pipelines work in action.
- Start Your First Integration: Set up your Snowflake to Databricks pipeline in minutes with Estuary’s no-code interface.
- Learn from the Documentation: Explore configuration, delta updates, and advanced transformations.
- Talk to an Expert: Have specific latency, compliance, or architecture needs? Connect with our team.
FAQs
Do I need a Snowflake or Databricks enterprise plan to use Estuary?
Will real-time ingestion increase Snowflake or Databricks costs?
Can I move data from Snowflake to Databricks manually?
Can you use Snowflake and Databricks together?

About the author
Team Estuary is a group of engineers, product experts, and data strategists building the future of real-time and batch data integration. We write to share technical insights, industry trends, and practical guides.