
Real-Time Salesforce Integration with Estuary’s New Connector

Integrate Salesforce data in real time with Estuary Flow's new connector. Sync custom fields, handle API limits, and power analytics in Snowflake without code or delays.


Introduction 

Salesforce is a leading cloud-based customer relationship management (CRM) platform used by businesses to manage sales, service, marketing, and customer engagement. While it offers powerful tools for operational workflows, extracting real-time data from Salesforce has traditionally been complex due to strict API limits, dynamic field behavior, and evolving schemas.

In this guide, you’ll learn how to sync Salesforce data in real time using Estuary Flow’s new Salesforce connector. It solves API limits, schema drift, and delayed syncs—without writing code.

Traditional Challenges in Extracting Real-Time Salesforce Data

Extracting real-time data from Salesforce is difficult due to API limits, missing updates from dynamic fields, and schema changes. These issues can break pipelines and delay data availability. Below are the main challenges in real-time Salesforce data extraction:

  1. Strict API Limits: Salesforce enforces daily API quotas and concurrency limits, making high-frequency data syncs difficult and costly, especially for large datasets or organizations with many automated workflows.
  2. Dynamic Field Behavior: Fields like formulas and roll-ups don’t emit change events or advance record modification timestamps, so traditional ETL tools miss critical updates unless full record reloads are performed (see the SOQL sketch after this list).
  3. Schema Drift and Customization: Salesforce environments are highly customizable; new fields, renamed columns, and deleted objects are common. These changes can cause schema drift, breaking pipelines that rely on fixed schemas and leading to missing data, failed syncs, or incorrect analytics unless handled dynamically.
  4. Slow and resource-intensive historical backfills: Standard Salesforce APIs (like REST or SOAP) require pulling data in small batches or even one record at a time, making it extremely time-consuming for large datasets. Each batch or query consumes API credits, and large backfills can quickly exceed daily API limits, causing throttling or failures.
  5. Complexity of Implementing CDC: Implementing Salesforce Change Data Capture (CDC) requires technical expertise and a solid understanding of Salesforce APIs. While some development teams have the skills to implement and maintain CDC effectively, it can be a limiting factor for others.
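To make challenge 2 concrete, here is a minimal SOQL sketch of the timestamp-based polling that most batch ETL tools rely on (the cursor value is illustrative). Because formula-field recalculations do not advance SystemModstamp, a record whose only change is a recomputed formula value never matches the filter:

sql
-- Typical incremental poll: fetch Leads modified since the last sync cursor.
-- Formula-field recalculations do NOT advance SystemModstamp, so records whose
-- only change is a recomputed formula value are silently skipped by this filter.
SELECT Id, Name, Status, SystemModstamp
FROM Lead
WHERE SystemModstamp > 2024-01-01T00:00:00Z
ORDER BY SystemModstamp ASC
LIMIT 2000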

Salesforce Integration Methods

Salesforce integration involves connecting the Salesforce CRM with other systems, applications, or databases. This creates effective data workflows that you can use to automate tasks, synchronize data across platforms, and enhance overall productivity.

You can employ several integration methods to connect Salesforce with other systems. 

  • Standard REST APIs are easy to use and ideal for real-time, on-demand data access. However, they require a custom polling system to detect changes, can be API-intensive due to strict rate limits, lack native change tracking, and are inefficient for large-scale or bulk data operations.
  • Streaming APIs require manual configuration of event types (e.g., PushTopics or Change Data Capture) and custom subscription logic; consuming them means running a persistent subscriber system (e.g., a CometD or WebSocket listener) that manages sessions.
  • Pub/Sub API allows clients to publish and subscribe to event messages, including CDC events, over a gRPC-based, HTTP/2 protocol, but it requires complex setup through code or the CLI and has daily event delivery limits.
  • Salesforce Change Data Capture (CDC) tracks record-level changes across many standard objects, but it still lacks support for formula field updates and doesn’t handle historical data (a sample CDC event payload follows this list).
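For a sense of what CDC actually delivers, here is an abridged, hand-written example of a Lead change event (header field names follow Salesforce's ChangeEventHeader documentation; values are illustrative). Only the changed fields appear in the payload, and formula fields never do:

json
{
  "ChangeEventHeader": {
    "entityName": "Lead",
    "changeType": "UPDATE",
    "changedFields": ["Status", "LastModifiedDate"],
    "recordIds": ["00Q5f000001AbCdEAK"],
    "commitTimestamp": 1718000000000
  },
  "Status": "Working - Contacted",
  "LastModifiedDate": "2024-06-10T07:33:20.000Z"
}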

Introducing Estuary Flow’s New Salesforce Connector

Estuary Flow combines the best of multiple Salesforce integration approaches, and offers the following features:

  • Lightning-Fast Parallel Backfills: Uses Salesforce's Bulk API 2.0 for initial data loads and backfills, enabling significantly faster data transfer rates while preserving REST API call limits. It supports parallel backfills across multiple tables, dramatically reducing synchronization times and getting your data pipelines running faster than ever.
  • Efficient API Usage: Backfills are executed using batch APIs to minimize API credit consumption. Additionally, object discovery is streamlined through the use of an internal list of standard objects.
  • Automated Refresh for Formula Fields: Formula fields in Salesforce are dynamic and often change frequently. The connector automatically refreshes these fields at configurable intervals (default: daily), ensuring your data remains accurate and up to date, without manual intervention, even though Salesforce doesn't track formula field changes in record modification timestamps.
  • Robust Schema Inference for Custom Fields: The new connector uses advanced schema inference to ensure consistent and accurate representation of custom fields in your data pipeline. It offers improved type detection and mapping, capturing all custom Salesforce objects and fields with precision.
  • Minimal Setup: Requires far less configuration than writing and maintaining custom API or CDC listeners.

This upgrade transforms Salesforce into a real-time data source, enabling seamless analytics, reverse ETL, and operational workflows.

Real-World Use Case: Real-Time Salesforce-to-Snowflake Integration

In this example, we will demonstrate how to set up a real-time Salesforce-to-Snowflake integration tailored for sales and marketing operations. The goal is to continuously sync Lead records—including custom fields like lead scores and enterprise flags—from Salesforce into Snowflake. 

This ensures that any updates to Lead data, such as status changes, score recalculations, or new entries, are immediately reflected in Snowflake.

This real-time pipeline empowers marketing teams to optimize campaigns on the fly, enables sales managers to track lead conversion progress in live dashboards, and gives analysts the ability to combine lead data with external signals (e.g., website activity, product usage, or email engagement) for deeper insights and personalization strategies. 

Prerequisites:

Before getting started, make sure you have access to the following:

  • Salesforce Developer Account – Required to create and manage Lead records, custom fields, and API access.
  • Salesforce User Credentials – We recommend creating a dedicated read-only Salesforce user.
  • Estuary Flow Account – Used to configure and run the capture and materialization.
  • Snowflake Account – Destination for storing and querying real-time Lead data synced from Salesforce.

Step 1: Prepare your Salesforce Environment

To simulate a realistic sales and marketing scenario, begin by populating your Salesforce environment with sample Lead data. As part of this setup, you’ll also create two custom fields: LeadScore__c and Is_Enterprise_Account__c.

These fields allow us to demonstrate how Estuary Flow handles custom fields and formula field updates during real-time sync. For example, LeadScore__c can be used to prioritize leads for marketing campaigns, while Is_Enterprise_Account__c might flag high-value clients for targeted outreach.

  1. Log in to your Salesforce Developer account.
  2. Navigate to: Setup → Object Manager → Lead → Fields & Relationships
  3. Click New Field to create the two fields:
  • LeadScore__c → Type: Number (length: 3, decimals: 0)
  • Is_Enterprise_Account__c → Type: Checkbox

New Custom field for Accounts
New Custom Field for Leads

Next, populate the fields:

Open Developer Console → Debug → Execute Anonymous Window:

apex
// Create three sample Leads with the new custom fields populated.
List<Lead> leads = new List<Lead>{
    new Lead(FirstName='Alice', LastName='Smith', Company='Acme',
             Email='alice@acme.com', Phone='1234567890',
             LeadScore__c=85, Is_Enterprise_Account__c=true),
    new Lead(FirstName='Bob', LastName='Lee', Company='Techify',
             Email='bob@techify.com', Phone='2345678901',
             LeadScore__c=45, Is_Enterprise_Account__c=false),
    new Lead(FirstName='Cara', LastName='Jones', Company='OmniCorp',
             Email='cara@omnicorp.com', Phone='3456789012',
             LeadScore__c=95, Is_Enterprise_Account__c=true)
};
insert leads;
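To confirm the records landed, you can run a quick SOQL check in the Developer Console’s Query Editor:

sql
-- Verify the three sample Leads and their custom field values.
SELECT Id, Name, Company, LeadScore__c, Is_Enterprise_Account__c
FROM Lead
ORDER BY CreatedDate DESC
LIMIT 10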
Salesforce leads UI

Step 2: Set Up Salesforce as the Source in Estuary Flow

Connect your Salesforce environment to Estuary Flow to begin capturing lead data in real time.

Search for Salesforce in the Estuary UI
  • Log in to Estuary Flow and go to the Sources tab.
  • Click + NEW CAPTURE and search for Salesforce.
  • Select the Salesforce connector and click Capture.
Configure Estuary Salesforce connector
  • Fill in the required fields:
    • Name: A unique name for the capture
    • Authentication: Sign in to your Salesforce user account to authenticate with OAuth
    • Optional: set the Start Date to limit how much historical data is pulled
  • Choose the Salesforce objects you want to capture (standard or custom). You can deselect any that aren't needed; here, we have selected Leads.
  • Click NEXT, then SAVE AND PUBLISH to start the capture.
Salesforce schema in Estuary

Step 3: Set Up Snowflake as the Destination

  • After a successful capture, you can either click on Materialize Collections in the pop-up or go to the Estuary dashboard and click on Destinations on the left-side pane.
  • Click New Materialization to set up the data destination. On the Create Materialization page, search for Snowflake and click on Materialize.
  • Provide the Materialization name and Endpoint config details, such as the Host URL, User, Password, Database, Schema, Warehouse, and Role (if you still need to provision these Snowflake objects, see the SQL sketch after these steps). Click on Next.
Configure Snowflake connector
  • The data collections you captured from Salesforce may already be populated. If not, use the Source Collections tool to locate and add them.
  • Finally, click on Save and Publish. After you complete these steps, Estuary Flow will replicate your data from Salesforce to Snowflake in real time.
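If you haven’t yet provisioned a dedicated Snowflake user for the materialization, a minimal sketch of the objects involved might look like the following; every name here is a placeholder, so adapt it to your environment:

sql
-- Hypothetical names throughout: dedicated role, warehouse, database, and user for the pipeline.
CREATE ROLE IF NOT EXISTS ESTUARY_ROLE;
CREATE WAREHOUSE IF NOT EXISTS ESTUARY_WH WITH WAREHOUSE_SIZE = 'XSMALL' AUTO_SUSPEND = 60;
CREATE DATABASE IF NOT EXISTS SALESFORCE_DB;
CREATE SCHEMA IF NOT EXISTS SALESFORCE_DB.SALESFORCE_LEADS;
CREATE USER IF NOT EXISTS ESTUARY_USER PASSWORD = '<strong-password>' DEFAULT_ROLE = ESTUARY_ROLE;
GRANT ROLE ESTUARY_ROLE TO USER ESTUARY_USER;
GRANT USAGE ON WAREHOUSE ESTUARY_WH TO ROLE ESTUARY_ROLE;
GRANT ALL ON DATABASE SALESFORCE_DB TO ROLE ESTUARY_ROLE;
GRANT ALL ON SCHEMA SALESFORCE_DB.SALESFORCE_LEADS TO ROLE ESTUARY_ROLE;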

Step 4: Query Leads Data in Snowflake

In Snowflake, you can find the Leads data and query it:

sql
-- Most recent leads first, with a derived qualification flag.
SELECT
    Name,
    Company,
    Status,
    LeadScore__c,
    Is_Enterprise_Account__c,
    CASE WHEN LeadScore__c > 60 THEN TRUE ELSE FALSE END AS IsQualified__c
FROM salesforce_leads.lead
ORDER BY CreatedDate DESC;
Query leads in Snowflake
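If dashboards rely on this qualification logic regularly, you could persist it as a view (schema and table names follow the query above):

sql
-- Optional: expose qualified leads as a reusable view for BI tools.
CREATE OR REPLACE VIEW salesforce_leads.qualified_leads AS
SELECT Name, Company, Status, LeadScore__c, Is_Enterprise_Account__c
FROM salesforce_leads.lead
WHERE LeadScore__c > 60;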

Set up your own Salesforce-to-Snowflake pipeline in minutes with Estuary Flow. Get started for free → or book a demo to see it in action.

Feature Deep Dives: Behind the Scenes of Estuary’s New Salesforce Connector

Estuary’s Salesforce connector supports automated formula field refreshes and optimized backfills using Bulk API 2.0—making real-time sync fast, reliable, and API-efficient.

1. Automatic Formula Field Refresh

To see this in action, update a dependent field (e.g., change the Status on a Lead from “New” to “Working - Contacted”), then query the LeadScore__c field in Snowflake to confirm it updated correctly.
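You can make the change in the Salesforce UI (shown below), or run a one-off update in the Execute Anonymous window; the email value matches the sample data from Step 1:

apex
// Move Alice's Lead to a new status; the change flows through to Snowflake.
Lead l = [SELECT Id, Status FROM Lead WHERE Email = 'alice@acme.com' LIMIT 1];
l.Status = 'Working - Contacted';
update l;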

Update leads in Salesforce

When querying in Snowflake:

sql
-- Recently modified leads first; SystemModstamp reflects the captured change.
SELECT Name, Status, LeadScore__c, SystemModstamp
FROM salesforce_leads.lead
ORDER BY SystemModstamp DESC;
Query new leads immediately in Snowflake

In this example, the following updates were made: Alice Smith’s Lead Status was changed to “Working - Contacted”. Thanks to Estuary’s automatic change data capture (CDC), these changes are reflected in Snowflake within minutes. 

Similarly, formula fields are automatically refreshed on a configurable schedule, so that only the affected fields are updated. This ensures your data remains accurate and current, without the overhead of reloading entire records.

2. Optimized API Usage with Bulk API 2.0

One of the standout benefits of Estuary’s new Salesforce connector is its ability to perform large-scale data synchronization with minimal API overhead. By leveraging Salesforce’s Bulk API 2.0, the connector handles high-volume backfills in a resource-efficient manner—crucial for organizations working within strict API quotas or large data sets.

To demonstrate this, we inserted 1,000 Lead records into Salesforce:

apex
// Bulk-insert 1,000 sample Leads, bypassing duplicate rules so every record saves.
List<Lead> leads = new List<Lead>();
for (Integer i = 0; i < 1000; i++) {
    leads.add(new Lead(
        FirstName = 'Test',
        LastName = 'Lead' + i,
        Company = 'DemoCorp2',
        Email = 'lead' + i + '@demo.com',
        Phone = '1234567890',
        LeadScore__c = 60
    ));
}
// Set DML options to ignore duplicate rules.
Database.DMLOptions dmlOpts = new Database.DMLOptions();
dmlOpts.DuplicateRuleHeader.AllowSave = true;
dmlOpts.DuplicateRuleHeader.RunAsCurrentUser = true;
Database.insert(leads, dmlOpts); // Suppresses duplicate detection

Measuring API Efficiency: Before vs. After Backfill

Trigger data flow backfill in Estuary

Before enabling Backfill:

In Setup, open the System Overview page; the API Usage area of the Organization Detail section shows the API Requests, Last 24 Hours figure.

API Requests: 2,336 (15,000 max)

Measuring API requests

After enabling backfill:

API requests after backfill

Total API requests recorded: ~354 (15,000 max)

Despite syncing thousands of records, the API usage remained remarkably low. This is because you can submit up to 15,000 batches per rolling 24-hour period in Bulk API 2.0, allowing Estuary to efficiently bundle and submit jobs asynchronously. In this case, the 354 API calls included not just the insert operations, but also authentication, schema inference, and metadata discovery.
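For context on why so few calls were needed: with Bulk API 2.0, an entire backfill query becomes a single job rather than hundreds of paginated REST requests. Creating a query job is one POST to /services/data/v58.0/jobs/query (the version and SOQL below are illustrative of Salesforce's Bulk API 2.0 shape, not Estuary's internal code), after which results are fetched in large CSV pages from the job's results endpoint:

json
{
  "operation": "queryAll",
  "query": "SELECT Id, Name, LeadScore__c, Is_Enterprise_Account__c FROM Lead"
}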

These enhancements make Estuary Flow suitable for both low-latency use cases and high-volume enterprise scenarios, addressing the toughest Salesforce integration challenges with ease.

Why Backfill Optimization Matters

Enabling backfill via Bulk API 2.0 in Estuary provides several critical advantages:

  • Faster Time to Insight: Optimized backfills reduce sync times from hours to minutes, allowing teams to start analyzing historical data sooner.
  • Parallel Table Syncs: Estuary supports concurrent backfills across multiple objects, dramatically improving throughput and making large-scale syncs feasible.
  • Lower API Usage: Bulk API 2.0 batches records efficiently, minimizing API calls and avoiding rate-limit penalties—critical for organizations operating within tight quotas.
  • Improved Reverse ETL Accuracy: Backfilled historical data allows you to enrich and sync complete customer profiles back into Salesforce for segmentation, targeting, and personalization.
  • Efficient Batching: Bulk API 2.0 allows large datasets to be grouped into batches—up to 150MB per job—significantly reducing the number of API calls required, even when syncing millions of records.

This approach is especially beneficial for enterprises managing large Salesforce datasets or operating within strict API request limits. By reducing sync latency and resource consumption, Estuary’s optimized connector delivers real-time data with enterprise-grade efficiency.

Conclusion

Estuary Flow’s enhanced Salesforce connector bridges the gap between operational CRM data and modern analytics platforms like Snowflake. By supporting parallel backfills, optimizing API consumption with Bulk API 2.0, and automating formula field refreshes, it turns Salesforce into a real-time data source ready for analytics, personalization, and workflow automation. Whether you're enriching leads for targeted outreach or powering dashboards with live CRM metrics, this connector brings agility and scale to your data pipeline—without the complexity.

As organizations increasingly rely on real-time insights to drive business decisions, tools like Estuary Flow ensure your data infrastructure can keep up. With this integration, your teams can spend less time managing pipelines and more time acting on trusted, timely data.

FAQs

Can Estuary Flow sync Salesforce data in real time?
Yes. Estuary Flow’s Salesforce connector enables real-time data capture and sync using change data capture (CDC) plus scheduled formula field refreshes. It supports both historical backfills and low-latency updates.

Is Estuary Flow secure enough for enterprise Salesforce data?
Yes. Estuary Flow offers secure OAuth-based authentication for Salesforce and supports private deployments, VPC peering, and BYOC (Bring Your Own Cloud) options. It is SOC 2 compliant and designed with enterprise-grade security controls.

How does Estuary Flow handle Salesforce API limits?
Estuary Flow handles Salesforce API limits by batching requests and intelligently scheduling data pulls to stay within Salesforce’s daily and per-hour API quotas. It uses the Bulk API where appropriate to reduce the number of requests and implements automatic backoff and retry mechanisms when limits are approached or reached.
