Businesses generate a vast amount of data through sales transactions, customer interactions, social media platforms, and marketing campaigns. Salesforce plays a crucial role in organizing critical domain data. However, to unleash advanced analytics, seamlessly moving data from Salesforce to BigQuery, a powerful data warehouse, becomes essential. This integration facilitates in-depth analysis, providing valuable insights to enhance decision-making processes and drive business growth.
Let’s understand both platforms and then look into the different methods that will help set up the integration.
What is Salesforce?
Salesforce is a cloud-based, dynamic customer relationship management (CRM) platform. However, it has evolved to support several critical business operations like support and marketing. As a result, today, it helps businesses manage their customer interactions, marketing, sales, and customer service processes. You can use Salesforce to find more prospects, close more deals, and enhance customer service, making it a go-to platform for managing your business operations effectively.
Here are some vital features of Salesforce:
- Analytics and Reporting: Salesforce offers built-in reporting and analytics capabilities to gain insights into sales, marketing, and customer service data. You can create custom reports, dashboards, and visualizations for tracking key metrics, helping make data-driven decisions.
- Security and Data Privacy: Salesforce adheres to industry-standard security practices and compliance regulations, helping protect your data. It provides Salesforce Shield, a trio of data encryption, event monitoring, and audit trails to ensure the confidentiality and integrity of sensitive information.
- Integration Capabilities: Salesforce provides APIs and integration capabilities to connect with other systems and applications. Through these APIs, many third-party integrations are available. These include ETL tools like Estuary, Stitch, and Talend, allowing you to extend its functionality.
What is Google BigQuery?
BigQuery is Google’s serverless, fully-managed cloud data warehouse that enables scalable analysis. For analysis, you can run SQL queries on terabytes of data in seconds and over petabytes of data in only minutes. This is possible mainly because BigQuery separates the compute engine that analyzes your data from the storage. Based on the processing requirement of a query, the compute can be scaled up independently, providing flexibility for varying needs.
Here are some interesting features of BigQuery:
- Columnar Storage: Data in BigQuery is stored in a columnar format, meaning it stores each column separately. This type of storage is optimized for analytical queries of large datasets.
- Scalability: Since BigQuery is designed to handle petabytes of data, you can store and process massive amounts of data. This helps overcome any performance issues or hardware limitations.
- Integrations: You can seamlessly integrate BigQuery with other Google Cloud services, like Google Cloud Dataflow, Google Analytics, and Dataproc. This allows you to easily use BigQuery alongside your existing data infrastructure and workflows.
- Access to Free Version: If you’d like to try out BigQuery features for free and ensure it’s suitable for your business needs, you can use BigQuery Sandbox. You need not provide any billing or credit card details to use Sandbox.
Why Connect Salesforce to Google BigQuery?
While Salesforce and BigQuery are two individual powerful tools, integrating them will result in several benefits. Here are some reasons why migrating Salesforce data to BigQuery is worth considering:
- Advanced Analytics: You can use BigQuery for more advanced analytics beyond the basic reporting and analytical capabilities of Salesforce. It provides powerful data analysis and machine learning workflows to help you uncover valuable insights and make data-driven decisions.
- Data Warehousing: Salesforce is primarily a transactional database that can manage customer interactions. While it preserves limited-time historical data, it isn’t optimized for data analytics. Instead, you can migrate Salesforce data into a purpose-built data warehouse—BigQuery—for centralizing the data for analytics and reporting.
- Integration with Data Visualization Tools: BigQuery seamlessly integrates with popular data visualization and BI tools like Data Studio, Looker, and Power BI. Connecting your Salesforce data stored with BigQuery with these tools helps create interactive dashboards, reports, and visualizations for intuitive insights.
Methods to Connect Salesforce to Google BigQuery
To connect Salesforce to BigQuery, you can use one of these two methods:
- Method #1: Using APIs
- Method #2: Using SaaS Alternatives
Method #1: Use APIs to Move Data From Salesforce to BigQuery
Apart from the many products that Salesforce offers, it also has an abundance of APIs. You can use certain APIs to access and extract data from Salesforce to load into BigQuery. Here are some of the Salesforce APIs you can consider using:
To connect Salesforce to BigQuery using the API method, you must first pull data from Salesforce with one of the APIs. The REST API and SOAP API use different protocols but provide similar functionalities. You can use tools like Postman, CURL, or use HTTP clients for your preferred language or framework to interact with the REST API. The Salesforce REST API uses OAuth 2.0 authentication. After successful authentication with the REST API, you can fetch data from it to load into BigQuery. However, if you’re looking for real-time data, you can use Salesforce’s Streaming API.
Once you’ve extracted the data, you must ensure that it is presented in a supported format before loading it into BigQuery. The two data formats that BigQuery supports for loading data into it are the CSV and JSON formats. If the API you use returns data in any other format, like XML, then you must transform it before loading. You should also ensure that the data types you’re using are supported by BigQuery. You can check the BigQuery Data Types documentation to learn more. After preparing the data, you can use the BigQuery API to create a new BigQuery table or append data to an existing table in your BigQuery dataset.
Method #2: Using SaaS Alternatives
While APIs make it convenient to connect Salesforce to BigQuery, there are some drawbacks associated with using APIs. Implementing an API-based transfer requires writing custom code and managing API maintenance overhead. This adds to the complexity of the integration process, requiring additional efforts.
There’s a better and easier way to connect Salesforce to BigQuery. You can find a range of low-code ETL tools like Estuary Flow to automate real-time data transfer processes while ensuring schema compatibility. Estuary Flow is a powerful ETL tool for your Salesforce to BigQuery connection. Flow automates the real-time transfer of data between a data source and the target using several pre-built real-time connectors. This makes it suitable for monitoring, data processing, and reporting tasks.
Estuary delivers a comprehensive experience for capturing data from Salesforce by using a mixture of different APIs and methods. You can capture historical and real-time updates with two connectors based on different APIs. For capturing historical data, Estuary uses a slightly modified version of Airbyte's Salesforce connector that uses the REST Query API.
And for real-time data, it uses Pushtopic Streaming APIs, which are built for receiving real-time updates.
Let’s look at how you can use Estuary Flow to connect Salesforce to Google BigQuery:
Step 2: Click on Captures on the left side of the Estuary Flow dashboard. Then, click on the New Capture button.
Look for the Salesforce connector by scrolling down or by searching for Salesforce in the Seach Connectors box. Once you’ve located the connector, click on the Capture button.
Notice that there are two Salesforce connectors. The regular Salesforce connector captures data from Salesforce objects into Flow collections using batch processing. On the other hand, the Salesforce Real-Time connector is used for capturing data as soon as it is generated via the Salesforce PushTopic API.
Let’s consider the regular Salesforce connector for this tutorial. You’ll be navigated to the Salesforce Historical Data connector page.
Fill in the required fields on this page, like a connector Name, Start Date, and Filters for your Salesforce Objects. If you’re using a Salesforce Sandbox, tick the box against Sandbox. Now, click on Authenticate Your Salesforce Account. You’ll be prompted to provide your Salesforce user credentials.
Upon completing these requirements, click on Next. Flow connects with your Salesforce account and detects all the data objects available there. When you’re ready, click on Save and Publish.
Instead of the regular Salesforce connector, you can select the Salesforce Real-Time connector to capture data from Salesforce objects in real time. This connector only requires you to authenticate your Salesforce account to start capturing the data.
Step 3: Next, you must set up the destination for your Salesforce data. You can click on Materialize Connections in the pop-up following a successful capture. Or you can navigate to the Estuary dashboard and click on Materializations on the left-side pane. Then, click on the New Materialization button.
Search for the BigQuery connector in Search Connectors. Then, click on the Materialize button.
BigQuery has a few more prerequisites to meet before you can connect to Flow successfully.
Once you’re done with the prerequisite steps, you can proceed with setting up your destination. The BigQuery materialization requires a few details. Fill in the required fields, like a connector Name, Project ID, Service Account JSON credentials, Region, Dataset, and Bucket details.
Once you’ve provided all the details, click on Next. Then, click on Save and Publish.
If you’re keen on more instructions, here’s the Estuary documentation for:
- The Salesforce Source Connector
- The BigQuery Materialization Connector
- Detailed steps to create a Data Flow like this one
- Salesforce CDC: Salesforce Change Data Capture (CDC) is a valuable feature that allows you to track changes in your Salesforce data in real-time.
Salesforce and BigQuery might be two useful and powerful tools individually, but the real benefit is when you combine both. Moving your Salesforce data into BigQuery allows you to leverage BigQuery’s powerful querying capabilities. With the capacity to store and process petabytes of data in minutes, you can gain valuable insights for improved decision-making.
The two methods to transfer Salesforce data to BigQuery are by using APIs or using SaaS alternatives like Estuary. Among the drawbacks of using the API method is that you must watch out for Salesforce API changes. However, with Estuary Flow, the entire data movement process can be streamlined; setting up and starting the data transfer process only takes a few minutes.
You can also use Estuary Flow connectors to set up a connection between different platforms. Register today, and your first pipeline is free!