Organizations often use HubSpot to manage their marketing, sales, operations, and customer support. However, all this data provides better insights when combined to create a single source of truth. When you move the data from HubSpot into a data warehouse like BigQuery, you can leverage perform data analysis for making informed decisions. This article will explain how to move data from HubSpot to BigQuery.

But let’s first understand HubSpot and BigQuery in detail.

What is HubSpot?

hubspot to bigquery - hubspot logo

HubSpot is a cloud-based platform to help organizations streamline sales, marketing, support, and more. The HubSpot platform encompasses five hubs—CRM, Marketing, CMS, Operations, and Service—each targeting a different aspect of your business. 

Features of HubSpot Hubs

  • HubSpot Marketing hub: Helps you collect leads, automate marketing campaigns, and more.
  • HubSpot Sales hub: Enables you to automate sales pipelines, get 360o view of contacts, and more.
  • HubSpot Service hub: Allows you to add a knowledge base, collect customer feedback, and generate support tickets.
  • HubSpot CMS hub: Assist you in building websites with a drag-and-drop editor, optimize SEO strategies, and more.
  • HubSpot Operations hub: Helps you sync data across different platforms, improve data quality, and more.

What is BigQuery?

hubspot to bigquery - bigquery logo

BigQuery is Google’s fully-managed, serverless, cloud-based enterprise data warehouse. BigQuery is built on Dremel technology. Dremel turns SQL queries into execution trees. This tree architecture model makes for easy and efficient querying and aggregating results. 

BigQuery supports columnar storage to provide superior analytics query performance. For analytics, you can either run simple SQL queries or connect BigQuery with business intelligence tools.

BigQuery is built to manage petabyte-scale analytics. Hence, it can collect vast amounts of data from different sources to support your big data analytics. 

Features of BigQuery

  • BigQuery Data Transfer: The BQ data transfer service automates data movement into BigQuery regularly based on the defined schedules.
  • BigQuery ML: You can use simple SQL queries to create and execute ML models in BigQuery.
  • BigQuery BI Engine: The built-in query engine processes large queries in seconds.
  • BigQuery GIS: The Geographic Information Systems (GIS) feature provides information about location and mapping through geospatial analysis.

Why Move Data From HubSpot to BigQuery?

The primary reasons to move data from HubSpot to BigQuery include:

Data Analytics: When you move HubSpot data to BigQuery tables, you can create a single source of truth for analytics requirements. It helps you generate insights into your customer and business operations data for better decision-making.

Scalability: BigQuery can handle massive amounts of data. As your business generates more data, you can easily accommodate it on BigQuery. 

Security and Compliance: BigQuery, built on Google Cloud, offers security and compliance features, like auditing, access controls, and data encryption. When you move your data to BigQuery, it’s stored securely and in compliance with regulations.

Cost Savings: BigQuery offers a flexible pricing model. You only need to pay for the amount of data and processing power you use. This results in significant cost savings than when maintaining your own on-premise data warehouse.

Here’s a list of the data you can move to backup HubSpot to BigQuery:

  • Activity data (clicks, views, opens, URL redirects, etc.)
  • Calls-to-action (CTA) analytics
  • Contact lists
  • CRM data
  • Customer feedback
  • Form submission data
  • Marketing emails
  • Sales data

Moving Data From HubSpot and BigQuery

Image Source

Now that you know about both platforms and their essential features, let’s look at how you can move data from HubSpot to BigQuery. You can use the following methods for this purpose:

  • HubSpot Private App
  • Manually through CSV files
  • Custom integration
  • SaaS alternatives

Prerequisites for HubSpot Private App

Before you get started with moving your HubSpot data to BigQuery, you must ensure the following prerequisites:

  • A billing-enabled Google Cloud Platform account.
  • A HubSpot account with admin access.
  • Install Google Cloud SDK in your CLI.
  • Configure the Google Cloud SDK to point to your Google Cloud project.
  • Enable the BigQuery API on your Google Cloud project.
  • bigquery.admin permissions to load data into tables of your BigQuery dataset.

Method #1: HubSpot Private App

Here’s the step-by-step guide to moving data from your HubSpot to BigQuery manually:

Create a Private App

Earlier, internal integrations used a HubSpot API key that provided read and write access to all of your HubSpot CRM data. However, HubSpot API keys were discontinued from being used for authentication to access HubSpot APIs.

With HubSpot API key migration made compulsory, all API keys got deactivated. All existing API key integrations were migrated to private apps. Now, you can use HubSpot’s APIs through private apps to access specific data from your HubSpot account. 

Once you’ve authorized what each private app can request or change in your account, a unique access token gets generated for your app.

  1. Log in to your HubSpot account. Click on the Settings icon in the main navigation bar.
  2. From the left sidebar menu, navigate to Integrations → Private Apps.
  3. Click on Create a private app.
  4. On the Basic Info tab, provide basic app details:

    1. Enter an app name or click on Generate a new random name.
    2. Get your pointer over the logo placeholder and click on the upload icon. You can upload a square logo unique to your app.
    3. Enter your app’s description.
  5. Click on the Scopes tab.
  6. For each scope you’d like to enable for your private app’s access, select the Read or Write checkbox. To search for a specific scope, go to the Find a Scope search bar.
  7. Upon completion of app configuration, click on the Create app at the top right.
  8. A dialog box will appear. Review the information about the access token, then click on Continue creating.

After creating your app, you can use the app’s access token to make API calls. Suppose you want to edit your app’s info or change the scopes, click on Edit details.

Make API Calls With Your App’s Access Token

You can make API calls to HubSpot with the API-base domain. HubSpot’s APIs return JSON-format data.

All HubSpot API calls require a parameter specifying an OAuth access token. Some API calls provide optional parameters for tailoring the information returned.

To start making API calls, navigate to the Details page of your app.

  • Click on Show token within the Access token card. This will reveal your entire token. Click on Copy to copy the token to your clipboard.
  • You can provide the access token to your developers or use it yourself to develop your app. To make a call to a HubSpot API endpoint, set the value of the Authorization field to Bearer [YOUR_TOKEN].
  • To get all the contacts created in your HubSpot account, you must use Node.js and axios. The request will look like this:
plaintext
axios.get('https://api.hubapi.com/crm/v3/objects/contacts',  {    headers: {      'Authorization': `Bearer ${YOUR_TOKEN}`,      'Content-Type': 'application/json'    }  },  (err, data) => {    // Handle the API response  } );
  • The private app access tokens are implemented on OAuth. You can also make authenticated calls with your private app access token using any HubSpot client library. If you’re using the Node.js client library, pass your app’s access token to instantiate an OAuth client.
plaintext
const hubspotClient = new hubspot.Client({ accessToken: YOUR_ACCESS_TOKEN });

Create a Dataset and a BigQuery Table

Read the Quickstart guide for bq tool to learn how to use the bq command. Run the following command from your Google Cloud command line:

bq mk hs_data

Create an Empty Table

Run the following command from your Google Cloud command line:

plaintext
bq mk --table --expiration 86400 --description "Contacts table" --label organization:development hs_data.hs_contacts_table

You can upload binary data in AVRO, CSV, ORC, PARQUET, or NEWLINE_DELIMITED_JSON using this BigQuery API.

Load Data Into the BigQuery Table From Your Local Machine

The bq load command will upload data to your dataset, and define schema and data type information. You might have to iterate through the process multiple times to load all your tables into BigQuery.

To load a newline_delimited JSON file contacts_data.json from your local machine into the hs_data’s hs_contacts_table, run the following command:

plaintext
bq load --source_format=NEWLINE_DELIMITED_JSON hs_data.hs_contacts_table ./contacts_data.json ./contacts_schema.json

Since you’re loading local machine files, you must explicitly specify the data format. You can define the contacts schema in the local schema file contacts_schema.json.

It’s best to move HubSpot data at least once or even multiple times a day.

Append to a Table Using a Local File

If you want to load additional data into your BigQuery table from any source files in your local machine, you can run the following command:

plaintext
bq load    --autodetect    --replace    --source_format=NEWLINE_DELIMITED_JSON    hs_data.hs_contacts_table    ./contacts_data.json

This command will load data from the file named contacts_data.json and appends it to the hs_contacts_table in hs_data. The contacts_schema.json defines the schema.

Schedule Recurring Load Jobs

This is a batch workflow that you must schedule so it'll be run on a regular basis. To schedule recurrent loads into a BigQuery table, you can follow these steps:

  • Create a bash shell script

$ sudo mkdir /bin/scripts/ && touch /bin/scripts/backup.sh

Next, add this content to the file.

#!/bin/bash     bq load      --autodetect      --replace      --source_format=NEWLINE_DELIMITED_JSON      hs_data.hs_contacts_table      ./contacts_ data.json

  • From the command line interface, edit crontab.

$ crontab -e

  • You’ll be prompted to edit a file for introducing tasks to be run by cron. You can schedule cron to execute a job at 6 pm daily.

0 18 * * * /bin/scripts/backup.sh

  • Navigate to the directory location of the backup.sh file and make the file executable.

$ chmod +x /bin/scripts/backup.sh

  • Cron will execute the backup.sh file at 6 pm daily.

Method #2: Manual CSV Files

Among the easier options to move HubSpot data to BigQuery is to download and upload CSV files manually. 

You have the option to Import & Export data from your HubSpot account. You can export the following records:

  • Calls
  • Contacts
  • Companies
  • Deals
  • Payments
  • Tickets
  • Custom objects

When you export any records, you can select the file format (XLS, XLSX, CSV). You can also Include only properties in the view or include all properties on records. Upon exporting, you’ll receive an email with a download link to your export file. These download links expire after 30 days.

Once you’ve downloaded the file, you can move it to your Google BigQuery platform.

Pros of moving data from HubSpot to BigQuery with manual CSV files

  • It’s a relatively simple method.
  • There is no entry barrier, and you can do it quickly.

Cons of moving data from HubSpot to BigQuery with manual CSV files

  • Every time there’s an update, you must manually download and move.
  • There is no continuous data flow.

Method #3: Custom Integrations

If your team is up to the task, you can build your own in-house data pipeline to move data from HubSpot to BigQuery. However, this involves spending a considerable amount of time to get the integration up and running. You also need to spend time, resources, and money to manage and maintain data pipelines.

HubSpot and BigQuery constantly change or update their APIs. This might translate into upstream or downstream failures unless properly addressed.

The custom integration option is worthwhile if your data needs are very specific to your business. However, this option is not a scalable process for syncing data to additional SaaS apps. It will require spending more time to build or maintain data pipelines. You must also take care of data security with custom integrations.

Method #4: SaaS Alternatives

While the methods mentioned above work well, the most efficient option in terms of saving time, resources, and costs is SaaS alternatives.

Some of the problems involved in the previous methods include:

  • Time-consuming: Each of the methods mentioned above is time-consuming. Whether it’s the manual method, exporting to CSV, XLS, or XLSX, or building a custom integration.
  • Latency: The problem with using manual CSV files is that any updates need to be manually moved. Every time your source data changes, you’ll have to re-export the CSVs to get real-time data.

The SaaS solution we'll focus on here is Estuary Flow.

You can use Flow to extract data from your HubSpot account and write the data to a range of destinations, including Google BigQuery. Once you deploy the pipeline, it continually operates in real time. The data moves from HubSpot to BigQuery in real time, keeping track of any changes or updates.

Here’s a guide on how you can use Estuary Flow for moving data from Hubspot to BigQuery:

Step 1: Capture the data from your source

Sign in to your Estuary account. If you don’t have one, sign up for free.

hubspot to bigquery - flow home page

Click on Captures. Next, click on the New Capture button.

hubspot to bigquery - new capture

Once you’re on the Captures page, you must enter your source database in the Search Connectors search bar. In this case, it will be Hubspot. After you’ve located the desired source database, click Capture. You’ll be redirected to the selected database’s Capture page. Choose a start data for your capture, and click Authenticate Your Hubspot Account to allow Hubspot and Flow to connect.

Blog Post Image

Click on Next. Flow connects to your Hubspot account and detects all the data resources available there. When you're ready, click on Save and Publish.

Step 2: Set up the data’s destination

There are two ways to set up your data’s destination. You can either click on Materialize Connections in the pop-up following a successful capture, or navigate to the Estuary dashboard and click on Materializations on the left-side pane. Then, click New Materialization.

hubspot to bigquery - new materialization

In this case, BigQuery will be the materialization option to select. 

BigQuery has a few more prerequisites you’ll need to meet before you can connect to Flow successfully. So before you continue, follow the steps here. 

Once that’s done, you can set up your destination as easily as you did with the source. As you did with Hubspot, fill in the required fields and click Next. Use the Collection Selector to add the data captured from Hubspot, if it wasn’t filled in automatically. Then, click Save and Publish. 

For a more detailed set of instructions, see the Estuary documentation on:

Conclusion

HubSpot is a vital component of the tech stack for several businesses. With its many benefits, it empowers you to improve customer relationships and hone your communication strategies. However, when you extract and combine your HubSpot data with other data sources, the business growth potential increases exponentially. Moving your data from HubSpot to BigQuery allows you to reap the benefits of a single source of truth. This, in turn, boosts the growth of your business.

Here are some other interesting reads to move data to BigQuery:

  1. How to move data from MongoDB to BigQuery in Minutes
  2. How to move data from PostgreSQL to BigQuery in Minutes

Start streaming your data for free

Build a Pipeline