Whether you are a data analyst, a data engineer, a data scientist, or someone who needs to analyze data and generate some value from it, this article will provide insights on how to connect Google Sheets to Snowflake data warehouse and improve your workflows using two methods. 

Organizations rely heavily on data analysis to make informed decisions in today's data-driven world. Data warehousing and analytics platforms like Snowflake have become increasingly popular due to their scalability, ease of use, security, ease of integration, performance benefits, and cost-effectiveness. 

In modern-day data analysis, data can originate from various sources, including social media, IoT sensors, flat files, APIs, CSV files, and different transactional and analytical databases. Sometimes, data may even be in spreadsheet format, hosted on Google Sheets. Regardless of its source, this data has the potential to deliver significant business value to organizations. Therefore, it is crucial to conduct effective analysis and create value from it.

What is Snowflake? 

Snowflake is a cloud-based data warehouse and analytics platform. Snowflake allows you to organize, store and analyze large amounts of data. This data could be both structured and unstructured. 

Snowflake allows you to analyze and generate insights from all your data in one place. Snowflake is widely used and trusted by several organizations because of its…

Scalability

Snowflake’s architecture is highly scalable. Snowflake can scale with your data. Snowflake is highly effective when your data contains tens of rows to when your data contains millions of rows. This helps you store and analyze huge amounts of data in a fast and optimized manner.

Ease of use

Snowflake is very easy to get started with. It is very beginner friendly. It also has an encompassing documentation which makes it easy to learn.

Security

Snowflake is very secure. Snowflake takes security seriously. It provides data security features such as using AES-256 encryption, role-based access control, and audit control. Snowflake also provides data encryption when your data is in transit from one system to another. Hence, you can rest assured that your data is safe.

Seamless Data Integration: Google Sheets to Snowflake via Kafka

Enhance your data integration capabilities with the power of Kafka. Learn how to effortlessly connect Google Sheets to Snowflake and keep your analytics up-to-date. Discover more about this integration in Estuary's technical blog on Kafka to Snowflake.

Methods to connect & load data from google sheets to snowflake

Method 1: Connect Google Sheets to Snowflake Using Python

Method 2: Using Estuary Flow

Prerequisites

To connect your Google sheets to Snowflake, you need the following:

  1. A Google account. You will need a Google account to manage your Google Sheets. You can create a Google account by clicking here.
  2. A Snowflake account. Since Snowflake is the destination, you would need a Snowflake account where you can make your migrations to. You can start with a free trial here.

Data Source

In this tutorial, you will analyze video game sales data. The data can be downloaded from this link. The data has about sixteen thousand records and ten columns. It contains video game sales from different regions in North America, Europe, Japan, and some other parts of the world.

Method 1: Connect Google Sheets to Snowflake Using Python

In this method, you will use Python to load data from Google Sheets to Snowflake. To do this, you will have to enable public access to your Google Sheets. You can do this by going to File>> Share >> Publish to web.

google sheets to snowflake - publish to web.png

After publishing to web, you will see a link in the format of 

https://docs.google.com/spreadsheets/d/{your_google_sheets_id}/edit#gid=0

In order to read this data, convert it to a dataframe and write to Snowflake, you would have to install some libraries. One of which is pandas and the other is snowflake.connector and PyArrow.

You can install pandas by pip install pandas. You can also install Snowflake connector by running the command pip install snowflake-connector-python. You can install pyarrow by executing the command pip install pyarrow.

You can read the data in your Google Sheets by running the following code.

plaintext
import pandas as pd data=pd.read_csv(f'https://docs.google.com/spreadsheets/d/{your_google_sheets_id}/pub?output=csv')

In the code above, you will replace {your_google_sheets_id} with the id from your spreadsheet. You can preview the data by running the command data.head()

google sheets to snowflake - data.head

You can also check out the number of columns and records by running data.shape

Configuring Snowflake credentials

In your Snowflake account, you will have to configure a data warehouse, a database, a schema and a table. To do this, head over to your Snowflake console and run the following commands.

plaintext
CREATE WAREHOUSE VIDEO_WH; CREATE DATABASE VIDEO_DB; CREATE SCHEMA VIDEO_SCHEMA; CREATE USER VIDEO_USER; CREATE TABLE VIDEO_SCHEMA.VIDEO_TABLE( RANK INT PRIMARY KEY , NAME VARCHAR(200), PLATFORM VARCHAR(20), YEAR VARCHAR(20), GENRE VARCHAR(20), PUBLISHER VARCHAR(20), NA_SALES FLOAT, EU_SALES FLOAT, JP_SALES FLOAT, OTHER_SALES FLOAT, GLOBAL_SALES FLOAT )

Loading the data into Snowflake

To load the data into Snowflake in Python, you would have to make use of the Snowflake connector installed earlier. To do this, you can execute the code block below

import snowflake.connector from snowflake.

plaintext
connector.pandas_tools import write_pandas def write_to_snowflake(data):    conn = snowflake.connector.connect(                    user='username',                    password='password',                    account='account',                    warehouse='VIDEO_WH',                    database='VIDEO_DB',                    schema='VIDEO_SCHEMA'                    )    col=[]    for i in data.columns:        col.append(i.upper())    data.columns=col    write_pandas(conn, data, table_name='VIDEO_TABLE', database="VIDEO_DB",schema='VIDEO_SCHEMA')

When you run write_to_snowflake(data), you will ingest all the data into your Snowflake data warehouse. You can preview the data by

plaintext
SELECT * FROM VIDEO_TABLE LIMIT 10

You can also get the number of records ingested by

plaintext
SELECT COUNT(*) FROM VIDEO_TABLE
google sheets to snowflake - select count video table

You can see that all the data was loaded into the Snowflake data warehouse.  You can then schedule your Python jobs to run every day or every hour using an orchestration tool like Apache Airflow or Prefect. 

This method makes your data liable to security threats because the Google Sheet was made public. I will go through another method of using Estuary Flow to connect your Google Sheets to Snowflake.

Method 2: Connect Google Sheets to Snowflake Using Estuary Flow

In this section, you will learn how to use Estuary Flow to connect Google Sheets to Snowflake.

Step 1: Sign up for an Estuary Flow account

To get started with Estuary Flow, you can create an account by using this link. You can either sign up with your GitHub account or your GMail. After logging in, you will see a welcome page as shown in the image below.

google sheets to snowflake - estuary home

Congratulations!! You have successfully created an Estuary Flow account and taken the first step in connecting Google Sheets to Snowflake.

Step 2: Creating a Service Account

Head over to your Google Cloud account and search for IAM and Admin>> Service Accounts. On the top corner, you will see a button which reads “Create Service Account”.

google sheets to snowflake  - service account

After creating the service account, you will have to create a Key. To do this, click on the service account you just created and go to the Keys tab. You then click on Add Key. When this is done, a JSON key or a P12 key will get automatically downloaded to your local computer. This key is very important because it will serve as a way to authorize your Google Sheets with Snowflake.

google sheets to snowflake - save key

After creating a Key, the next thing you need to do is to attach it to your Google Sheets. In your downloaded Key, you will see a client email. Copy that email and share your Google Sheet with that email address.

google sheets to snowflake - share sheet.

Step 3: Creating a Capture

You can now create a capture in the Flow application by clicking on New Capture.

After clicking on New Capture, you then select your source. In this case, your source connector is Google Sheets.

google sheets to snowflake - capture config

Whilst creating your capture, you specify your capture name, capture details, your spreadsheet link and your Service Account Key. Recall, the Service Account Key was downloaded to your local computer from Google.

When you input these credentials, you can test your capture.

google sheets to snowflake - test capture

From the image above, you can see that the test was successful.

Step 4: Create a Materialization

To create a materialization, you can navigate to the Materialization tab in the Estuary Flow UI and click on Materialization. 

You then specify your destination as Snowflake data cloud.

google sheets to snowflake - snowflake connector

When you select your destination, you will have to configure it by specifying the endpoints, snowflake database name, warehouse name, account name, password, and some other configuration credentials.

google sheets to snowflake - materialization config

You will also have to specify your collection.

After specifying your collection, you then test your materialization.

google sheets to snowflake - test materialization

From the image above, your materialization was successful.

Step 5: Verify the results

If you check the materializations, you will notice that data has already been written to Snowflake.

google sheets to snowflake - active materialization

In this section, you will verify the records in Snowflake.

You can preview the data in Snowflake by running the command 

SELECT * FROM SHEET1

google sheets to snowflake - select * from sheet 1

By default, Estuary Flow creates a table corresponding to the name of your spreadsheet. Flow also creates some default tables such as FLOW_CHECKPOINTS and FLOW_MATERIALIZATIONS.

google sheets to snowflake - video schema.

You can also check the number of records ingested by executing the command

SELECT COUNT(*) FROM SHEET1

google sheets to snowflake - select count.

You can see that all the data from your Google Sheets table was migrated to the Snowflake data warehouse.

This method is much more secure than the previous method because the data was not public and only the Google API had access to your spreadsheet.

Conclusion

In this tutorial, you have learned how to connect your Google Sheets to Snowflake using two methods. Using Estuary, you can be rest assured that your data is secured and is free from black hats. You can also be assured that your data migration is secure. 

Need to build another data pipeline? Check out the full list of data systems Flow can connect

Start streaming your data for free

Build a Pipeline