Estuary

Real-Time Data Should Be Central to Every Snowflake AI Architecture

Data drives AI. To be accurate, data must be up to date. But how do you know if it is? Read this article to learn about considerations for the use of AI in the corporate world, and why it makes sense to ensure data in Snowflake is fresh.

Snowflake positions itself as the AI Data Cloud. It provides a rich set of features aimed at successful deployments of AI in the enterprise. By drilling into the details of what can go wrong, this article introduces some of these features and shows the value of real-time data in Snowflake.

Use of AI in the corporate world

The way in which we use data in the enterprise is changing rapidly. Here are a few trends that your organization may have started to explore:

  • Data retrieval based on natural language queries/questions
  • Workflow automation leveraging AI agents
  • Use of personal digital assistants

These trends increase the importance of a strong data infrastructure with Snowflake at its center. Why? Let's dive in.

Natural language query retrieval

Imagine you are about to go on a three-week vacation. Then you realize the deadline to submit your self-evaluation is the Friday before you return. On top of that, just an hour ago, one of your coworkers asked you to return a favor by writing a peer evaluation. What are you going to do?

I think it is fair to assume that many of us will resort to our favorite chatbot (ChatGPT, Claude, Gemini, Grok) for help. You feed it your accomplishments and Generative AI (GenAI for short) returns a well-written evaluation, free of spelling errors, ready to copy/paste and submit. Enjoy your well-deserved vacation!

Modern business intelligence (BI) tools either provide or are working toward a natural language interface. Instead of writing SQL queries, or running a pre-built report/dashboard, you converse with the BI tool using plain English. For example: “What is this year’s revenue to date?” or “What is the cost price of our current inventory?”

ThoughtSpot has Spotter for natural language queries. PowerBI promotes integration with Copilot. Looker provides Conversational Analytics. And so on. Most knowledge workers increase their productivity when they can have a "conversation" with the data, compared to the classic approach of slicing and dicing data through pre-built reports and dashboards.

Consider the next time you need data for a presentation. When you use the AI interface on your BI tool, will you dig into the accuracy of the data? For how long? What about teams that don’t have data backgrounds? Can they trust the data? What can go wrong?

Workflow automation leveraging agents

Organizations have started taking advantage of workflow automation through AI agents across industries and across domains. Bain recently published survey results indicating AI use cases within companies are moving into production "far more rapidly in just three years than anything seen in previous technology waves."

For example, an agent can perform background research and craft messages to help with sales outreach. Agents attend meetings to take notes, relate meeting topics to internal documents and perform risk analyses, etc.

With the help of agents we can do more. We get better at what we do. It becomes rarer that we forget to follow up. What can go wrong?

Use of personal digital assistants

One of the latest trends in leveraging AI is the use of personal digital assistants such as OpenClaw, Claude Cowork, or Perplexity's Comet browser. In the most extreme case the AI assistant becomes part of your every workflow, fundamentally changing the way you use your computer as a knowledge worker.

Examples:

  • Hey <digital assistant>, I am about to have a meeting with my boss. What did we discuss last time? What are my accomplishments in the past week? Is there any task planned for this week where I could use his/her help?
  • We are about to meet with the decision maker for this important opportunity. Can you summarize the evaluation process to date, and prepare no more than five slides for a 10-minute presentation?
  • Please list any emails requesting my approval. What are the top three items on my to-do list that I must accomplish this week?

As you can see from these examples, the AI assistant is rapidly becoming a resource driving daily tasks and priorities. What can go wrong?

What can go wrong?

As we increase the use of generative AI in our daily work we must keep our eyes wide open to the risks this introduces. Careless use of AI can lead to embarrassment and fines, reputational damage (and more fines), costly mistakes, and overall slop. In fact, a bad AI deployment strategy can be very expensive.

Accessing the wrong data

Think about your work environment. What do you know about your company or work environment that is not clearly defined? Do you use acronyms? From a data perspective, what are the definitions?

GenAI makes a lot of assumptions when, without them, it could not provide an answer or complete a task. Even though GenAI improves daily, it is still relatively rare for a chatbot to respond with "I don't know", or to ask clarifying questions, instead of giving you an answer.

You may have attended classes on the ever-evolving recommendations for prompt engineering: how to interact with GenAI to lower the likelihood of an incorrect response or action, or of a hallucination produced in its eagerness to answer a question or complete a task.

To illustrate how errors could arise, here are a couple of examples:

| Question/Query | What can go wrong? |
| --- | --- |
| How does this year’s revenue compare to last year’s on the same date? | What is the definition of revenue? Is it recurring revenue? Annually recurring? “Year” likely refers to the fiscal year. How is it defined? |
| What is the expected spend on AI tokens by the end of the month? | What is the baseline? Do you look at the same month last year and extrapolate (grow) from there? Do you consider average daily usage? But what if the month is December? |
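
To make the baseline question concrete, here is a minimal sketch of how two equally reasonable readings of "expected spend by the end of the month" produce different answers. All numbers, the 50% growth rate, and the function names are hypothetical and chosen only for illustration.

```python
from datetime import date
import calendar


def forecast_from_daily_average(spend_to_date: float, today: date) -> float:
    """Extrapolate month-end spend from the average daily spend so far."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    return spend_to_date / today.day * days_in_month


def forecast_from_last_year(same_month_last_year: float, yearly_growth: float) -> float:
    """Grow last year's total for the same month by an assumed growth rate."""
    return same_month_last_year * (1 + yearly_growth)


# Hypothetical inputs, for illustration only.
today = date(2026, 12, 10)
by_average = forecast_from_daily_average(spend_to_date=4_000.0, today=today)
by_last_year = forecast_from_last_year(same_month_last_year=6_000.0, yearly_growth=0.5)

print(f"Daily-average baseline: {by_average:,.0f}")
print(f"Last-year baseline:     {by_last_year:,.0f}")
```

With these made-up inputs the two baselines differ by more than a third. In a month like December, where usage in the last weeks may look nothing like the first ten days, the daily-average reading is likely to be the weaker one, but an AI answering the question has to pick a baseline unless the definition is written down somewhere.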

Regarding access to company data, one of the most important factors in determining whether a question results in the wrong query is the quality of the semantic model, i.e. the description of the data. Where are these descriptions? Is there any duplication? Are descriptions consistent?
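
The duplication and consistency questions can be checked mechanically once the descriptions are collected in one place. The sketch below assumes a hypothetical list of (term, source, description) entries gathered from different tools; the terms, sources, and wording are invented for illustration and do not reflect any particular product's format.

```python
from collections import defaultdict

# Hypothetical definitions of the same business terms, collected from
# different places (a BI tool and an internal wiki page).
definitions = [
    ("revenue", "bi_tool", "sum of invoiced amounts, net of refunds"),
    ("revenue", "wiki", "annual recurring revenue"),
    ("fiscal_year", "bi_tool", "starts on February 1"),
    ("fiscal_year", "wiki", "starts on February 1"),
]


def find_conflicts(rows):
    """Return terms that have more than one distinct description."""
    by_term = defaultdict(set)
    for term, _source, description in rows:
        by_term[term].add(description.strip().lower())
    return {term: sorted(descs) for term, descs in by_term.items() if len(descs) > 1}


conflicts = find_conflicts(definitions)
for term, descs in conflicts.items():
    print(f"{term}: {len(descs)} competing definitions")
```

Here "fiscal_year" is duplicated but consistent, while "revenue" has two competing meanings. A question about revenue could therefore be answered correctly according to one source and incorrectly according to the other.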

In the spring of 2025 Snowflake introduced semantic views to store such definitions; Cortex Analyst, its own "text to SQL" generator, runs on top of them. Recently, Snowflake unveiled Semantic View Autopilot to dramatically reduce the effort organizations must put into designing their semantic models. And with the semantic model in Snowflake, any access to its data can take advantage of the definitions.

BI tools that leverage Snowflake's semantic views, including Sigma, Hex, and Tableau, benefit from this feature.

Governance concerns

As you use AI, where does the data go? What is your level of trust that your data won't be exposed? Will your GenAI provider use the data you share with AI for future LLM training and if so, does your company agree to this? What access does AI have to company data anyway?

Questions like these are governance questions. Generally organizations that are more regulated (e.g. finance, government), have more of their competitive advantage locked up in intellectual property (e.g. high tech manufacturing, pharmaceuticals), or deal with sensitive data (e.g. healthcare) will have more governance concerns.

For the use of GenAI in a corporate setting, there is a vast difference in considerations between the use of a personal, ad-supported subscription to ChatGPT, versus a company-provided system behind the firewall. Cost is of course a major consideration.

Companies using Snowflake build on top of its security features, including Role-Based Access Control (RBAC), access policies and audit logging. And with the data already residing in Snowflake, i.e. a level of trust has already been established, you could run AI workloads directly in Snowpark, either called by external agents or with agents residing in Snowflake.
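
As a rough illustration of what role-based access control and audit logging mean for an AI agent, consider the toy model below. It is a sketch of the general idea only, not Snowflake's implementation or API; the class, role, table, and agent names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AccessControl:
    """Toy role-based access control with an audit trail (not Snowflake's API)."""
    role_grants: dict[str, set[str]]   # role -> tables the role may read
    user_roles: dict[str, set[str]]    # user or agent -> roles it holds
    audit_log: list[tuple[str, str, bool]] = field(default_factory=list)

    def can_read(self, principal: str, table: str) -> bool:
        roles = self.user_roles.get(principal, set())
        allowed = any(table in self.role_grants.get(role, set()) for role in roles)
        self.audit_log.append((principal, table, allowed))  # record every attempt
        return allowed


# Hypothetical setup: an outreach agent holds only the sales analyst role.
acl = AccessControl(
    role_grants={"sales_analyst": {"opportunities"}, "hr_admin": {"salaries"}},
    user_roles={"outreach_agent": {"sales_analyst"}},
)
print(acl.can_read("outreach_agent", "opportunities"))
print(acl.can_read("outreach_agent", "salaries"))
```

The point of the sketch is that an agent is treated like any other principal: it sees only what its roles allow, and every attempt, granted or denied, leaves a record that can be reviewed later.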

Outdated data

It would be an understatement to say that software technology companies have started to pay attention to AI. Everybody wants to capitalize on the AI craze. Is a SaaSpocalypse on the horizon?

Salesforce is pouring $300M into AI tokens in 2026. But Salesforce is also the largest SaaS platform all-in on leveraging AI. Einstein was introduced as early as 2016. Salesforce's annual conference in 2024 centered around the introduction of Agentforce. And in April 2026 Salesforce introduced Headless 360, exposing its platform's core capabilities through API, MCP (Model Context Protocol) and CLI, ready for any agents to use.

A third consideration is whether AI has access to the latest information.

The best way to ensure you access the latest data is of course to go directly to the source. If Salesforce data is all you need then why not use the new Headless 360? In practice, and depending on your role, you use data from multiple systems. Does every system provide agentic access? And even if it did, is it practical to retrieve all data on demand for AI to process? For decades the industry debated whether centralized data access should replace federated access.

Does anyone remember the lessons from Bill Inmon ("Building the Data Warehouse", 1992) or Ralph Kimball (dimensional modeling in "The Data Warehouse Toolkit", 1996)? Data volumes matter. Speed of data access matters (e.g. rate limits). Access controls matter. Governance matters.
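
A back-of-the-envelope calculation shows why volumes and rate limits matter for on-demand retrieval. The figures below (5 million records, 200 records per call, 100 calls per minute) are hypothetical and do not describe any specific system's limits.

```python
import math


def hours_to_fetch(total_records: int, records_per_call: int, calls_per_minute: int) -> float:
    """Lower bound on the time to pull a dataset through a rate-limited API."""
    calls_needed = math.ceil(total_records / records_per_call)
    return calls_needed / calls_per_minute / 60


# Hypothetical numbers: 5 million records, 200 per call, 100 calls per minute.
print(f"{hours_to_fetch(5_000_000, 200, 100):.1f} hours")
```

Under these assumptions a single full pull takes over four hours before any processing starts, which is hard to reconcile with an agent that is expected to answer in seconds.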

Of course if an agent has to update data it must access the source.

Snowflake started as a data warehouse platform, naturally storing consolidated data. It evolved into the AI Data Cloud. But is its data up to date? What is the frequency of data updates? Daily? Hourly? How volatile is the data you need, and, well, what can go wrong if data is outdated? 

Imagine an AI agent decides to produce more products than needed only because it is looking at yesterday’s inventory numbers. Or imagine automated prospect outreach that does not take into consideration that we just closed a deal with a now-customer.

AI needs up-to-date business information, and with AI running against data in Snowflake, its data must be up to date, as close to real-time as possible.
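
One simple way to act on this is to have an agent check how old the data is before it makes a decision. The sketch below assumes the load time of a table is known and that a five-minute lag is acceptable; both the timestamps and the threshold are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def is_fresh(last_loaded_at: datetime, max_lag: timedelta, now: datetime) -> bool:
    """True when the data was loaded recently enough to act on."""
    return now - last_loaded_at <= max_lag


# Hypothetical scenario: the same inventory table fed by a nightly batch
# versus a streaming pipeline.
now = datetime(2026, 5, 4, 15, 0, tzinfo=timezone.utc)
nightly_load = datetime(2026, 5, 4, 2, 0, tzinfo=timezone.utc)
streaming_load = now - timedelta(seconds=5)
max_lag = timedelta(minutes=5)

for label, loaded_at in [("nightly batch", nightly_load), ("streaming", streaming_load)]:
    action = "act" if is_fresh(loaded_at, max_lag, now) else "hold and refresh first"
    print(f"{label}: {action}")
```

The acceptable lag depends on how volatile the data is: a five-minute threshold may be reasonable for inventory and far too strict for, say, a list of office locations.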

Conclusion

AI is becoming part of the daily work of white-collar workers. Whether we get the most out of it depends largely on the extent to which AI can access up-to-date corporate data. Snowflake is at the forefront of giving organizations access to data in a trusted and governed environment, with features that enable successful AI adoption. Leverage these in your organization.

About the author

Mark Van de Wiel

Mark Van de Wiel was the Field CTO at Fivetran from 2022 through 2025, where he guided enterprise customers in optimizing their data integration strategies. Mark joined Fivetran in 2021 through the acquisition of HVR, where he led US operations and played a key role in scaling the business. His prior experience includes technical leadership roles at Oracle, Actian and GoldenGate Software. Today, Mark is building software using the latest and greatest that AI provides.
