
In addition to BigQuery, both Databricks and Snowflake have introduced pipe-style SQL syntax that aims to improve query readability and development experience. While they share a visual resemblance (using pipe-like chaining), their purpose, scope, and design philosophy are quite different.
Introduction to Pipe and Flow Syntax
SQL has historically followed a rigid structure that often conflicts with the logical way humans think through queries. This can make SQL verbose and difficult to read, especially in complex analytics scenarios involving multiple subqueries or transformations. To improve readability and usability, Databricks, Snowflake, and Google BigQuery have introduced alternative syntactic models that reimagine how SQL queries can be written.
Databricks' pipe syntax (|>) allows SQL queries to be written in a linear, transformation-first style, similar to the way DataFrame APIs work in PySpark or Pandas. Snowflake's flow operator (->>) enables chaining of full SQL statements, offering a scripting-like model that maintains SQL's declarative nature while supporting procedural execution. Google BigQuery's pipe syntax (|>) closely resembles Databricks' approach, supporting linear, composable queries with additional operators like EXTEND, SET, and DROP.
Design Philosophies and Mental Models
The key differences among the three syntaxes reflect their underlying design philosophies:
- Databricks |>: Focused on simplifying transformations within a single query block. Inspired by DataFrame-style operations.
- Snowflake ->>: Targets SQL workflow scripting with full control over execution order and intermediate result usage.
- BigQuery |>: Expands on Databricks' model with broader operator support and stronger aliasing capabilities (see the sketch below).
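As a hedged illustration of that aliasing support, GoogleSQL's pipe syntax includes a dedicated |> AS operator for naming the intermediate result; the table and column names here are hypothetical:

```sql
-- Alias the piped table so later steps can qualify its columns.
FROM mydataset.Produce
|> AS p
|> WHERE p.sales > 0
|> SELECT p.item, p.sales;
```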
Each syntax represents a different mental model for SQL development:
- Databricks and BigQuery support a declarative, transformation-first approach.
- Snowflake supports a procedural, statement-by-statement model.
Databricks Pipe Syntax (|>) – Key Concepts
- Inspired by: DataFrame APIs (e.g., PySpark, Pandas)
- Primary Benefit: Readability and clarity by mimicking logical transformation steps
- Structure Example:
```sql
FROM lineitem
|> WHERE price < 100
|> SELECT country, price * quantity AS total
|> AGGREGATE SUM(total) AS grand_total GROUP BY country
```
Operator Highlights:
- AGGREGATE: Combines aggregation and GROUP BY, reducing duplication
- EXTEND: Appends computed columns
- SET, DROP, RENAME: Modify columns of the intermediate result inline (see the sketch after this list)
- Can be composed incrementally; perfect for exploratory or iterative analytics
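A minimal sketch of these operators chained together, assuming the operator forms documented for GoogleSQL and Spark 4.0 pipe syntax; the table and column names are hypothetical:

```sql
FROM orders
|> EXTEND price * quantity AS line_total  -- append a computed column
|> SET status = UPPER(status)             -- overwrite an existing column in place
|> DROP internal_note                     -- remove a column from the intermediate result
|> RENAME line_total AS revenue;          -- rename a column for downstream steps
```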
Adoption: Available in Apache Spark 4.0+ and Databricks Runtime 16.2+. Also used in GoogleSQL (BigQuery) and ZetaSQL.
Snowflake Flow Operator (->>) – Key Concepts
- Inspired by: Unix-style piping and scripting logic
- Primary Benefit: Simplified execution of multi-statement scripts
- Structure Example:
```sql
SHOW WAREHOUSES
->> SELECT "name", "state", "size" FROM $1
->> SELECT * FROM $2 WHERE "state" = 'STARTED';
```
Capabilities:
- Supports all SQL statement types, including DDL, DML, SHOW, and more (see the sketch after these lists)
- Uses $1, $2, etc., to reference earlier results: $1 is the first statement's output, $2 the second, and so on (valid only in FROM clauses)
- Ideal for scripting-style logic with sequential dependencies
Limitations:
- No reuse outside the chain
- No positional bind variable support
- Only available within Snowflake Scripting blocks or interactive sessions
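As a hedged sketch of that flexibility, a single chain can mix DDL, DML, and a final query; the table, columns, and values below are purely illustrative:

```sql
-- Create a scratch table, load it, then query it, all in one chain.
CREATE OR REPLACE TEMP TABLE tmp_orders (id INT, amount NUMBER)
->> INSERT INTO tmp_orders VALUES (1, 250), (2, 90)
->> SELECT * FROM tmp_orders WHERE amount > 100;
```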
Example Use Case Comparison
Goal: Find sales totals per item category, then filter by total sales.
Databricks |>
```sql
FROM sales_data
|> AGGREGATE SUM(amount) AS total_sales GROUP BY category
|> WHERE total_sales > 1000
|> ORDER BY total_sales DESC;
```
Snowflake ->>
```sql
SELECT category, SUM(amount) AS total_sales FROM sales_data GROUP BY category
->> SELECT * FROM $1 WHERE total_sales > 1000
->> SELECT * FROM $2 ORDER BY total_sales DESC;
```
BigQuery |>
```sql
-- Compute total sales by item, then keep and rank the big sellers.
FROM mydataset.Produce
|> WHERE sales > 0
|> AGGREGATE SUM(sales) AS total_sales, COUNT(*) AS num_sales
   GROUP BY item
|> WHERE total_sales > 1000
|> ORDER BY total_sales DESC;
```
Community Commentary & Critiques
Franck Pachot and other SQL experts have offered nuanced critiques:
Databricks-style Pipe Criticisms:
- Misleads developers into thinking SQL is a streaming language (like Unix pipes)
- Breaks traditional declarative abstraction of SQL
- Encourages bad habits like omitting table aliases
- May blur the line between the author's written order of steps and the order the optimizer actually executes
Snowflake-style Flow Criticisms:
- Less intuitive for simple, one-off queries
- Verbose referencing via $n can get messy in long chains
- Doesn’t solve SQL verbosity; more about structuring large scripts
Databricks’ pipe syntax enhances in-query readability; Snowflake’s flow operator supports cross-statement orchestration. They address different problems and are not functionally interchangeable.
GoogleSQL and BigQuery Context
- GoogleSQL pipe syntax (|>) closely mirrors Databricks’ structure
- Used in BigQuery, Spanner, Procella, and ZetaSQL
- Adds additional operators like EXTEND, SET, DROP, RENAME
- Prioritizes developer ergonomics and auto-completion friendliness
- Considered part of a broader shift toward linear, expressive SQL construction
Example:
```sql
-- Compute total sales by item.
FROM mydataset.Produce
|> WHERE sales > 0
|> AGGREGATE SUM(sales) AS total_sales, COUNT(*) AS num_sales
   GROUP BY item;
```
When to Use Which
Use Databricks Pipe Syntax if:
- You are building complex SELECT queries with nested logic.
- You prefer a linear, DataFrame-like mental model.
- You need improved readability within query blocks.
Use Snowflake Flow Operator if:
- You need to orchestrate a series of dependent SQL statements.
- Your use case involves DDL, DML, or administrative commands.
- You want to script a multi-step workflow within SQL.
Use BigQuery Pipe Syntax if:
- You are writing ad-hoc or exploratory analytics queries.
- You value stepwise readability and rich transformation operators.
- You work within the GoogleSQL ecosystem and want consistent patterns.
Final Thoughts
The introduction of pipe and flow syntax across Databricks, Snowflake, and BigQuery reflects a broader shift in SQL usability and expressiveness. Each syntax extension targets a different set of developer needs: Databricks and BigQuery improve how we write and maintain single queries, while Snowflake enhances how we orchestrate SQL workflows.
Understanding the distinct capabilities of each syntax will help teams adopt the right approach for their use cases. Whether you're optimizing for readability, reusability, or procedural clarity, modern SQL has a syntax to match.
FAQs
1. Can I mix pipe or flow syntax with traditional SQL in the same query?
In Databricks and BigQuery, yes: pipe operators can be appended to a query written in standard syntax, and a pipe-syntax query can appear anywhere a standard query is allowed, such as a subquery or view definition. Snowflake's ->> chains complete statements, each of which is written in ordinary SQL.
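A brief sketch of that mixing in pipe syntax, with a standard SQL prefix and pipe operators appended; the table is hypothetical:

```sql
-- Standard SELECT, refined with pipe operators.
SELECT category, amount
FROM sales_data
|> WHERE amount > 0
|> AGGREGATE SUM(amount) AS total_sales GROUP BY category;
```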
2. Is there any performance difference when using pipe or flow syntax instead of traditional SQL?
Generally no. Pipe syntax is an alternative surface syntax that resolves to the same query plans as equivalent standard SQL, and each statement in a Snowflake ->> chain runs just as it would if executed on its own.
3. Which syntax should I choose for building complex SQL workflows?
For multi-statement workflows that involve DDL, DML, or administrative commands, Snowflake's flow operator is the better fit. Databricks and BigQuery pipe syntax target readability within a single query rather than cross-statement orchestration.

About the author
Dani is a data professional with a rich background in data engineering and real-time data platforms. At Estuary, Dani focuses on promoting cutting-edge streaming solutions, helping to bridge the gap between technical innovation and developer adoption. With deep expertise in cloud-native and streaming technologies, Dani has successfully supported startups and enterprises in building robust data solutions.
