8 min read

February 26, 2025

Apache Iceberg Copy-On-Write (COW) vs Merge-On-Read (MOR): A Deep Dive

Learn the difference between Iceberg Copy-On-Write (COW) and Merge-On-Read (MOR), with PySpark examples, performance impacts, and best practices.

Dani Pálma Head of Data & Marketing

Share this article

Apache Iceberg is widely adopted for its ability to manage large-scale datasets efficiently while supporting schema evolution and ACID transactions. However, when configuring Iceberg tables, data engineers often face the crucial decision of whether to use Copy-On-Write (COW) or Merge-On-Read (MOR) as their update strategy.

This post explores the fundamental differences between these approaches, with practical PySpark examples demonstrating their impact on query performance, update efficiency, and storage behavior.

If you're looking to get started with Apache Iceberg, check out this article.

Apache Iceberg COW vs MOR: What’s the Difference?

Feature	Copy-On-Write (COW)	Merge-On-Read (MOR)
Write Strategy	Rewrites entire affected files	Creates separate delete/update files
Read Performance	Fast, as data is fully compacted	Slower, as delete files must be applied
Write Performance	Slow, due to full file rewrites	Faster, as only small delete/update files are written
Storage Impact	Higher, due to frequent file rewrites	Lower, as full files are not rewritten
Use Case Suitability	Workloads with frequent reads and infrequent updates	Workloads with frequent updates and fewer reads
Update Mechanism	Full file rewrite on update	Newly updated rows written separately, old data marked as deleted
Delete Mechanism	Full file rewrite on delete	Deletes tracked in separate delete files
Compaction Requirement	Not required frequently	Required periodically to merge delete files
Best For	Read-heavy workloads, analytics, bulk updates	Update-heavy workloads, streaming ingestion, incremental updates

Let's dive deeper into each approach, including real-world PySpark examples.

1. Copy-On-Write (COW) in Apache Iceberg

How Copy-On-Write Works

When a write, update, or delete operation occurs, the affected data files are completely rewritten with the new changes.
This means no additional delete files are created.
Reads are highly optimized since the data is fully compacted.

Pros:

Optimized for frequent reads due to fewer fragmented files.
Efficient for bulk updates where many records are changed at once.
Simple file layout as each rewrite produces a clean version.

Cons:

Slow updates and deletes due to file rewriting overhead.
High write amplification, leading to increased storage usage and I/O costs.

PySpark Example: Creating a COW Table

python
prod_db_nm = "analytics_db"
file_dir = "/tmp/warehouse"
prod_db_dir = "analytics"

spark.sql(f"""
    CREATE TABLE IF NOT EXISTS iceberg_catalog.{prod_db_nm}.customer_cow (
        customer_id BIGINT,
        name STRING,
        age INT,
        signup_date DATE
    )
    TBLPROPERTIES (
        'write.format.default' = 'parquet',
        'write.delete.mode' = 'copy-on-write',
        'write.update.mode' = 'copy-on-write',
        'write.merge.mode' = 'copy-on-write',
        'format-version' = '2'
    )
    LOCATION '{file_dir}/{prod_db_dir}/customer_cow'
""")

Inserting Data into the Table

python
spark.sql(f"""
    INSERT INTO iceberg_catalog.{prod_db_nm}.customer_cow
    VALUES (1, 'Alice', 30, '2024-01-10'),
           (2, 'Bob', 27, '2024-02-15'),
           (3, 'Charlie', 35, '2024-03-20')
""")

Expected Output:

python+-------------+
|num_affected |
+-------------+
|           3 |
+-------------+

Updating a Record in COW Table

Every update rewrites the entire affected file.

python
spark.sql(f"""
    UPDATE iceberg_catalog.{prod_db_nm}.customer_cow
    SET age = 31 
    WHERE customer_id = 1
""")

Expected Output:

python+-------------+
|num_affected |
+-------------+
|           1 |
+-------------+

Checking File Changes

python
spark.sql(f"""
    SELECT file_path, partition FROM iceberg_catalog.{prod_db_nm}.customer_cow.files
""").show()

Expected Output:

python+----------------------+------------+
| file_path            | partition  |
+----------------------+------------+
| /tmp/.../customer_cow/data/0001.parquet | 2024-01-10 |
| /tmp/.../customer_cow/data/0002.parquet | 2024-02-15 |
+----------------------+------------+

Visual Walkthrough: Iceberg COW File Changes

Initial State (Before COW Write)

Snapshot 0: The table currently has only one snapshot (S0), which tracks the table's current state.

Manifest List 1: The manifest list points to a single manifest file (MF-1) that stores metadata about data files.

Manifest File 1: Tracks column statistics (e.g., min/max values) and points to data files.

Data Files:

Partition 1 contains F1 and F2.
Partition 2 contains F3 and F4.

No Updates Yet: The data files remain untouched, and the structure is simple.

Post-Write State (After COW Write)

New Snapshot 1: Iceberg creates a new snapshot (1) that tracks the latest table state.

New Manifest List 2: The new manifest list (2) replaces 1, pointing to the updated files.

New Manifest File 2: A new manifest file (2) is generated, reflecting the updated column statistics.

Rewritten Data Files:

Partition 1's data files are completely rewritten (1' and 2').
Partition 2 remains unchanged since no updates occurred there.

No Delete Files: Unlike Merge-On-Read (MOR), COW does not create delete files; instead, the entire affected data files are rewritten.

Warnings & Edge Cases for COW

Storage Growth: Every update recreates entire data files. Frequent updates will cause storage to grow exponentially.
Concurrency Issues: Ensure no other process modifies the same table simultaneously to avoid unnecessary rewrites.
Partition Awareness: If updates are localized to specific partitions, optimize storage by partitioning effectively.

2. Merge-On-Read (MOR) in Apache Iceberg

How Merge-On-Read Works

Updates and deletes create delete files instead of rewriting entire data files.
During reads, the delete files are applied on the fly, filtering out the outdated records.
This reduces write amplification but can degrade read performance due to extra processing.

Pros

Efficient Writes: Updates and deletes are written as separate files, avoiding the need for full file rewrites, which makes writes faster.
Lower Storage Overhead: Since only small delete/update files are created instead of rewriting entire data files, storage usage grows at a slower rate.
Better for Frequent Updates: Suitable for workloads with continuous updates, such as streaming data ingestion, CDC (Change Data Capture), and real-time ETL.
Flexible Schema Evolution: MOR allows incremental changes without significantly impacting existing data structures.
Reduced Write Amplification: Since only changed rows are written separately, there is less I/O cost compared to full file rewrites in Copy-On-Write (COW).

Cons

Slower Read Performance: Queries must apply delete files dynamically, which increases read latency and can degrade query speed, especially for analytical workloads.
Accumulation of Delete Files: Over time, a large number of delete/update files can accumulate, increasing metadata overhead and making queries slower.
Requires Periodic Compaction: To maintain query performance, Iceberg tables using MOR require background compaction jobs to merge delete files into base data files.
Higher Read Complexity: Queries need to merge data files and delete files dynamically, leading to increased processing overhead.
Potentially High Latency for Large Deletes: If a significant portion of data is deleted frequently, the number of delete files can grow rapidly, making queries inefficient until compaction occurs.

PySpark Example: Creating a MOR Table

python
spark.sql(f"""
    CREATE TABLE IF NOT EXISTS iceberg_catalog.{prod_db_nm}.customer_mor (
        customer_id BIGINT,
        name STRING,
        age INT,
        signup_date DATE
    )
    TBLPROPERTIES (
        'write.format.default' = 'parquet',
        'write.delete.mode' = 'merge-on-read',
        'write.update.mode' = 'merge-on-read',
        'write.merge.mode' = 'merge-on-read',
        'format-version' = '2'
    )
    LOCATION '{file_dir}/{prod_db_dir}/customer_mor'
""")

Deleting a Record in MOR Table

python
spark.sql(f"""
    DELETE FROM iceberg_catalog.{prod_db_nm}.customer_mor 
    WHERE customer_id = 2
""")

Expected Output:

python+-------------+
| num_affected |
+-------------+
|           1 |
+-------------+

Checking File Changes

python
spark.sql(f"""
    SELECT file_path FROM iceberg_catalog.{prod_db_nm}.customer_mor.files
""").show()

Expected Output:

python+-------------------------+
| file_path               |
+-------------------------+
| /tmp/.../customer_mor/data/0001.parquet |
| /tmp/.../customer_mor/data/delete_0001.parquet |
+-------------------------+

Visual Walkthrough: Iceberg MOR File Changes

Initial State (Before MOR Write)

This represents an Iceberg table before any updates or deletions occur. The table has a single snapshot (0) referencing existing data files.

Post-Write State (After MOR Write)

Unlike COW, Merge-On-Read does not rewrite entire files when an update or delete occurs. Instead:

Deleted rows are tracked in delete files.
Newly updated rows are written to new data files.
During reads, delete files are applied to filter out old data.

Here’s what’s happening in more detail:

	Initial State (Before MOR Write)	After MOR Write
Snapshots	Only 0 exists	A new 1 is created
Manifest Lists	Points to 1	A new manifest list 2 is generated
Manifest Files	Tracks original column stats and data files	A new manifest file 2 replaces 1
Data Files	Original files remain unchanged	New updated records written separately
Delete Tracking	No delete files	Delete files (1) are added
Performance Impact	Faster reads, no rewrites	Reads may slow down due to applying delete files

Warnings & Edge Cases for MOR

Read Performance Impact: Queries must filter out delete files, increasing read latency.
Storage Management: Although updates don’t rewrite full files, multiple delete files may accumulate over time.
Compaction: Periodically run Iceberg compaction jobs to merge delete files into base data.

Wrapping Up: Iceberg COW vs MOR

By understanding these trade-offs, you can fine-tune Iceberg tables for optimal performance and cost efficiency. Always benchmark both strategies based on query performance and storage impact before making a decision.

Share this article

Table of Contents

Start Building For Free

About the author

Dani PálmaHead of Data & Marketing

Dani is a data professional with a rich background in data engineering and real-time data platforms. At Estuary, Daniel focuses on promoting cutting-edge streaming solutions, helping to bridge the gap between technical innovation and developer adoption. With deep expertise in cloud-native and streaming technologies, Dani has successfully supported startups and enterprises in building robust data solutions.

Apache Iceberg Copy-On-Write (COW) vs Merge-On-Read (MOR): A Deep Dive

Apache Iceberg COW vs MOR: What’s the Difference?

1. Copy-On-Write (COW) in Apache Iceberg

How Copy-On-Write Works

Pros:

Cons:

PySpark Example: Creating a COW Table

Inserting Data into the Table

Updating a Record in COW Table

Checking File Changes

Visual Walkthrough: Iceberg COW File Changes

Initial State (Before COW Write)

Post-Write State (After COW Write)

Warnings & Edge Cases for COW

2. Merge-On-Read (MOR) in Apache Iceberg

How Merge-On-Read Works

Pros

Cons

PySpark Example: Creating a MOR Table

Deleting a Record in MOR Table

Checking File Changes

Visual Walkthrough: Iceberg MOR File Changes

Initial State (Before MOR Write)

Post-Write State (After MOR Write)

Warnings & Edge Cases for MOR

Wrapping Up: Iceberg COW vs MOR

Start streaming your data for free

About the author

Popular Articles

ChatGPT for Sales Conversations: Building a Smart Dashboard

Why You Should Reconsider Debezium: Challenges and Alternatives

Don't Use Kafka as a Data Lake. Do This Instead.

Streaming Pipelines.

Simple to Deploy.

Simply Priced.