
Large-scale businesses generate vast amounts of data daily. For decades, Oracle has been a trusted database that many organizations use to store and manage their enterprise data. However, organizations need ways to use that data beyond the system where it originates in order to derive timely insights and make strategic decisions. This is where Change Data Capture (CDC) comes in.
CDC lets you capture and transfer real-time data changes from Oracle to other platforms, ensuring up-to-date and accessible information. Apache Kafka, a popular event streaming platform, plays a key role in providing reliable and continuous data movement in this CDC process. With Kafka, you can effectively process, store, and distribute Oracle data to various destinations for advanced analytics and business intelligence.
This article explores two effective methods to help your organization with real-time data streaming from Oracle to Apache Kafka. Before getting started, let’s understand the capabilities of both platforms. If you’re ready to set up the streaming workflow, skip to the Oracle to Kafka integration approaches.
A Brief Overview of Oracle
Oracle is a powerful relational database management system (RDBMS) developed by Oracle Corporation, available both on-premises and in the cloud. It is built to store, manage, and process large volumes of business-critical data. With its robust architecture and enterprise-grade features, Oracle has become one of the most widely used RDBMSs across industries; it supports applications requiring scalability, high availability, and fault tolerance.
One of Oracle's notable features is Automatic Storage Management (ASM). It is an integrated, high-performance database file system and disk manager that allows you to streamline storage management in Oracle databases. Instead of requiring manual administration of potentially thousands of database files, ASM automates storage allocation and optimization. This is done by organizing disks into different disk groups, ensuring efficient data distribution.
Oracle utilizes multi-version concurrency control (MVCC) to further improve database performance. MVCC allows multiple users to access the database simultaneously without conflicts, ensuring data consistency in a multiuser environment. Oracle’s advanced indexing techniques, like B-tree, bitmap, and function-based indexes, also enhance efficiency by accelerating data retrieval and optimizing query processing.
A Brief Overview of Kafka
Kafka, or Apache Kafka, is a distributed streaming platform initially developed by LinkedIn and later open-sourced under the Apache Software Foundation. It allows you to publish, store, process, and subscribe to streams of records in real-time. This makes Kafka ideal for building real-time data pipelines and event-driven applications.
Fundamentally, Kafka follows a publish-subscribe messaging model. In this approach, you interact with a Kafka cluster of multiple brokers that manage real-time data with minimal latency. Brokers serve as the backbone of Kafka, storing and delivering data through topics. Topics are logical channels: producers (publishers) write messages to them, and consumers (subscribers) read messages from them in near real-time.
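To make these roles concrete, here is a minimal, hypothetical producer using the confluent-kafka Python client; the broker address and the orders topic are placeholder values.

from confluent_kafka import Producer

# Connect to the Kafka cluster; localhost:9092 is a placeholder broker address.
producer = Producer({"bootstrap.servers": "localhost:9092"})

def delivery_report(err, msg):
    # Invoked once per message to confirm delivery or surface an error.
    if err is not None:
        print(f"Delivery failed: {err}")
    else:
        print(f"Delivered to {msg.topic()} [partition {msg.partition()}]")

# Publish a record to the "orders" topic; any subscribed consumer will receive it.
producer.produce("orders", key="order-42", value='{"status": "created"}',
                 callback=delivery_report)
producer.flush()  # block until outstanding messages are delivered

A consumer would subscribe to the same topic and read these records independently, at its own pace.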
Some key features of Kafka include:
- Exactly-Once Support: Kafka Streams supports exactly-once processing using transactional producers and consumers, so you can transfer and process data between Kafka topics without duplication or data loss (a minimal transactional-producer sketch follows this list). For pipelines that read from or write to external systems, Kafka Connect’s automatic offset management simplifies achieving similar guarantees.
- Geo-Replication: To enhance disaster recovery, Kafka lets you replicate data between geographically distributed Kafka clusters using tools like Kafka MirrorMaker 2.0, which is designed to mirror data across Apache Kafka clusters in real-time.
- Scalability: Kafka scales horizontally through its partitioning mechanism, which splits each topic into separate partitions. Each partition is an ordered, immutable sequence of messages. As more partitions are added to a topic, consumers in a consumer group can read from different partitions in parallel, distributing the workload efficiently.
- High Throughput: By utilizing Kafka, you can handle millions of messages per second with minimal latency, making it suitable for high-speed data streaming. This is achieved using zero-copy technology, which allows direct data transfer from the disk to the network buffer. By bypassing unnecessary memory copies, Kafka reduces I/O overhead, optimizes resource utilization, and increases the message processing rates.
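As a rough illustration of the exactly-once support mentioned above, the sketch below uses the transactional API of the confluent-kafka Python client; the broker address, transactional ID, and topic are placeholders, and end-to-end exactly-once processing between topics is typically handled by Kafka Streams (processing.guarantee=exactly_once_v2).

from confluent_kafka import Producer

producer = Producer({
    "bootstrap.servers": "localhost:9092",
    "transactional.id": "orders-etl-1",  # required to enable transactions
    "enable.idempotence": True,
})

producer.init_transactions()
producer.begin_transaction()
try:
    # All records produced inside the transaction become visible atomically.
    producer.produce("orders-enriched", key="order-42",
                     value='{"status": "created", "region": "EU"}')
    producer.commit_transaction()
except Exception:
    producer.abort_transaction()  # none of the records become visible
    raise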
How to Stream Data from Oracle to Kafka (2 Methods)
For effective Oracle to Kafka streaming, you can use one of the following methods:
- The Easy Way: Automate Oracle to Kafka integration using Estuary Flow.
- The Manual Method: Manually move data from Oracle to Kafka using Kafka Connect JDBC Connector.
Method 1: Automate Oracle to Kafka Integration Using Estuary Flow
Estuary Flow is a real-time data integration tool that enables you to build ETL, ELT, CDC, batch, and streaming pipelines in minutes. With Estuary’s user-friendly interface, you can capture data from various databases, apply transformations using streaming SQL, and deliver the data to your chosen destination.
As you capture data, Flow automatically organizes each stream into a reusable collection with unlimited storage. This durable, append-only transaction log is stored securely in your private account, so you can enforce custom security rules and encryption for complete control over your data.
Here are some of Estuary Flow’s popular features:
- Change Data Capture (CDC): You can perform incremental data loads using Estuary Flow’s end-to-end streaming CDC feature. It allows you to capture changes made to the source and replicate them to a destination. The entire real-time data transfer process executes with a sub-100 millisecond latency.
- No-Code Connectors: By leveraging its 200+ pre-built connectors, Estuary Flow lets you streamline data integration workflows. With these connectors, you can effortlessly connect multiple sources and destinations without writing code, reducing the need for manual pipeline configuration.
- Multi-Cloud Deployment: Estuary Flow provides three deployment modes for secure, scalable, and personalized data integration requirements: Public Deployment, Private Deployment, and BYOC (Bring Your Own Cloud). The Public option is fully managed, the Private option lets you run Estuary’s data infrastructure in your own private environment, and the BYOC plan offers additional control and security by letting you deploy the platform in your own cloud environment.
Let’s see how to stream data from Oracle to Kafka via Estuary Flow. Before you begin, make sure the following prerequisites are in place:
- An active Estuary Flow account.
For Oracle:
- Create a dedicated Estuary Flow user with read-only access to the tables you want to replicate.
- An Oracle database running version 11g or higher.
- Enable connections from Estuary Flow to your Oracle database if they are in separate virtual private clouds.
If you are unable to use LogMiner with your Oracle instance, you may instead want to consider Estuary’s batch Oracle connector. The short check below can help you find out.
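As a quick sanity check, this hypothetical script uses the python-oracledb driver to confirm that the instance is in ARCHIVELOG mode and has supplemental logging enabled, two common requirements for LogMiner-based CDC; the connection details are placeholders.

import oracledb

# Placeholder credentials and DSN (host:port/service_name).
conn = oracledb.connect(user="flow_capture", password="secret",
                        dsn="db.example.com:1521/ORCLPDB1")
with conn.cursor() as cur:
    cur.execute("SELECT log_mode, supplemental_log_data_min FROM v$database")
    log_mode, supplemental_logging = cur.fetchone()
    print(f"LOG_MODE={log_mode}, SUPPLEMENTAL_LOG_DATA_MIN={supplemental_logging}")
    # Expect LOG_MODE to be ARCHIVELOG and supplemental logging to be enabled;
    # otherwise, the batch Oracle connector may be the better fit.
conn.close()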
For Kafka:
- A Kafka cluster with a reachable bootstrap.servers address, an authentication mechanism (such as SASL), and connection security enabled via TLS.
- For the Avro message format, you will also need a schema registry, along with a username and password for authentication (see the example client configuration after these prerequisites).
If you ultimately only want to send your data to a Kafka consumer, consider using an Estuary Dekaf connector instead. This would simplify setup by removing the requirement of maintaining your own Kafka broker and schema registry.
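For reference, the snippet below shows roughly what these connection details look like when expressed as confluent-kafka Python client settings; all values are placeholders, and the exact field names in Estuary’s materialization form may differ.

from confluent_kafka import Consumer
from confluent_kafka.schema_registry import SchemaRegistryClient

kafka_config = {
    "bootstrap.servers": "broker1.example.com:9092",
    "security.protocol": "SASL_SSL",   # TLS-encrypted connection
    "sasl.mechanisms": "PLAIN",        # or SCRAM-SHA-256 / SCRAM-SHA-512
    "sasl.username": "my-kafka-user",
    "sasl.password": "my-kafka-password",
    "group.id": "connectivity-check",
}

# Only required when materializing Avro messages.
schema_registry = SchemaRegistryClient({
    "url": "https://schema-registry.example.com",
    "basic.auth.user.info": "sr-user:sr-password",
})

# Quick connectivity check: list the topics visible to these credentials.
consumer = Consumer(kafka_config)
print(list(consumer.list_topics(timeout=10).topics.keys()))
consumer.close()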
Step 1: Configure Oracle as Your Source
- Sign in to your Estuary Flow account.
- Select the Sources option from the left navigation pane of the Estuary home page.
- From the Sources page, click the + NEW CAPTURE button and search for Oracle using the Search connectors field.
- In the search results, choose the Oracle Database (Real-time, first party) connector and click its Capture button.
- On the Oracle database connector configuration page, you must specify a unique source capture name in the Name field within the Capture Details section.
- Expand the Endpoint Config section and enter the necessary information in the Server Address, User, Password, and Database fields.
- You can also configure advanced options like Backfill Chunk Size, Incremental Chunk Size, and Incremental SCN Range.
- After providing the relevant details, click NEXT > SAVE AND PUBLISH.
Upon completion of configuration, the connector will capture the data from the Oracle database into a Flow collection.
Step 2: Configure Apache Kafka as Your Destination
Once your Oracle data capture is complete, a dialog box will show up with the capture details. You can click the MATERIALIZE COLLECTIONS option in this dialog box to start configuring the destination connector.
An alternative way is to navigate to the Estuary Flow home page and click Destinations > + NEW MATERIALIZATION.
- On the Create Materialization page, type Kafka in the Search connectors box.
- You will see the Apache Kafka Real-time connector in the search results. Click the connector’s Materialization button.
- On the Create Materialization page, specify a unique materialization name in the Name field.
- Expand the Endpoint Config section and enter the required details like Bootstrap Servers, SASL credentials, and Schema Registry information.
- Typically, your Oracle capture data should be automatically added to your Kafka materialization. If not, you can manually link it by clicking the SOURCE FROM CAPTURE button in the Source Collections section and selecting the desired capture.
- Click NEXT > SAVE AND PUBLISH to complete your Kafka materialization configuration.
The Apache Kafka connector will materialize your Oracle data Flow collections into Kafka topics, completing the Oracle to Kafka integration.
Benefits of Using Estuary Flow
Here are some key benefits of using Estuary Flow for Oracle to Kafka streaming:
Simplified Redo Log Management
Oracle redo log files record all database changes to support durability and recovery. However, using redo logs for Oracle CDC to Kafka has several limitations.
As redo logs grow continuously, they consume significant storage. Without proper retention policies, disk usage can balloon, yet CDC still needs to retain enough log history to avoid missing changes. Besides this, redo logs store low-level binary data, requiring tools like Oracle LogMiner to decode changes and extract meaningful information.
How Estuary Flow Helps:
With Estuary’s Oracle source connector, you can simplify redo log management. The connector uses Oracle LogMiner, enabling you to analyze, track, and retrieve database changes from redo log files effortlessly. The captured changes are converted into Flow collections and published to Kafka topics, ensuring a streamlined CDC process without the complexities of manual log handling.
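For a sense of what the connector abstracts away, here is a rough, hypothetical sketch of querying LogMiner directly with python-oracledb. It assumes the relevant redo or archived log files have already been registered with DBMS_LOGMNR.ADD_LOGFILE, the user has the necessary LogMiner privileges, and the SCN range, schema name, and credentials are placeholders.

import oracledb

conn = oracledb.connect(user="flow_capture", password="secret",
                        dsn="db.example.com:1521/ORCLPDB1")
with conn.cursor() as cur:
    # Start a LogMiner session over an SCN window, using the online catalog
    # as the dictionary so SQL_REDO is resolved into readable statements.
    cur.execute("""
        BEGIN
          DBMS_LOGMNR.START_LOGMNR(
            STARTSCN => :start_scn,
            ENDSCN   => :end_scn,
            OPTIONS  => DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG);
        END;""", start_scn=5000000, end_scn=5050000)

    # Each row is one change; SQL_REDO holds the equivalent DML statement.
    cur.execute("""
        SELECT scn, operation, table_name, sql_redo
        FROM v$logmnr_contents
        WHERE seg_owner = :owner AND operation IN ('INSERT', 'UPDATE', 'DELETE')""",
        owner="FLOW_SOURCE")
    for scn, operation, table_name, sql_redo in cur:
        print(scn, operation, table_name, sql_redo)

    cur.execute("BEGIN DBMS_LOGMNR.END_LOGMNR; END;")
conn.close()

In a production CDC pipeline, a loop like this has to run continuously, track its position, and survive log switches, which is exactly the bookkeeping Estuary Flow takes off your hands.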
Efficient SCN-Based CDC Handling
An SCN, or System Change Number, is a unique identifier Oracle assigns to each committed transaction. SCNs help maintain consistent and ordered replication in Oracle CDC workflows.
However, managing SCN-based CDC is challenging: capturing too many changes in a single iteration can overload memory and processing resources, while too small an SCN range may slow down replication and increase latency.
How Estuary Flow Helps:
In Estuary Flow, the Oracle source connector retrieves changes within an SCN range of 50,000 by default. If your Oracle database processes a high volume of events or has very low activity, you can adjust this range using the advanced.incremental_scn_range option. By automatically managing SCN-based CDC, Estuary Flow ensures that Oracle database changes are delivered to Kafka in real-time with minimal latency.
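To illustrate the trade-off, here is a simplified, hypothetical sketch of SCN-windowed reads; read_changes stands in for a LogMiner query like the one sketched earlier, and the SCN values are arbitrary.

def incremental_windows(start_scn: int, current_scn: int, scn_range: int = 50_000):
    """Yield (start, end) SCN windows covering start_scn..current_scn."""
    window_start = start_scn
    while window_start < current_scn:
        window_end = min(window_start + scn_range, current_scn)
        yield window_start, window_end
        window_start = window_end

for lo, hi in incremental_windows(start_scn=5_000_000, current_scn=5_180_000):
    print(f"capture changes in SCN range [{lo}, {hi})")
    # read_changes(lo, hi)  # larger windows use more memory; smaller ones add latency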
Method 2: Manually Move Data from Oracle to Kafka Using Kafka Connect JDBC Connector
Kafka Connect is a data integration framework that facilitates data streaming between Apache Kafka and various external systems like Oracle.
To simplify the Oracle to Kafka data streaming process, you can use platforms like Confluent. It offers connectors for external systems, including Oracle Database, Postgres, and RabbitMQ.
The Kafka Connect approach first loads the existing rows from the Oracle database and then polls for new and updated rows using a timestamp or incrementing column (deletes are not captured). All source data updates are recorded in a Kafka topic, which consumer applications can read to perform event-driven operations.
Before using the Oracle source connector in Confluent, ensure that you satisfy the following prerequisites:
- You must install and configure the Confluent CLI for your Kafka cluster.
- The Oracle database version must be 11.2.0.4 or later and configured with a Pluggable Database (PDB) service name.
- To use Schema Registry-based formats, like Avro, Protobuf, and JSON_SR, you must enable Schema Registry.
- Access to Kafka cluster credentials for authentication.
Step 1: List All the Available Connectors
To check all the available connectors offered by Confluent, open Confluent CLI and run the following command:
confluent connect plugin list
Step 2: Display the Connector Configuration Properties
List the Oracle database source connector’s configuration properties:
confluent connect plugin describe OracleDatabaseSource
The command returns the properties required to configure the connector.
Step 3: Define the Connector Configuration Properties
In this step, create a JSON file named oracle-source.json containing all the connector configuration properties. Here’s an example:
{
"name": "OracleDatabaseSource_0",
"connector.class": "OracleDatabaseSource",
"kafka.auth.mode": "KAFKA_API_KEY",
"kafka.api.key": "<my-kafka-api-key>",
"kafka.api.secret": "<my-kafka-api-secret>",
"topic.prefix": "oracle_",
"connection.host": "<my-database-endpoint>",
"connection.port": "1521",
"connection.user": "<database-username>",
"connection.password": "<database-password>",
"db.name": "db078_pdb1.subnet.vcn.oraclevcn.com",
"table.whitelist": "PASSENGERS",
"timestamp.column.name": "created_at",
"output.data.format": "JSON",
"db.timezone": "UTC",
"tasks.max": "1"
}
You must replace the placeholders with proper access credentials.
- name: Provide a name for the connector.
- connector.class: Identifies the JDBC source connector for Oracle.
- kafka.auth.mode: Specifies the authentication mode, either SERVICE_ACCOUNT or KAFKA_API_KEY. For API key authentication, set the kafka.api.key and kafka.api.secret properties; to use a service account, specify its Resource ID in the kafka.service.account.id property.
- topic.prefix: Provide a topic prefix. Topics are named using the <topic.prefix><tableName> convention and are created with topic.creation.default.partitions=1 and topic.creation.default.replication.factor=3. To use custom properties, create the topics with the desired settings before running this connector.
- output.data.format: Specify the Kafka output data format; valid entries are AVRO, JSON_SR, PROTOBUF, or JSON.
- db.timezone: Specify the database timezone. The default value is UTC.
Step 4: Create the Connector Using the Properties File
To load the configuration properties and initialize the connector, execute:
confluent connect cluster create --config-file oracle-source.json
The command should return a success message.
Step 5: Check the Connector Status
Verify if the connector works as expected by running the following:
confluent connect cluster list
The output is a table listing each connector’s ID, name, status (which should show Running), and type.
Finally, verify that messages are populating the Kafka topic. In a successful pipeline setup, all changes made to the Oracle database will be reflected in the Kafka topic.
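One way to spot-check this is a small consumer, sketched below with the confluent-kafka Python client, pointed at the oracle_PASSENGERS topic (topic.prefix plus table name); the bootstrap server and API credentials are placeholders for your cluster’s values.

from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "<my-bootstrap-server>:9092",
    "security.protocol": "SASL_SSL",
    "sasl.mechanisms": "PLAIN",
    "sasl.username": "<my-kafka-api-key>",
    "sasl.password": "<my-kafka-api-secret>",
    "group.id": "oracle-cdc-check",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["oracle_PASSENGERS"])

try:
    for _ in range(100):                      # read a handful of records and stop
        msg = consumer.poll(timeout=5.0)
        if msg is None:
            break
        if msg.error():
            print(f"Consumer error: {msg.error()}")
            continue
        print(msg.key(), msg.value().decode("utf-8"))  # JSON, per output.data.format
finally:
    consumer.close()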
Why struggle with complex Kafka connectors or manual configurations? With Estuary Flow, you can set up a real-time Oracle to Kafka pipeline in just a few clicks—no coding required!
Get started today! Try Estuary Flow for free or book a demo with our experts to see it in action.
Oracle to Kafka: Comparing Estuary Flow vs. Kafka Connect
Here is a detailed comparison of Estuary Flow and the Kafka Connect JDBC source connector for Oracle to Kafka integration:
Data Integration Setup
Setting up Confluent’s Oracle JDBC source connector involves complex configurations and coding with continuous maintenance overhead.
On the other hand, by leveraging Estuary Flow’s Oracle and Kafka connectors, you do not need to worry about resource management. Once you configure these connectors, Flow ensures real-time data replication from the Oracle database to Kafka topics.
Automation Capabilities
Although building streaming data pipelines using Confluent CLI is effective, it lacks automation capabilities. Manually defining connectors and creating JSON configuration files can lead to errors that can be time-consuming to resolve.
In contrast, Estuary Flow provides robust automation capabilities, helping you directly capture changes from Oracle’s redo logs without manual intervention. This makes Estuary Flow a more efficient and scalable alternative to the JDBC connector.
Extensibility and Data Transformation
Confluent’s Oracle to Kafka connector is highly customizable but requires manual configuration and external tooling for advanced data transformations. You need to write custom Single Message Transforms (SMTs) or use tools like ksqlDB to modify data before pushing it into Kafka topics.
On the other hand, Estuary Flow allows you to perform real-time data transformations via SQL or TypeScript derivations. These derivations transform the data after it is captured and before it is materialized to Kafka topics, supporting operations like aggregations, joins, windowing, and unnesting within data streams.
Conclusion: The Best Way to Stream Data from Oracle to Kafka
Streaming Oracle data to Kafka is essential for facilitating real-time analytics and event-driven software development.
By creating a streaming data pipeline, you can identify changes made to the Oracle database and replicate them to specific Kafka topics. Consumer applications can then act on these updates, for example by sending email notifications to customers who recently completed a user survey.
You can use Kafka Connect with JDBC for streaming Oracle to Kafka. However, it requires manual configuration of connector properties, adding to the setup complexity. Instead, a CDC-enabled solution like Estuary Flow can simplify the process. This platform automates streaming data and change tracking without requiring any additional tools.
Get in touch with the experts at Estuary to explore how this platform can fit into your data streaming workflows effortlessly.
FAQs
Why is Kafka used to stream data from Oracle?
Kafka allows real-time data streaming by efficiently ingesting, processing, and distributing data at scale. It helps in building mission-critical applications, analytics pipelines, and live monitoring systems.
How do schema changes in Oracle impact Kafka streaming?
Schema changes in Oracle can lead to ingestion failures in Kafka Connect if not managed properly. Using a Schema Registry with Avro, Protobuf, or JSON Schema helps handle schema evolution smoothly. With Estuary Flow, schema evolution is handled automatically.
About the author
With over 15 years in data engineering, the author is a seasoned expert in driving growth for early-stage data companies, focusing on strategies that attract customers and users. Their writing provides insights to help companies scale efficiently and effectively in an evolving data landscape.