Implementing the CQRS Design Pattern with Debezium for Seamless Data Replication

November 28, 2025 by Vojtěch Juránek

Command Query Responsibility Segregation (CQRS) is a design pattern that separates read and write operations to and from a database. It can also be used to aggregate data from multiple sources into a single, centralized data store. Although commonly used in microservice architectures, this pattern is not limited to them. In this blog post, we will explore how to implement CQRS in different setups and how Debezium can simplify its implementation.

Understanding the CQRS Design Pattern

In traditional architectures, both read and write operations are handled by the same application, sharing a unified database and data model. As applications scale, this approach can become a bottleneck due to varying performance characteristics of reads and writes, potential lock contention, and increased database load from diverse field requirements. A shared database also often necessitates elevated permissions, introducing security risks.

Microservice architectures commonly employ the Database per Service pattern, where each microservice manages its own database. However, workflows and user interfaces frequently require combining data from multiple microservices. This can be challenging, leading to delayed responses if a single service is slow. An aggregating database that continuously collects data from all microservice databases and serves as a read-only store for complex queries often proves more efficient.

All these challenges can be addressed with the CQRS design pattern. CQRS separates reads and writes into distinct data models, stored in separate databases, and handled by different parts of the application—or even entirely separate microservices. The following diagram illustrates this concept:

Benefits and Challenges

The database serving read operations is typically configured as read-only, which reduces the attack surface and improves system security. This read database can be a replica of the write database or a completely different type of database, optimized for read queries (e.g., a time-series database) or designed to aggregate data from multiple sources. While both read and write data models can reside within the same database instance, using separate databases is far more common. This allows each database to be tuned and scaled independently according to its specific workload.

In a microservice environment, different teams often develop and deploy microservices independently. Synchronizing data schemas and managing format changes can be complex. The CQRS pattern mitigates this by decoupling data models between services, allowing each service to evolve its schema independently without tight coordination across teams.

Of course, adopting CQRS also introduces challenges, primarily increased complexity in deployment and data management. Replicating data between databases often leads to an eventual consistency model, meaning newly written data might not be immediately available on the read side. However, in most microservice architectures, this short delay is acceptable and rarely causes significant issues.

Possible Deployment Architectures

The main challenge in implementing the CQRS pattern is replicating data from the write database to the read database. There are three typical deployment scenarios:

  1. Same Database Instance: Both data models exist in the same database. This is simple to implement as no replication is needed, but it forfeits most CQRS benefits and is rarely used in production.
  2. Same Database Type, Built-in Replication: Using the database's native replication capabilities (if supported). This is a common setup, offering robust configuration options, including strong consistency if required.
  3. Heterogeneous Environments, Third-Party Replication: When native replication isn't sufficient (e.g., different database types, aggregating multiple sources), a third-party tool is needed. Change Data Capture (CDC) with Debezium offers a convenient and reliable approach, capturing real-time database changes and propagating them to target databases.

In this blog post, we will focus on the latter two approaches: replication via database-native mechanisms and replication using Debezium.

The Demo Application

Use Case

To demonstrate data replication within the CQRS pattern, we'll use a simple voting application. Users can create polls with options and cast votes via a REST API. The application also provides a REST API for viewing poll results, which are displayed in a simple web UI.

The application is split into two microservices, each with its own database. The first microservice exposes a REST API for creating polls and casting votes, writing data to its database. This poll and voting data is then replicated to a second database instance, which serves the second microservice responsible for displaying poll results.
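To give a feel for the write-side API (the endpoint path and field names below are hypothetical illustrations, not taken from the demo code), creating a poll might accept a JSON payload along these lines:

```json
{
  "question": "What is your favorite database?",
  "options": ["PostgreSQL", "MySQL", "QuestDB"]
}
```

Casting a vote would then reference a poll and one of its options, while the reader microservice exposes a separate endpoint returning aggregated counts.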

Application Source Code

You can find the application source code in the Debezium example repository. The application uses the Quarkus framework and its REST module for REST APIs, along with Panache and Hibernate ORM for persistence. The first microservice uses a one-to-one mapping between entities and database tables, with Panache handling most persistence logic. The second microservice, which executes custom SQL queries to read poll results, uses Hibernate directly.

Integrating the UI is also simple with Quarkus. The application leverages Quarkus Quinoa to integrate a lightweight Node.js web application using ReactJS. The frontend periodically polls the REST API for the latest results and updates the displayed chart. Quarkus keeps the codebase compact and significantly reduces boilerplate.

CQRS with Database Built-in Replication

The simplest CQRS implementation is when your database offers built-in replication. Our example uses PostgreSQL, which supports streaming replication. Schematically, our demo application looks like this:

To enable streaming replication between two PostgreSQL instances, we first create a publication on the primary instance and a replication role with SELECT rights on the necessary tables:

CREATE PUBLICATION poll_pub FOR ALL TABLES;
CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'replicator';
GRANT SELECT ON ALL TABLES IN SCHEMA public TO replicator;

For simplicity, we granted SELECT rights on all tables. In production, these privileges should be limited to only the tables requiring replication.

Because Hibernate may delete rows during update operations, we also need to replicate DELETE operations. For logical replication to identify the affected rows, we set the replica identity of the synchronized tables to FULL, so that UPDATE and DELETE events carry the complete old row (necessary in particular for tables without a suitable primary key):

ALTER TABLE OptionEntity REPLICA IDENTITY FULL;
ALTER TABLE PollEntity REPLICA IDENTITY FULL;
-- ... and other tables if needed

Since Hibernate can dynamically create tables, you might also want to grant SELECT permissions to the replication role for any future tables:

ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO replicator;

However, this is not required in our demo application, as all tables are created in advance.

On the replica database, you simply subscribe to the publication created on the primary database using the appropriate credentials:

CREATE SUBSCRIPTION poll_sub
    CONNECTION 'dbname=postgres host=postgres_primary user=replicator password=replicator'
    PUBLICATION poll_pub;

After this setup, all changes will automatically replicate from the primary database (serving the writer microservice) to the replica database (used by the reader microservice). This is a straightforward and reliable solution that requires no third-party tools, serving well until you need to integrate multiple data sources or use a different database type for the reader microservice.
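If replication does not appear to work, PostgreSQL's built-in statistics views are a good first place to look. For example:

```sql
-- On the primary: connected replication clients and their replay position
SELECT application_name, state, replay_lsn FROM pg_stat_replication;

-- On the replica: status of the subscription worker
SELECT subname, received_lsn, latest_end_time FROM pg_stat_subscription;
```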

CQRS in Heterogeneous Environments

In more heterogeneous environments, where different database technologies are used for the write and read sides, built-in replication mechanisms are usually insufficient. You might need to replicate data between different types of databases or aggregate data from several systems into a single, read-optimized store. In such cases, a Change Data Capture (CDC) solution like Debezium becomes extremely useful.

Debezium continuously monitors changes in a source database by reading its transaction log and publishes these changes to a user-configured destination. In our demo, we'll publish events to Apache Kafka. From Kafka, the data can be written into another database or system using a suitable Kafka Connect sink connector. For databases supporting JDBC inserts, the JDBC sink connector can be used; for MongoDB, the MongoDB sink connector is available. For other databases that don’t support JDBC inserts, you typically rely on custom sink connectors provided by the database vendor.

However, Debezium usage is not limited to the Kafka Connect runtime. Besides using a standalone Debezium server, Debezium can be embedded into a custom microservice using the Debezium embedded engine. Another option is to consume CDC events leveraging the Debezium extension for Quarkus. This is perhaps the easiest way to consume CDC events in your microservice, requiring just simple configuration and annotating appropriate consumer methods. Nevertheless, in all these cases, you still have to implement the sink part (writing the captured data to the target database) yourself.

The following diagram illustrates this setup:

For simplicity, our demo application will include Kafka Connect as the Debezium runtime. We will show two examples of using Debezium for data replication:

  1. Replicating data from PostgreSQL to MySQL using the Debezium JDBC Sink Connector.
  2. Replicating data from PostgreSQL to QuestDB using the QuestDB Kafka Sink Connector.

Replicating from PostgreSQL to MySQL

This setup demonstrates using Debezium with Kafka Connect to replicate data between two relational databases where the target database supports the JDBC API.

The source side uses the Debezium PostgreSQL connector with a minimal configuration specifying connection details and the tables to monitor:

{
  "name": "postgres-source",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "tasks.max": "1",
    "database.hostname": "postgres",
    "database.port": "5432",
    "database.user": "postgres",
    "database.password": "postgres",
    "database.dbname": "postgres",
    "topic.prefix": "polls",
    "table.include.list": "public.optionentity,public.pollentity,public.pollentity_optionentity,public.pollentity_voteentity,public.voteentity"
  }
}

This configuration captures all changes in the specified tables and publishes them to Kafka topics under the prefix polls.
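For illustration, a change event produced for a newly inserted row has roughly the following shape (payload only; field values are shortened and the column names here are illustrative, and the exact envelope depends on converter settings):

```json
{
  "payload": {
    "before": null,
    "after": { "id": 1, "question": "What is your favorite database?" },
    "source": { "connector": "postgresql", "schema": "public", "table": "pollentity" },
    "op": "c",
    "ts_ms": 1700000000000
  }
}
```

The `before`/`after` pair, the `source` block, and the `op` field (`c` for create, `u` for update, `d` for delete) are what Debezium-aware sinks use to reconstruct the change on the target side.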

On the sink side, we use the Debezium JDBC Sink Connector to insert these records into a MySQL database. Since the JDBC sink understands Debezium’s event format, no Single Message Transformation (SMT) is required.

{
  "name": "mysql-sink",
  "config": {
    "connector.class": "io.debezium.connector.jdbc.JdbcSinkConnector",
    "tasks.max": "1",
    "connection.url": "jdbc:mysql://mysql:3306/polls",
    "connection.username": "mysqluser",
    "connection.password": "mysqlpw",
    "insert.mode": "upsert",
    "delete.enabled": "true",
    "primary.key.mode": "record_key",
    "schema.evolution": "basic",
    "collection.name.format": "${source.table}",
    "topics": "polls.public.voteentity,polls.public.optionentity,polls.public.pollentity,polls.public.pollentity_optionentity"
  }
}

In this configuration, insert.mode is set to upsert, ensuring new inserts and updates are reflected. Deletes are also enabled, as Hibernate performs deletes when updating collections of votes. The ${source.table} placeholder ensures each target table in MySQL matches its corresponding PostgreSQL table name.

With just two simple configurations, Debezium ensures continuous data replication between two different database vendors.
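Both connectors are registered through Kafka Connect's REST API. Assuming Connect listens on localhost:8083 and the configurations above are stored in JSON files (the file names here are illustrative), registration might look like:

```shell
# Register the source and sink connectors with Kafka Connect
curl -X POST -H "Content-Type: application/json" \
     --data @postgres-source.json http://localhost:8083/connectors

curl -X POST -H "Content-Type: application/json" \
     --data @mysql-sink.json http://localhost:8083/connectors

# Check that both connectors and their tasks are running
curl http://localhost:8083/connectors?expand=status
```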

Replicating from PostgreSQL to QuestDB

One of the advantages of the CQRS design pattern is the ability to use specialized databases for aggregating and reading data, which can optimize queries far better than general-purpose databases. If such a database doesn't work with the Debezium JDBC sink connector, and it's not MongoDB, it might seem like you're out of luck. Fortunately, the popularity of the Kafka Connect platform means many database vendors provide their own sink connectors.

In our second example, we'll replicate data from PostgreSQL into QuestDB, a high-performance time-series database optimized for analytical queries. To keep our demo simple and unified, we won't use any specific QuestDB features or time-related charts (e.g., vote evolution over time), but you can easily imagine extending the reader microservice to provide such plots. We cannot use the Debezium JDBC sink connector here, because QuestDB ingests data via the InfluxDB Line Protocol rather than JDBC inserts. However, QuestDB provides its own sink connector for Kafka Connect, which works well with the Debezium event format.

The configuration of the PostgreSQL source connector remains the same:

{
  "name": "postgres-source",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "tasks.max": "1",
    "database.hostname": "postgres",
    "database.port": "5432",
    "database.user": "postgres",
    "database.password": "postgres",
    "database.dbname": "postgres",
    "topic.prefix": "polls",
    "table.include.list": "public.optionentity,public.pollentity,public.pollentity_optionentity,public.pollentity_voteentity,public.voteentity"
  }
}

QuestDB’s sink connector behaves slightly differently from Debezium's sink connectors: it maps every field of the record value to a table column and does not understand Debezium's change-event envelope. Thus, as a first step, we flatten the events so that they contain only the new row state. Debezium conveniently provides the New Record State Extraction SMT (ExtractNewRecordState), which does exactly this.

The configuration for the QuestDB sink would look like this:

{
  "name": "questdb-sink",
  "config": {
    "connector.class": "io.questdb.kafka.QuestDBSinkConnector",
    "tasks.max": "5",
    "topics": "optionentity,pollentity,pollentity_optionentity,voteentity",
    "table": "${topic}",
    "client.conf.string": "http::addr=questdb:9000;",
    "timestamp.field.name": "ts",
    "include.key": false,
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": true,
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "transforms": "unwrap",
    "transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState"
  }
}

This setup is quite similar to the JDBC sink connector, with the additional SMT configuration for convenience and simplicity.
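On the read side, the reader microservice can then run aggregate queries against QuestDB. A query counting votes per option might look like this (the column names are illustrative, not taken from the demo schema):

```sql
-- Count votes per poll option (illustrative schema)
SELECT optionentity_id, count() AS votes
FROM voteentity
GROUP BY optionentity_id;
```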

Application Point of View

We have demonstrated several ways to replicate data between database instances as a main building block for implementing the CQRS pattern. Regardless of the setup chosen, the end user experience should always be the same, displaying something like this when viewing poll results:

The results plot should automatically update as votes are cast. The writer microservice, which accepts votes, doesn't need any changes when the backend databases are swapped. The reader microservice, which displays results, does require redeployment when the replica database changes, but the changes are typically limited to the database URL and connection credentials. (For QuestDB, a small adjustment to the SQL query may also be needed, because our simplified deployment merges several tables.) All of these aspects can be parametrized and managed with Quarkus profiles, so no changes to the application source code are necessary: recompiling the reader application with the appropriate Quarkus profile and redeploying the microservice is enough.

This nicely illustrates another advantage of CQRS: decoupling the reader and writer parts into separate applications or microservices allows backend changes and independent evolution of data models without tight coordination, giving developer teams much more freedom and flexibility.
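As a sketch, such profile-specific configuration could live in the Quarkus application.properties file (the property values, host names, and profile name below are illustrative; QuestDB's PostgreSQL wire protocol listens on port 8812 by default):

```properties
# Default profile: read from the PostgreSQL replica
quarkus.datasource.jdbc.url=jdbc:postgresql://postgres-replica:5432/postgres
quarkus.datasource.username=postgres

# "questdb" profile: read from QuestDB via its PostgreSQL wire protocol
%questdb.quarkus.datasource.jdbc.url=jdbc:postgresql://questdb:8812/qdb
%questdb.quarkus.datasource.username=admin
```

Building with `-Dquarkus.profile=questdb` then produces a reader that talks to QuestDB without any source-code change.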

Summary

The CQRS design pattern is a powerful approach that advocates splitting an application into distinct parts: one for manipulating database data (writes) and another for reading that data (queries). This separation enables the use of different databases for each microservice, allowing for independent scaling and tuning tailored to their specific needs and workloads. This segregation also supports different data models for each microservice, facilitating fine-tuning and independent evolution. Other benefits include increased security (only write microservices need write access), improved code clarity and simplicity, and more.

In this blog post, besides introducing the pattern, we focused on the main implementation hurdle: data replication between databases. We discussed several possible architectures and described methods to ensure data replication in each setup. We began with simple database-native replication, progressed to replication between two different relational databases, and concluded with a setup where a relational database replicates to a specialized database.

We showed how Debezium effectively tackles this problem, keeping the setup and configuration simple and declarative. It enables real-time data replication between heterogeneous databases, regardless of vendor differences. This flexibility makes Debezium a perfect fit for implementing CQRS in complex, polyglot environments.

About the Author

Vojtěch Juránek is a software engineer at Red Hat and lives in the Czech Republic.


About Debezium

Debezium is an open-source distributed platform that transforms your existing databases into event streams, allowing applications to react almost instantly to every committed row-level change. Built on Kafka and providing Kafka Connect compatible connectors, Debezium monitors specific database management systems. It records the history of data changes in Kafka logs, ensuring applications can stop and restart at any time, consuming all missed events correctly and completely. Debezium is open source under the Apache License, Version 2.0.

Get Involved

We hope you find Debezium interesting and useful. Give it a try! Follow us on Twitter @debezium, chat with us on Zulip, or join our mailing list to connect with the community. All code is open source on GitHub; build it locally and help us improve existing connectors or add new ones. If you find problems or have ideas for improving Debezium, please log an issue.