Integrate Debezium with TimescaleDB
Capture and stream database changes in real time for event-driven architecture
Debezium is an open-source distributed platform for change data capture (CDC). It enables you to capture changes in a self-hosted TimescaleDB instance and stream them to other systems in real time.
Debezium can capture events about:
- Hypertables: captured events are rerouted from their chunk-specific topics to a single logical topic named according to the following pattern: <topic.prefix>.<hypertable-schema-name>.<hypertable-name>
- Continuous aggregates: captured events are rerouted from their chunk-specific topics to a single logical topic named according to the following pattern: <topic.prefix>.<aggregate-schema-name>.<aggregate-name>
- Hypercore: if you enable hypercore, the Debezium TimescaleDB connector does not apply any special processing to data in the columnstore. Compressed chunks are forwarded unchanged to the next downstream job in the pipeline for further processing as needed. Typically, messages with compressed chunks are dropped, and are not processed by subsequent jobs in the pipeline.
  This limitation only affects changes to chunks in the columnstore. Changes to data in the rowstore work correctly.
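As a rough illustration of the naming pattern for hypertables and continuous aggregates, the sketch below assembles a logical topic name from its three parts. The function name `logical_topic` is a hypothetical helper for this example, not part of Debezium; the values passed in match the `accounts` prefix and `public.accounts` hypertable used later on this page.

```shell
#!/bin/sh
# Hypothetical helper: build the logical topic name that rerouted events land on.
#   $1 = topic.prefix configured on the connector
#   $2 = schema of the hypertable or continuous aggregate
#   $3 = name of the hypertable or continuous aggregate
logical_topic() {
  printf '%s.%s.%s\n' "$1" "$2" "$3"
}

# With topic.prefix "accounts" and the hypertable public.accounts:
logical_topic accounts public accounts
# prints accounts.public.accounts
```

Knowing the name in advance is mainly useful when you subscribe a consumer to the topic before the first event arrives.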
This page explains how to capture changes in your database and stream them using Debezium on Apache Kafka.
Prerequisites
To follow the steps on this page:
- Create a target self-hosted TimescaleDB instance. You need your connection details.
- Install Docker on your development machine.
Configure your database to work with Debezium
To set up self-hosted TimescaleDB to communicate with Debezium:
- Configure your self-hosted PostgreSQL deployment
  - Open postgresql.conf. The PostgreSQL configuration files are usually located in:
    - Docker: /home/postgres/pgdata/data/
    - Linux: /etc/postgresql/<version>/main/ or /var/lib/pgsql/<version>/data/
    - macOS: /opt/homebrew/var/postgresql@<version>/
    - Windows: C:\Program Files\PostgreSQL\<version>\data\
  - Enable logical replication. Modify the following settings in postgresql.conf:
        wal_level = logical
        max_replication_slots = 10
        max_wal_senders = 10
  - Open pg_hba.conf and enable host replication. To allow replication connections, add the following:
        local replication debezium trust
    This permission is for the debezium PostgreSQL user running on a local or Docker deployment. For more about replication permissions, see Configuring PostgreSQL to allow replication with the Debezium connector host.
  - Restart PostgreSQL.
- Connect to your self-hosted TimescaleDB instance
  Use psql.
- Create a Debezium user in PostgreSQL
  Create a user with the LOGIN and REPLICATION permissions:
      CREATE ROLE debezium WITH LOGIN REPLICATION PASSWORD '<debeziumpassword>';
- Enable a replication slot for Debezium
  - Create a hypertable for Debezium to listen to:
        CREATE TABLE accounts (
            created_at TIMESTAMPTZ DEFAULT NOW(),
            name TEXT,
            city TEXT
        ) WITH (tsdb.hypertable);
    Debezium also works with continuous aggregates.
  - Create a publication and enable a replication slot:
        CREATE PUBLICATION dbz_publication FOR ALL TABLES WITH (publish = 'insert, update');
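Before moving on to Kafka, it can help to confirm that the postgresql.conf edits from the first step were saved. The following is a minimal sketch, assuming the three settings are written explicitly in the file as shown on this page. `check_replication_conf` is a hypothetical helper name, and the check is only a text match: it does not account for settings overridden elsewhere, so running `SHOW wal_level;` in psql after the restart remains the authoritative check.

```shell
#!/bin/sh
# Hypothetical helper: check that a postgresql.conf file contains the
# logical replication settings described in this section.
# Usage: check_replication_conf /path/to/postgresql.conf
check_replication_conf() {
  conf_file="$1"
  missing=0

  # wal_level must be set to logical on an uncommented line.
  if ! grep -Eq "^[[:space:]]*wal_level[[:space:]]*=[[:space:]]*'?logical'?" "$conf_file"; then
    echo "wal_level is not set to logical"
    missing=1
  fi

  # Both limits must be set explicitly to a positive number.
  for key in max_replication_slots max_wal_senders; do
    if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=[[:space:]]*[1-9][0-9]*" "$conf_file"; then
      echo "${key} is not set"
      missing=1
    fi
  done

  # Exit status 0 means all three settings were found.
  return "$missing"
}
```

The function prints one line per missing setting and returns a non-zero status if any are absent, so it can gate a setup script with `check_replication_conf "$PGDATA/postgresql.conf" || exit 1`, where `$PGDATA` stands for whichever configuration directory applies to your platform.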
Configure Debezium to work with your database
Set up Kafka Connect server, plugins, drivers, and connectors:
- Run Zookeeper in Docker
  In another Terminal window, run the following command:
      docker run -it --rm --name zookeeper -p 2181:2181 -p 2888:2888 -p 3888:3888 quay.io/debezium/zookeeper:3.0
  Check the output log to see that Zookeeper is running.
- Run Kafka in Docker
  In another Terminal window, run the following command:
      docker run -it --rm --name kafka -p 9092:9092 --link zookeeper:zookeeper quay.io/debezium/kafka:3.0
  Check the output log to see that Kafka is running.
- Run Kafka Connect in Docker
  In another Terminal window, run the following command:
      docker run -it --rm --name connect \
        -p 8083:8083 \
        -e GROUP_ID=1 \
        -e CONFIG_STORAGE_TOPIC=accounts \
        -e OFFSET_STORAGE_TOPIC=offsets \
        -e STATUS_STORAGE_TOPIC=storage \
        --link kafka:kafka \
        --link timescaledb:timescaledb \
        quay.io/debezium/connect:3.0
  Check the output log to see that Kafka Connect is running.
- Register the Debezium PostgreSQL source connector
  Update the <properties> for the <debezium-user> you created in your self-hosted TimescaleDB instance in the following command. Then run the command in another Terminal window:
      curl -X POST http://localhost:8083/connectors \
        -H "Content-Type: application/json" \
        -d '{
          "name": "timescaledb-connector",
          "config": {
            "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
            "database.hostname": "timescaledb",
            "database.port": "5432",
            "database.user": "<debezium-user>",
            "database.password": "<debezium-password>",
            "database.dbname" : "postgres",
            "topic.prefix": "accounts",
            "plugin.name": "pgoutput",
            "schema.include.list": "public,_timescaledb_internal",
            "transforms": "timescaledb",
            "transforms.timescaledb.type": "io.debezium.connector.postgresql.transforms.timescaledb.TimescaleDb",
            "transforms.timescaledb.database.hostname": "timescaledb",
            "transforms.timescaledb.database.port": "5432",
            "transforms.timescaledb.database.user": "<debezium-user>",
            "transforms.timescaledb.database.password": "<debezium-password>",
            "transforms.timescaledb.database.dbname": "postgres"
          }
        }'
- Verify timescaledb-connector is included in the connector list
  - Check the tasks associated with timescaledb-connector:
        curl -i -X GET -H "Accept:application/json" localhost:8083/connectors/timescaledb-connector
    You see something like:
        {"name":"timescaledb-connector","config":{ "connector.class":"io.debezium.connector.postgresql.PostgresConnector","transforms.timescaledb.database.hostname":"timescaledb","transforms.timescaledb.database.password":"debeziumpassword","database.user":"debezium","database.dbname":"postgres","transforms.timescaledb.database.dbname":"postgres","transforms.timescaledb.database.user":"debezium","transforms.timescaledb.type":"io.debezium.connector.postgresql.transforms.timescaledb.TimescaleDb","transforms.timescaledb.database.port":"5432","transforms":"timescaledb","schema.include.list":"public,_timescaledb_internal","database.port":"5432","plugin.name":"pgoutput","topic.prefix":"accounts","database.hostname":"timescaledb","database.password":"debeziumpassword","name":"timescaledb-connector"},"tasks":[{"connector":"timescaledb-connector","task":0}],"type":"source"}
- Verify timescaledb-connector is running
  - Open the Terminal window running Kafka Connect. When the connector is active, you see something like the following:
        2025-04-30 10:40:15,168 INFO Postgres|accounts|streaming REPLICA IDENTITY for '_timescaledb_internal._hyper_1_1_chunk' is 'DEFAULT'; UPDATE and DELETE events will contain previous values only for PK columns [io.debezium.connector.postgresql.PostgresSchema]
        2025-04-30 10:40:15,168 INFO Postgres|accounts|streaming REPLICA IDENTITY for '_timescaledb_internal.bgw_job_stat' is 'DEFAULT'; UPDATE and DELETE events will contain previous values only for PK columns [io.debezium.connector.postgresql.PostgresSchema]
        2025-04-30 10:40:15,175 INFO Postgres|accounts|streaming SignalProcessor started. Scheduling it every 5000ms [io.debezium.pipeline.signal.SignalProcessor]
        2025-04-30 10:40:15,175 INFO Postgres|accounts|streaming Creating thread debezium-postgresconnector-accounts-SignalProcessor [io.debezium.util.Threads]
        2025-04-30 10:40:15,175 INFO Postgres|accounts|streaming Starting streaming [io.debezium.pipeline.ChangeEventSourceCoordinator]
        2025-04-30 10:40:15,176 INFO Postgres|accounts|streaming Retrieved latest position from stored offset 'LSN{0/1FCE570}' [io.debezium.connector.postgresql.PostgresStreamingChangeEventSource]
        2025-04-30 10:40:15,176 INFO Postgres|accounts|streaming Looking for WAL restart position for last commit LSN 'null' and last change LSN 'LSN{0/1FCE570}' [io.debezium.connector.postgresql.connection.WalPositionLocator]
        2025-04-30 10:40:15,176 INFO Postgres|accounts|streaming Initializing PgOutput logical decoder publication [io.debezium.connector.postgresql.connection.PostgresReplicationConnection]
        2025-04-30 10:40:15,189 INFO Postgres|accounts|streaming Obtained valid replication slot ReplicationSlot [active=false, latestFlushedLsn=LSN{0/1FCCFF0}, catalogXmin=884] [io.debezium.connector.postgresql.connection.PostgresConnection]
        2025-04-30 10:40:15,189 INFO Postgres|accounts|streaming Connection gracefully closed [io.debezium.jdbc.JdbcConnection]
        2025-04-30 10:40:15,204 INFO Postgres|accounts|streaming Requested thread factory for component PostgresConnector, id = accounts named = keep-alive [io.debezium.util.Threads]
        2025-04-30 10:40:15,204 INFO Postgres|accounts|streaming Creating thread debezium-postgresconnector-accounts-keep-alive [io.debezium.util.Threads]
        2025-04-30 10:40:15,216 INFO Postgres|accounts|streaming REPLICA IDENTITY for '_timescaledb_internal.bgw_policy_chunk_stats' is 'DEFAULT'; UPDATE and DELETE events will contain previous values only for PK columns [io.debezium.connector.postgresql.PostgresSchema]
        2025-04-30 10:40:15,216 INFO Postgres|accounts|streaming REPLICA IDENTITY for 'public.accounts' is 'DEFAULT'; UPDATE and DELETE events will contain previous values only for PK columns [io.debezium.connector.postgresql.PostgresSchema]
        2025-04-30 10:40:15,217 INFO Postgres|accounts|streaming REPLICA IDENTITY for '_timescaledb_internal.bgw_job_stat_history' is 'DEFAULT'; UPDATE and DELETE events will contain previous values only for PK columns [io.debezium.connector.postgresql.PostgresSchema]
        2025-04-30 10:40:15,217 INFO Postgres|accounts|streaming REPLICA IDENTITY for '_timescaledb_internal._hyper_1_1_chunk' is 'DEFAULT'; UPDATE and DELETE events will contain previous values only for PK columns [io.debezium.connector.postgresql.PostgresSchema]
        2025-04-30 10:40:15,217 INFO Postgres|accounts|streaming REPLICA IDENTITY for '_timescaledb_internal.bgw_job_stat' is 'DEFAULT'; UPDATE and DELETE events will contain previous values only for PK columns [io.debezium.connector.postgresql.PostgresSchema]
        2025-04-30 10:40:15,219 INFO Postgres|accounts|streaming Processing messages [io.debezium.connector.postgresql.PostgresStreamingChangeEventSource]
  - Watch the events in the accounts topic on your self-hosted TimescaleDB instance.
    In another Terminal instance, run the following command:
        docker run -it --rm --name watcher --link zookeeper:zookeeper --link kafka:kafka quay.io/debezium/kafka:3.0 watch-topic -a -k accounts
    You see the topics being streamed. For example:
        status-task-timescaledb-connector-0 {"state":"RUNNING","trace":null,"worker_id":"172.17.0.5:8083","generation":31}
        status-topic-timescaledb.public.accounts:connector-timescaledb-connector {"topic":{"name":"timescaledb.public.accounts","connector":"timescaledb-connector","task":0,"discoverTimestamp":1746009337985}}
        status-topic-accounts._timescaledb_internal.bgw_job_stat:connector-timescaledb-connector {"topic":{"name":"accounts._timescaledb_internal.bgw_job_stat","connector":"timescaledb-connector","task":0,"discoverTimestamp":1746009338118}}
        status-topic-accounts._timescaledb_internal.bgw_job_stat:connector-timescaledb-connector {"topic":{"name":"accounts._timescaledb_internal.bgw_job_stat","connector":"timescaledb-connector","task":0,"discoverTimestamp":1746009338120}}
        status-topic-accounts._timescaledb_internal.bgw_job_stat_history:connector-timescaledb-connector {"topic":{"name":"accounts._timescaledb_internal.bgw_job_stat_history","connector":"timescaledb-connector","task":0,"discoverTimestamp":1746009338243}}
        status-topic-accounts._timescaledb_internal.bgw_job_stat_history:connector-timescaledb-connector {"topic":{"name":"accounts._timescaledb_internal.bgw_job_stat_history","connector":"timescaledb-connector","task":0,"discoverTimestamp":1746009338245}}
        status-topic-accounts.public.accounts:connector-timescaledb-connector {"topic":{"name":"accounts.public.accounts","connector":"timescaledb-connector","task":0,"discoverTimestamp":1746009338250}}
        status-topic-accounts.public.accounts:connector-timescaledb-connector {"topic":{"name":"accounts.public.accounts","connector":"timescaledb-connector","task":0,"discoverTimestamp":1746009338251}}
        status-topic-accounts.public.accounts:connector-timescaledb-connector {"topic":{"name":"accounts.public.accounts","connector":"timescaledb-connector","task":0,"discoverTimestamp":1746009338251}}
        status-topic-accounts.public.accounts:connector-timescaledb-connector {"topic":{"name":"accounts.public.accounts","connector":"timescaledb-connector","task":0,"discoverTimestamp":1746009338251}}
        status-topic-accounts.public.accounts:connector-timescaledb-connector {"topic":{"name":"accounts.public.accounts","connector":"timescaledb-connector","task":0,"discoverTimestamp":1746009338251}}
        ["timescaledb-connector",{"server":"accounts"}] {"last_snapshot_record":true,"lsn":33351024,"txId":893,"ts_usec":1746009337290783,"snapshot":"INITIAL","snapshot_completed":true}
        status-connector-timescaledb-connector {"state":"UNASSIGNED","trace":null,"worker_id":"172.17.0.5:8083","generation":31}
        status-task-timescaledb-connector-0 {"state":"UNASSIGNED","trace":null,"worker_id":"172.17.0.5:8083","generation":31}
        status-connector-timescaledb-connector {"state":"RUNNING","trace":null,"worker_id":"172.17.0.5:8083","generation":33}
        status-task-timescaledb-connector-0 {"state":"RUNNING","trace":null,"worker_id":"172.17.0.5:8083","generation":33}
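Reading the Kafka Connect log by eye works for a one-off check. For a scripted check, the Kafka Connect REST API also exposes a status document at `/connectors/<name>/status`, which this page does not otherwise use. The sketch below extracts the connector-level state from such a document with standard text tools. `connector_state` is a hypothetical helper name, and the sample document is an illustrative value in the shape that endpoint returns, not output captured from this setup.

```shell
#!/bin/sh
# Hypothetical helper: read a Kafka Connect status document on stdin and
# print the first "state" value, which is the connector-level state.
connector_state() {
  grep -o '"state"[[:space:]]*:[[:space:]]*"[A-Z_]*"' \
    | head -n 1 \
    | sed 's/.*"\([A-Z_]*\)"$/\1/'
}

# Illustrative sample document. Against a live worker, pipe in the output of:
#   curl -s http://localhost:8083/connectors/timescaledb-connector/status
sample='{"name":"timescaledb-connector","connector":{"state":"RUNNING","worker_id":"172.17.0.5:8083"},"tasks":[{"id":0,"state":"RUNNING","worker_id":"172.17.0.5:8083"}],"type":"source"}'

printf '%s\n' "$sample" | connector_state
# prints RUNNING
```

The text match keeps the example free of extra dependencies. If a JSON parser is available on your machine, using it to read the state is the more robust choice.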
And that is it. You have configured Debezium to interact with TimescaleDB.