Strimzi and Debezium deployment in Kubernetes

I first tried this about three years ago: deploying and running Apache Kafka in a Kubernetes cluster and using Debezium for CDC (change data capture) from an Oracle database. At that time most of the Kubernetes APIs were still in beta and Debezium was at something like version 0.1 beta. There was no Kafka operator for Kubernetes, so I had to do every Kafka-related deployment by hand. Kafka is a stateful application, and without an operator it is very difficult to manage on Kubernetes, which is geared towards stateless workloads. Debezium support for Oracle was also at the alpha stage, and I failed to connect to the Oracle database with the Debezium connector because of many bugs in the connector plugin. I followed the Debezium mailing list to track updates and the maturity of the project so that I could try again later. Now its latest version is 1.6 and its support for Oracle DB is much more stable.

Last week I tried this setup again and found the Strimzi project, a Kafka operator for Kubernetes. Its latest version is 0.25.0. Although it is still at version 0.x, it worked without problems in my setup. Here I will show how easy it is to deploy Kafka in Kubernetes using Strimzi and to set up and configure CDC using Debezium.

You need the usual Kubernetes utilities such as kubectl and helm, and an Oracle database with LogMiner or XStream support. If you would like to start with a minimal setup, you can use minikube or kind. Here are the steps you need to follow.

  1. Prepare Docker image for Debezium Oracle connector
  2. Set up helm repository and pull helm chart to get sample values.yaml
  3. Update Strimzi helm values
  4. Deploy Strimzi operator and Kafka cluster with updated values
  5. Deploy Debezium Oracle connector plugin cluster
  6. Prepare Oracle database
  7. Apply Kafka connector configuration for Debezium Oracle connector
  8. Verify and check data stream

Prepare Docker image for Debezium Oracle connector

The Debezium Oracle connector needs the Oracle JDBC driver and XStream jars (ojdbc8.jar and xstreams.jar) from the Oracle database client library. You need to get them from the Oracle download site. Here is the Dockerfile content to build the Debezium Oracle connector container image.

FROM quay.io/strimzi/kafka:0.25.0-kafka-2.8.0

ENV KAFKA_HOME=/opt/kafka

USER root:root

RUN cd /tmp; \
    curl -LO https://download.oracle.com/otn_software/linux/instantclient/213000/instantclient-basic-linux.x64-21.3.0.0.0.zip; \
    unzip instantclient-basic-linux.x64-21.3.0.0.0.zip; \
    cp instantclient_21_3/* ${KAFKA_HOME}/libs; \
    rm -rf instantclient_21_3; \
    rm instantclient-basic-linux.x64-21.3.0.0.0.zip

##########
# Connector plugin debezium-oracle-connect
##########
RUN mkdir -p ${KAFKA_HOME}/plugins/debezium-oracle-connect/deaf1cc4 \
        && curl -L --output ${KAFKA_HOME}/plugins/debezium-oracle-connect/deaf1cc4.tgz https://repo1.maven.org/maven2/io/debezium/debezium-connector-oracle/1.6.1.Final/debezium-connector-oracle-1.6.1.Final-plugin.tar.gz \
        && echo "fe5eb4d0dda150b10d24a6d9f3a631c493267a0dee2d72167a8841af5804c43a908d149e5bc4a87dc48f0747e26a1d85a930225eea7b9709726ea2abda95b487 ${KAFKA_HOME}/plugins/debezium-oracle-connect/deaf1cc4.tgz" > ${KAFKA_HOME}/plugins/debezium-oracle-connect/deaf1cc4.tgz.sha512 \
        && sha512sum --check ${KAFKA_HOME}/plugins/debezium-oracle-connect/deaf1cc4.tgz.sha512 \
        && rm -f ${KAFKA_HOME}/plugins/debezium-oracle-connect/deaf1cc4.tgz.sha512 \
        && tar xvfz ${KAFKA_HOME}/plugins/debezium-oracle-connect/deaf1cc4.tgz -C ${KAFKA_HOME}/plugins/debezium-oracle-connect/deaf1cc4 \
        && rm -vf ${KAFKA_HOME}/plugins/debezium-oracle-connect/deaf1cc4.tgz

USER 1001

Or you can use the prebuilt image from here: registry.gitlab.com/herzcthu/debezium-oracle:1.6.1
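
If you build the image yourself, the build-and-push step could look roughly like the sketch below. The registry host and repository (registry.example.com/myteam) are placeholders and not values from this post. It also assumes the Dockerfile above sits in the current directory and that you are already logged in to your registry.

```shell
# Placeholder registry coordinates (assumptions); replace with your own.
REGISTRY="registry.example.com"
REPOSITORY="myteam"
IMAGE_NAME="debezium-oracle"
IMAGE_TAG="1.6.1"
IMAGE_REF="${REGISTRY}/${REPOSITORY}/${IMAGE_NAME}:${IMAGE_TAG}"

# Build the image from the Dockerfile in the current directory, then push it.
build_and_push() {
    docker build -t "${IMAGE_REF}" . && docker push "${IMAGE_REF}"
}

# Usage: build_and_push
```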

Set up helm repository and pull helm chart to get sample values.yaml

helm repo add strimzi https://strimzi.io/charts/
helm pull strimzi/strimzi-kafka-operator --version 0.25.0
tar -zxvf strimzi-kafka-operator-helm-3-chart-0.25.0.tgz
code strimzi-kafka-operator/values.yaml

Update Strimzi helm values

At line number 5, update this to watch Kafka deployments in the "kafka" namespace

watchNamespaces:
  - kafka

At lines 47 to 50, update the values to use the image built in the previous step

kafkaConnect:
  image:
    registry: registry.gitlab.com
    repository: herzcthu
    name: debezium-oracle
    tag: 1.6.1


Deploy Strimzi operator and Kafka cluster with updated values

Deploy the Strimzi operator

kubectl create ns strimzi
kubectl create ns kafka
helm install strimzi strimzi/strimzi-kafka-operator -n strimzi -f values.yaml
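
Before creating the Kafka cluster it can help to confirm that the operator is actually running. The helper below is a small sketch, not part of the original setup. It assumes the chart creates a Deployment named strimzi-cluster-operator in the strimzi namespace, so adjust the name if your release differs.

```shell
# Wait for the operator Deployment to finish rolling out, then list its pods.
# The Deployment name strimzi-cluster-operator is an assumption about the chart.
check_operator() {
    kubectl rollout status deployment/strimzi-cluster-operator -n strimzi --timeout=300s \
        && kubectl get pods -n strimzi
}

# Usage: check_operator
```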

Download deployment file sample from https://strimzi.io/examples/latest/kafka/kafka-persistent-single.yaml

Change replicas to 3 and adjust the other parameters according to your requirements

kubectl apply -f kafka-persistent-single.yaml -n kafka
kubectl wait kafka/my-cluster --for=condition=Ready --timeout=300s -n kafka 

Deploy Debezium Oracle connector plugin cluster

If you are using a private repository for the Debezium Oracle connector image, you need to create the container registry credentials as a Kubernetes secret.

kubectl -nkafka create secret docker-registry regcred --docker-server=<your-registry-server> --docker-username=<your-name> --docker-password=<your-pword> --docker-email=<your-email>

Connector deployment file

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  name: my-connect-cluster
  annotations:
    # use-connector-resources configures this KafkaConnect
    # to use KafkaConnector resources to avoid
    # needing to call the Connect REST API directly
    strimzi.io/use-connector-resources: "true"
spec:
  version: 2.8.0
  replicas: 1
  image: "registry.gitlab.com/herzcthu/debezium-oracle:1.6.1"
  template:
    pod:
      imagePullSecrets:
        - name: regcred
  bootstrapServers: my-cluster-kafka-bootstrap:9093
  tls:
    trustedCertificates:
      - secretName: my-cluster-cluster-ca-cert
        certificate: ca.crt
  config:
    group.id: connect-cluster
    offset.storage.topic: connect-cluster-offsets
    config.storage.topic: connect-cluster-configs
    status.storage.topic: connect-cluster-status
    # -1 means it will use the default replication factor configured in the broker
    config.storage.replication.factor: -1
    offset.storage.replication.factor: -1
    status.storage.replication.factor: -1
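
The manifest still has to be applied to the cluster. A minimal sketch of that step follows, assuming the manifest is saved as kafka-connect.yaml (a file name I made up for this example) and that the KafkaConnect resource reports a Ready condition in the same way the Kafka resource does.

```shell
# Apply the KafkaConnect manifest and wait until Strimzi reports it Ready.
# $1: path to the manifest; kafka-connect.yaml is an assumed file name.
deploy_connect() {
    manifest="${1:-kafka-connect.yaml}"
    kubectl apply -f "${manifest}" -n kafka \
        && kubectl wait kafkaconnect/my-connect-cluster \
            --for=condition=Ready --timeout=300s -n kafka
}

# Usage: deploy_connect kafka-connect.yaml
```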

Prepare Oracle database

This is a job for an Oracle DBA. You can follow this documentation.

https://debezium.io/documentation/reference/connectors/oracle.html#_preparing_the_database

Apply Kafka connector configuration for Debezium Oracle connector

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnector
metadata:
  name: debezium-oracle-server1
  labels:
    # must match metadata.name of the KafkaConnect resource
    strimzi.io/cluster: my-connect-cluster
spec:
  class: io.debezium.connector.oracle.OracleConnector
  tasksMax: 1
  config:
    database.server.name: "server1"
    database.hostname: "192.168.1.1"
    database.port: "1521"
    database.user: "dbzuser"
    database.password: "dbz"
    database.dbname: "DBZDB"
    # broker port 9092 is plain text and 9093 is for SSL/TLS
    database.history.kafka.bootstrap.servers: "my-cluster-kafka-bootstrap:9092"
    database.history.kafka.topic: "schema-changes.collections"
    schema.include.list: "SCHEMA_NAME"
    table.include.list: "SCHEMA_NAME.TABLE_NAME"
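
Applying the connector configuration and looking at what the operator reports back could be done as sketched here. The file name oracle-connector.yaml is an assumption. The status section of the printed resource should show whether the connector and its task are running.

```shell
# Apply the KafkaConnector manifest and print the resource, including its status.
# $1: path to the manifest; oracle-connector.yaml is an assumed file name.
deploy_connector() {
    manifest="${1:-oracle-connector.yaml}"
    kubectl apply -f "${manifest}" -n kafka \
        && kubectl get kafkaconnector debezium-oracle-server1 -n kafka -o yaml
}

# Usage: deploy_connector oracle-connector.yaml
```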

Verify and check data stream

Check configuration

kubectl -n kafka run connect-cluster-configs -ti --image=quay.io/strimzi/kafka:0.25.0-kafka-2.8.0 --rm=true --restart=Never -- bin/kafka-console-consumer.sh --bootstrap-server my-cluster-kafka-bootstrap:9092 --topic connect-cluster-configs --from-beginning

Check connector status

kubectl -n kafka run connect-cluster-status -ti --image=quay.io/strimzi/kafka:0.25.0-kafka-2.8.0 --rm=true --restart=Never -- bin/kafka-console-consumer.sh --bootstrap-server my-cluster-kafka-bootstrap:9092 --topic connect-cluster-status --from-beginning

Check data streaming. The topic name format is SERVER_NAME.SCHEMA_NAME.TABLE_NAME, where SERVER_NAME is the value of database.server.name

kubectl -n kafka run kafka-consumer -ti --image=quay.io/strimzi/kafka:0.25.0-kafka-2.8.0 --rm=true --restart=Never -- bin/kafka-console-consumer.sh --bootstrap-server my-cluster-kafka-bootstrap:9092 --topic server1.SCHEMA_NAME.TABLE_NAME --from-beginning
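
If you capture more than one table, typing the topic name by hand gets tedious. The helpers below are a convenience sketch of my own: topic_name joins the three parts in the format described above, and consume_topic wraps the same consumer command for any topic.

```shell
# Build the change-event topic name from server name, schema and table.
topic_name() {
    printf '%s.%s.%s\n' "$1" "$2" "$3"
}

# Run a throwaway consumer pod against the given topic.
consume_topic() {
    kubectl -n kafka run kafka-consumer -ti \
        --image=quay.io/strimzi/kafka:0.25.0-kafka-2.8.0 --rm=true --restart=Never \
        -- bin/kafka-console-consumer.sh \
        --bootstrap-server my-cluster-kafka-bootstrap:9092 \
        --topic "$1" --from-beginning
}

# Usage: consume_topic "$(topic_name server1 SCHEMA_NAME TABLE_NAME)"
```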

References:

  • https://github.com/strimzi/strimzi-kafka-operator/tree/0.25.0/helm-charts/helm3/strimzi-kafka-operator
  • https://debezium.io/documentation/reference/connectors/oracle.html
