Kafka Deployment with Strimzi Operator and Envoy

This guide walks through the deployment of a production-ready Apache Kafka cluster on Kubernetes using the Strimzi Operator, complete with user authentication, RBAC permissions, and an Envoy proxy for external access.

Deliverables

  • High availability with 3 controllers and 3 brokers
  • User authentication with SCRAM-SHA-512
  • Fine-grained access control through ACLs
  • External access through an Envoy proxy
  • SSL/TLS is not set up, to keep this exercise simple; it will be covered in another blog post

Step 1: Install Strimzi Operator

First, install the Strimzi Kafka Operator using Helm:

helm repo add strimzi https://strimzi.io/charts/
helm repo update

helm install strimzi-kafka-operator strimzi/strimzi-kafka-operator \
  --namespace kafka \
  --create-namespace

 

This creates a dedicated kafka namespace and installs the Strimzi operator that will manage our Kafka resources.
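To confirm the operator came up before moving on (a quick optional check; exact resource names can vary slightly by chart version):

kubectl get deployments -n kafka
kubectl get pods -n kafka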

Step 2: Deploy Kafka Cluster

Install Custom Resource Definitions (CRDs)

Apply the necessary CRDs that define Kafka-related resources:

# Install all CRDs
kubectl apply -f https://raw.githubusercontent.com/strimzi/strimzi-kafka-operator/refs/heads/main/install/cluster-operator/040-Crd-kafka.yaml
kubectl apply -f https://raw.githubusercontent.com/strimzi/strimzi-kafka-operator/refs/heads/main/install/cluster-operator/04A-Crd-kafkanodepool.yaml
kubectl apply -f https://raw.githubusercontent.com/strimzi/strimzi-kafka-operator/refs/heads/main/install/cluster-operator/043-Crd-kafkatopic.yaml
kubectl apply -f https://raw.githubusercontent.com/strimzi/strimzi-kafka-operator/refs/heads/main/install/cluster-operator/044-Crd-kafkauser.yaml

 

Setup Kafka Node Pools

Create a file named 10-nodepools.yaml with the following content:

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaNodePool
metadata:
  name: controllers
  namespace: kafka
  labels:
    strimzi.io/cluster: mkbits-strimzi-cluster01
spec:
  replicas: 3
  roles:
    - controller
  storage:
    type: persistent-claim
    class: longhorn
    size: 10Gi
    deleteClaim: false
---
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaNodePool
metadata:
  name: brokers
  namespace: kafka
  labels:
    strimzi.io/cluster: mkbits-strimzi-cluster01
spec:
  replicas: 3
  roles:
    - broker
  storage:
    type: persistent-claim
    class: longhorn
    size: 20Gi
    deleteClaim: false

 

 

This creates:

  • 3 Kafka controller nodes with 10GB storage each
  • 3 Kafka broker nodes with 20GB storage each
  • Both pools use the longhorn storage class for persistence

 

Apply the node pools configuration:

kubectl apply -f 10-nodepools.yaml
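
The node pools alone will not start any pods yet; the brokers and controllers are created once the Kafka resource in the next step references them. You can still confirm the resources were accepted (optional check; assumes the Strimzi CRDs from Step 2 are installed):

kubectl get kafkanodepools -n kafka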

 

Create the Kafka Cluster

Create a file named 20-kafka.yaml with the following content:

apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: mkbits-strimzi-cluster01
  namespace: kafka
  annotations:
    strimzi.io/kraft: "enabled"
    strimzi.io/node-pools: "enabled"
spec:
  kafka:
    version: 3.9.0
    config:
      inter.broker.protocol.version: "3.9"
      log.message.format.version:  "3.9"
      offsets.topic.replication.factor: 3
      transaction.state.log.replication.factor: 3
      transaction.state.log.min.isr: 2
    listeners:
      - name: tls
        port: 9093
        type: internal
        tls: true
        authentication:
          type: scram-sha-512
      - name: plain
        port: 9092
        type: internal
        tls: false
        authentication:
          type: scram-sha-512
    authorization:
      type: simple
  entityOperator:
    topicOperator: {}
    userOperator: {}

Important Details:
  • Uses Kafka version 3.9.0 with KRaft mode enabled (no ZooKeeper)
  • Configures both TLS (9093) and plain (9092) internal listeners
  • Both listeners use SCRAM-SHA-512 authentication
  • Simple authorization is enabled for access control
  • Topic and User operators are enabled for managing topics and users

Apply the Kafka cluster configuration:

kubectl apply -f 20-kafka.yaml
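
The cluster can take a few minutes to come up. One convenient way to wait for it (optional; the timeout value is an arbitrary choice):

kubectl wait kafka/mkbits-strimzi-cluster01 --for=condition=Ready --timeout=600s -n kafka
kubectl get pods -n kafka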

 

Step 3: Configure Users and Permissions

User Creation

Create the following YAML files for different user configurations:

30-users.yaml:

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaUser
metadata:
  name: kafka-prod-user
  namespace: kafka
  labels:
    strimzi.io/cluster: mkbits-strimzi-cluster01
spec:
  authentication:
    type: scram-sha-512
  authorization:
    type: simple
    acls:
      - resource:
          type: topic
          name: prod_Topic01
          patternType: literal
        operation: All
      - resource:
          type: topic
          name: prod_Topic02
          patternType: literal
        operation: All
---   
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaUser
metadata:
  name: kafka-dev-user
  namespace: kafka
  labels:
    strimzi.io/cluster: mkbits-strimzi-cluster01
spec:
  authentication:
    type: scram-sha-512
  authorization:
    type: simple
    acls:
      - resource:
          type: topic
          name: dev_Topic01
          patternType: literal
        operation: All

Apply the user configuration:

kubectl apply -f 30-users.yaml
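
To confirm the users were reconciled and their credential secrets created (optional check; the secret names match the KafkaUser names):

kubectl get kafkausers -n kafka
kubectl get secrets -n kafka | grep -E 'kafka-(prod|dev)-user'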

 

Retrieving User Credentials

Strimzi stores user credentials in Kubernetes secrets. Retrieve them with:

kubectl get secret <username> -n kafka -o jsonpath="{.data.password}" | base64 --decode

 

Example:

kubectl get secret kafka-prod-user -n kafka -o jsonpath="{.data.password}" | base64 --decode

 

Step 4: Create Topics

Create a file named 40-KafkaTopic.yaml with the following content:

# 40-KafkaTopic.yaml
# Note: metadata.name must be a valid Kubernetes resource name (lowercase, no underscores),
# so the actual Kafka topic name goes in spec.topicName instead.
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: prod-topic01
  namespace: kafka
  labels:
    strimzi.io/cluster: mkbits-strimzi-cluster01
spec:
  topicName: prod_Topic01
  partitions: 6
  replicas: 3
---
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: prod-topic02
  namespace: kafka
  labels:
    strimzi.io/cluster: mkbits-strimzi-cluster01
spec:
  topicName: prod_Topic02
  partitions: 3
  replicas: 3
  config:
    cleanup.policy: delete             # ordinary log retention (default 7 days)
---
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: dev-topic01
  namespace: kafka
  labels:
    strimzi.io/cluster: mkbits-strimzi-cluster01
spec:
  topicName: dev_Topic01
  partitions: 3
  replicas: 3
  config:
    retention.ms: 86400000             # 1 day
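
Apply the topic configuration:

kubectl apply -f 40-KafkaTopic.yaml

Optionally, confirm the topics were created and marked ready by the Topic Operator:

kubectl get kafkatopics -n kafka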

 

Step 5: Deploy Envoy as a Kafka-Aware Proxy

Envoy serves as a protocol-aware proxy for Kafka, enabling:

  • Centralized connection handling
  • Reduced NAT complexity
  • External access to the Kafka cluster
  • Advanced routing and observability

Understanding Kafka DNS in Kubernetes

Strimzi creates headless services for Kafka brokers. In Kubernetes, pod DNS follows this format:

<pod-name>.<headless-service>.<namespace>.svc.cluster.local

 

For our Strimzi deployment, the elements are:

  • Pod name: <cluster>-<pool>-<ordinal> (example: mkbits-strimzi-cluster01-brokers-0)
  • Headless service: <cluster>-kafka-brokers (example: mkbits-strimzi-cluster01-kafka-brokers)

This gives us the following broker FQDNs:

mkbits-strimzi-cluster01-brokers-0.mkbits-strimzi-cluster01-kafka-brokers.kafka.svc.cluster.local
mkbits-strimzi-cluster01-brokers-1.mkbits-strimzi-cluster01-kafka-brokers.kafka.svc.cluster.local
mkbits-strimzi-cluster01-brokers-2.mkbits-strimzi-cluster01-kafka-brokers.kafka.svc.cluster.local
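
If you want to confirm these records resolve inside the cluster, a throwaway pod works (a sketch; the busybox image tag is an arbitrary choice):

kubectl run dns-test -n kafka --rm -it --restart=Never --image=busybox:1.36 -- \
  nslookup mkbits-strimzi-cluster01-brokers-0.mkbits-strimzi-cluster01-kafka-brokers.kafka.svc.cluster.local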

 

Creating Envoy Configuration

Create a file named envoy-config.yaml:

apiVersion: v1
kind: ConfigMap
metadata:
  name: envoy-config
  namespace: kafka
data:
  envoy.yaml: |
    static_resources:
      listeners:
        - name: kafka_listener
          address:
            socket_address:
              address: 0.0.0.0
              port_value: 9094
          filter_chains:
            - filters:
                - name: envoy.filters.network.kafka_broker
                  typed_config:
                    "@type": type.googleapis.com/envoy.extensions.filters.network.kafka_broker.v3.KafkaBroker
                    stat_prefix: kafka
                    id_based_broker_address_rewrite_spec:
                      rules:
                        - id: 0
                          host: kafka-prod-eastus01.multicastbits.com
                          port: 9094
                        - id: 1
                          host: kafka-prod-eastus01.multicastbits.com
                          port: 9094
                        - id: 2
                          host: kafka-prod-eastus01.multicastbits.com
                          port: 9094
                - name: envoy.filters.network.tcp_proxy
                  typed_config:
                    "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
                    stat_prefix: tcp
                    cluster: kafka_cluster
      clusters:
        - name: kafka_cluster
          connect_timeout: 1s
          type: strict_dns
          lb_policy: round_robin
          load_assignment:
            cluster_name: kafka_cluster
            endpoints:
              - lb_endpoints:
                  - endpoint:
                      address:
                        socket_address:
                          address: mkbits-strimzi-cluster01-brokers-0.mkbits-strimzi-cluster01-kafka-brokers.kafka.svc.cluster.local
                          port_value: 9092
                  - endpoint:
                      address:
                        socket_address:
                          address: mkbits-strimzi-cluster01-brokers-1.mkbits-strimzi-cluster01-kafka-brokers.kafka.svc.cluster.local
                          port_value: 9092
                  - endpoint:
                      address:
                        socket_address:
                          address: mkbits-strimzi-cluster01-brokers-2.mkbits-strimzi-cluster01-kafka-brokers.kafka.svc.cluster.local
                          port_value: 9092
    admin:
      access_log_path: /dev/null
      address:
        socket_address:
          address: 0.0.0.0
          port_value: 9901

Key Configuration Points:

  • Exposes an admin interface on port 9901
  • Listens on port 9094 for Kafka traffic
  • Uses the Kafka broker filter to rewrite broker addresses to an external hostname
  • Establishes upstream connections to all Kafka brokers on port 9092

Apply the ConfigMap:

kubectl apply -f envoy-config.yaml

 

Deploying Envoy

Create a file named envoy-deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: envoy
  namespace: kafka
spec:
  replicas: 1
  selector:
    matchLabels:
      app: envoy
  template:
    metadata:
      labels:
        app: envoy
    spec:
      containers:
        - name: envoy
          image: envoyproxy/envoy-contrib:v1.25-latest
          args:
            - "-c"
            - "/etc/envoy/envoy.yaml"
          ports:
            - containerPort: 9094
            - containerPort: 9901
          volumeMounts:
            - name: envoy-config
              mountPath: /etc/envoy
              readOnly: true
      volumes:
        - name: envoy-config
          configMap:
            name: envoy-config

 

Apply the Envoy deployment:

kubectl apply -f envoy-deployment.yaml
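
Before exposing it, you can confirm the proxy started cleanly and that the admin endpoint responds (an optional check; the port-forward is just a temporary local tunnel):

kubectl -n kafka rollout status deployment/envoy
kubectl -n kafka port-forward deploy/envoy 9901:9901 &   # temporary tunnel to the admin port
sleep 2
curl -s http://127.0.0.1:9901/ready                      # should return LIVE
kill %1                                                  # stop the port-forward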

 

Exposing Envoy Externally

Create a file named envoy-service.yaml:

apiVersion: v1
kind: Service
metadata:
  name: envoy
  namespace: kafka
spec:
  type: LoadBalancer
  selector:
    app: envoy
  ports:
    - name: kafka
      port: 9094
      targetPort: 9094
    - name: admin
      port: 9901
      targetPort: 9901

 

Apply the service:

kubectl apply -f envoy-service.yaml

 

Maintenance and Verification

If you need to update the Envoy configuration later:

kubectl -n kafka apply -f envoy-config.yaml
kubectl -n kafka rollout restart deployment/envoy

 

To verify your deployment:

  1. Check that all pods are running:
    kubectl get pods -n kafka

     

  2. Get the external IP assigned to your Envoy service:
    kubectl get service envoy -n kafka

     

  3. Test connectivity using a Kafka client with the external address and the retrieved user credentials (see the example below).
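
For example, with the stock Apache Kafka CLI tools (from the Kafka bin directory) and the kafka-prod-user credentials retrieved earlier. This is only a sketch: the password is a placeholder, and the client uses plaintext SASL because TLS is out of scope for this post.

# client.properties - SCRAM-SHA-512 over the plain external listener
cat > client.properties <<'EOF'
security.protocol=SASL_PLAINTEXT
sasl.mechanism=SCRAM-SHA-512
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
  username="kafka-prod-user" \
  password="<password-from-secret>";
EOF

# produce and consume a test message through Envoy
kafka-console-producer.sh --bootstrap-server kafka-prod-eastus01.multicastbits.com:9094 --topic prod_Topic01 --producer.config client.properties
kafka-console-consumer.sh --bootstrap-server kafka-prod-eastus01.multicastbits.com:9094 --topic prod_Topic01 --from-beginning --consumer.config client.properties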

 

Checking health via the Envoy admin interface:

http://kafka-prod-eastus01.multicastbits.com:9901/clusters

http://kafka-prod-eastus01.multicastbits.com:9901/ready

http://kafka-prod-eastus01.multicastbits.com:9901/stats?filter=kafka

 

Use Mailx to send emails using Office 365

Just something that came up while setting up a monitoring script using mailx; figured I'd note it down here so I can get to it easily later when I need it 😀

Important prerequisites

  • You need to enable SMTP basic auth on Office 365 for the account used for authentication
  • Create an App password for the user account
  • nssdb folder must be available and readable by the user running the mailx command

Assuming all of the above prerequisites are $true, we can proceed with the setup.

Install mailx

RHEL/AlmaLinux

sudo dnf install mailx

NSSDB Folder

Make sure the nssdb folder is available and readable by the user running the mailx command:

certutil -L -d /etc/pki/nssdb

The output might be empty, but that's OK; this store is there if you need to add a locally signed cert or another CA cert manually. Microsoft certs are trusted by default if you are on an up-to-date operating system with the system-wide trust store.

Reference – RHEL-sec-shared-system-certificates

Configure Mailx config file

sudo nano /etc/mail.rc

Append the following lines, and comment out or remove any of the same settings already defined in the existing config file:

set smtp=smtp.office365.com
set smtp-auth-user=###[email protected]###
set smtp-auth-password=##Office365-App-password#
set nss-config-dir=/etc/pki/nssdb/
set ssl-verify=ignore
set smtp-use-starttls
set from="###[email protected]###"

This is the bare minimum needed; other switches are documented here – link

Testing

echo "Your message is sent!" | mailx -v -s "test" [email protected]

The -v switch will print the verbose debug log to the console:

Connecting to 52.96.40.242:smtp . . . connected.
220 xxde10CA0031.outlook.office365.com Microsoft ESMTP MAIL Service ready at Sun, 6 Aug 2023 22:14:56 +0000
>>> EHLO vls-xxx.multicastbits.local
250-MN2PR10CA0031.outlook.office365.com Hello [167.206.57.122]
250-SIZE 157286400
250-PIPELINING
250-DSN
250-ENHANCEDSTATUSCODES
250-STARTTLS
250-8BITMIME
250-BINARYMIME
250-CHUNKING
250 SMTPUTF8
>>> STARTTLS
220 2.0.0 SMTP server ready
>>> EHLO vls-xxx.multicastbits.local
250-xxde10CA0031.outlook.office365.com Hello [167.206.57.122]
250-SIZE 157286400
250-PIPELINING
250-DSN
250-ENHANCEDSTATUSCODES
250-AUTH LOGIN XOAUTH2
250-8BITMIME
250-BINARYMIME
250-CHUNKING
250 SMTPUTF8
>>> AUTH LOGIN
334 VXNlcm5hbWU6
>>> Zxxxxxxxxxxxc0BmdC1zeXMuY29t
334 UGsxxxxxmQ6
>>> c2Rxxxxxxxxxxducw==
235 2.7.0 Authentication successful
>>> MAIL FROM:<###[email protected]###>
250 2.1.0 Sender OK
>>> RCPT TO:<[email protected]>
250 2.1.5 Recipient OK
>>> DATA
354 Start mail input; end with <CRLF>.<CRLF>
>>> .
250 2.0.0 OK <[email protected]> [Hostname=Bsxsss744.namprd11.prod.outlook.com]
>>> QUIT
221 2.0.0 Service closing transmission channel 

Now you can use this in your automation scripts or timers using the mailx command

#!/bin/bash

log_file="/etc/app/runtime.log"
recipient="[email protected]"
subject="Log file from /etc/app/runtime.log"

# Check if the log file exists
if [ ! -f "$log_file" ]; then
  echo "Error: Log file not found: $log_file"
  exit 1
fi

# Use mailx to send the log file as an attachment
echo "Sending log file..."
mailx -s "$subject" -a "$log_file" -r "[email protected]" "$recipient" < /dev/null
echo "Log file sent successfully."

Secure it

sudo chown root:root /etc/mail.rc
sudo chmod 600 /etc/mail.rc

The above commands change the file’s owner and group to root, then set the file permissions to 600, which means only the owner (root) has read and write permissions and other users have no access to the file.

Use environment variables: avoid storing sensitive information like passwords directly in the mail.rc file; consider using environment variables for sensitive data and referencing those variables in the configuration.

For example, in the mail.rc file, you can set:

set smtp-auth-password=$MY_EMAIL_PASSWORD

You can set the variable from another config file, pull it from Ansible Vault at runtime, or use something like HashiCorp Vault.
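
One simple way to provide the variable at runtime, following the suggestion above, is to keep it in a root-only env file and export it before calling mailx (a sketch; the file path and variable name are just examples):

# /etc/app/mail.env (chmod 600, owned by root) contains:
#   MY_EMAIL_PASSWORD='##Office365-App-password#'
set -a                      # export everything sourced below
source /etc/app/mail.env
set +a
echo "test" | mailx -s "test" [email protected]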

Sure, I would just use Python or PowerShell Core, but you will run into locked-down environments like OCI-managed DB servers where mailx is the only tool preinstalled and the only one you can use 🙁

The fact that you are here means you are probably already in the same boat. Hope this helped… until next time.

Solution – RKE Cluster MetalLB provides Services with IP Addresses but doesn’t ARP for the address

I ran into the same issue detailed here while working with an RKE cluster:

https://github.com/metallb/metallb/issues/1154

After looking around for a few hours and digging into the logs, I figured out the issue; hopefully this helps someone else out there in the same situation save some time.

Make sure IPVS mode is enabled in the cluster configuration

If you are using:

RKE2 – edit the cluster.yaml file

RKE1 – edit the cluster configuration from the Rancher UI > Cluster Management > select the cluster > Edit Configuration > Edit as YAML

Locate the services field under rancher_kubernetes_engine_config and add the following options to enable IPVS:

    kubeproxy:
      extra_args:
        ipvs-scheduler: lc
        proxy-mode: ipvs

https://www.suse.com/support/kb/doc/?id=000020035

(Screenshots: the kubeproxy section of the cluster YAML, default and after the changes.)

Make sure the kernel modules are enabled on the nodes running the control plane

Background

Example Rancher – RKE1 cluster

sudo docker ps | grep proxy # find the container ID for kube-proxy

sudo docker logs ####containerID###

0313 21:44:08.315888  108645 feature_gate.go:245] feature gates: &{map[]}
I0313 21:44:08.346872  108645 proxier.go:652] "Failed to load kernel module with modprobe, you can ignore this message when kube-proxy is running inside container without mounting /lib/modules" moduleName="nf_conntrack_ipv4"
E0313 21:44:08.347024  108645 server_others.go:107] "Can't use the IPVS proxier" err="IPVS proxier will not be used because the following required kernel modules are not loaded: [ip_vs_lc]"

kube-proxy is trying to load the needed kernel modules and failing to enable IPVS.

Let's enable the kernel modules:

sudo nano /etc/modules-load.d/ipvs.conf

ip_vs_lc
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack_ipv4
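
The file above only takes effect at boot. To load the modules right away without a reboot, something like this works (a sketch; on newer kernels nf_conntrack_ipv4 has been renamed to nf_conntrack, so that one may fail harmlessly):

for mod in ip_vs_lc ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack_ipv4; do
  sudo modprobe "$mod" || echo "could not load $mod"
done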

Install ipvsadm to confirm the changes

sudo dnf install ipvsadm -y

Reboot the VM or the Baremetal server

Use sudo ipvsadm to confirm IPVS is enabled:

sudo ipvsadm

Testing

kubectl get svc -n #namespace | grep load
arping -I ens192 192.168.94.140
ARPING 192.168.94.140 from 192.168.94.65 ens192
Unicast reply from 192.168.94.140 [00:50:56:96:E3:1D] 1.117ms
Unicast reply from 192.168.94.140 [00:50:56:96:E3:1D] 0.737ms
Unicast reply from 192.168.94.140 [00:50:56:96:E3:1D] 0.845ms
Unicast reply from 192.168.94.140 [00:50:56:96:E3:1D] 0.668ms
Sent 4 probes (1 broadcast(s))
Received 4 response(s)

If you have a Service of type LoadBalancer in front of a deployment, you should now be able to reach it, provided the container is responding on the service port.

Helpful links

https://metallb.universe.tf/configuration/troubleshooting/

https://github.com/metallb/metallb/issues/1154

https://github.com/rancher/rke2/issues/3710

How to extend the root (cs-root) filesystem using LVM on CentOS/RHEL/AlmaLinux

This guide will walk you through how to extend and increase space for the root filesystem on an AlmaLinux, CentOS, or RHEL server/desktop/VM.

Method A – Expanding the current disk

Edit the VM and add space to the disk.

Install the cloud-utils-growpart package, as the growpart command in it makes it really easy to extend partitioned virtual disks:

sudo dnf install cloud-utils-growpart

Verify that the VM’s operating system recognizes the new increased size of the sda virtual disk, using lsblk or fdisk -l

sudo fdisk -l
Notes -
Note down the disk ID and the partition number for the Linux LVM partition; in this demo the disk ID is sda and the LVM partition is partition 3 (sda3).

Let's trigger a rescan of the block devices (disks):

#elevate to root
sudo su 

#trigger a rescan, Make sure to match the disk ID you noted down before 
echo 1 > /sys/block/sda/device/rescan
exit

Now sudo fdisk -l shows the correct size of the disks

Use growpart to increase the partition size for the LVM partition:

sudo growpart /dev/sda 3

Confirm the volume group name

sudo vgs
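
If vgs does not show any new free space after growing the partition, the physical volume usually needs to be resized first so LVM can see the extra room (adjust the device to match the LVM partition you noted earlier):

sudo pvresize /dev/sda3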

Extend the logical volume

sudo lvextend -l +100%FREE /dev/almalinux/root

Grow the file system size

sudo xfs_growfs /dev/almalinux/root
Notes -
You can use these same steps to add space to other logical volumes, such as home or swap, if needed.

Method B – Adding a second disk to the LVM and expanding space

Why add a second disk?
Maybe the current disk is locked due to a snapshot and you can't remove it; the only solution would be to add a second disk.

Check the current space available

sudo df -h 
Notes -
If you have 0% (~1 MB) left on cs-root, command auto-complete with Tab and some of the later commands won't work. You should clear up at least 4-10 MB by clearing log files, temp files, etc.

Attach an additional disk to the VM (assuming this is a VM) and make sure the disk is visible at the OS level:

sudo lvmdiskscan

OR

sudo fdisk -l

Confirm the volume group name

sudo vgs

Let's increase the space.

First, let's initialize the new disk we mounted:

sudo mkfs.xfs /dev/sdb

Create the Physical volume

sudo pvcreate /dev/sdb

Extend the volume group:

sudo vgextend cs /dev/sdb
  Volume group "cs" successfully extended


Extend the logical volume

sudo lvextend -l +100%FREE /dev/cs/root

Grow the file system size

sudo xfs_growfs /dev/cs/root

Confirm the changes

sudo df -h

Just making it easy for us!!

#Method A - Expanding the current disk 
#AlmaLinux
sudo dnf install cloud-utils-growpart

sudo lvmdiskscan
sudo fdisk -l                          #note down the disk ID and partition num


sudo su                                #elevate to root
echo 1 > /sys/block/sda/device/rescan  #trigger a rescan
exit                                   #exit root shell

sudo growpart /dev/sda 3               #grow the LVM partition (from the walkthrough above)
sudo pvresize /dev/sda3                #let LVM see the new space if vgs shows none free
sudo lvextend -l +100%FREE /dev/almalinux/root
sudo xfs_growfs /dev/almalinux/root
sudo df -h

#Method B - Adding a second Disk 
#CentOS

sudo lvmdiskscan
sudo fdisk -l
sudo vgs
sudo mkfs.xfs /dev/sdb
sudo pvcreate /dev/sdb
sudo vgextend cs /dev/sdb
sudo lvextend -l +100%FREE /dev/cs/root
sudo xfs_growfs /dev/cs/root
sudo df -h

#AlmaLinux

sudo lvmdiskscan
sudo fdisk -l
sudo vgs
sudo mkfs.xfs /dev/sdb
sudo pvcreate /dev/sdb
sudo vgextend almalinux /dev/sdb
sudo lvextend -l +100%FREE /dev/almalinux/root
sudo xfs_growfs /dev/almalinux/root
sudo df -h