Installing Apache Kafka on AlmaLinux 9

By Chandrashekhar Fakirpure

Updated on Jun 02, 2024

In this tutorial, we will install Apache Kafka on AlmaLinux 9.

Apache Kafka is a distributed streaming platform designed for high-throughput, low-latency data streaming. It lets you publish, subscribe to, store, and process streams of records in a distributed, fault-tolerant manner, making it a popular choice for organizations building real-time data pipelines and streaming applications over large-scale data feeds.

Prerequisites

Before starting, ensure you have the following:

  • An AlmaLinux 9 dedicated server with a non-root user with sudo privileges.
  • Java Development Kit (JDK) installed on your server (we install OpenJDK in Step 2).
  • At least 2GB of RAM.

Step 1: Update the System

Start by updating the package list and upgrading the system packages to the latest versions.

sudo dnf update -y

Step 2: Install Java

Kafka requires Java to run. Install the latest version of OpenJDK available in the AlmaLinux repositories.

sudo dnf install java-21-openjdk-devel -y

(Optional) Change the Current Java Version

If you already have a different version of Java installed, you can switch the system default with the following command:

sudo update-alternatives --config java

Enter the number of the Java version you want to use and press Enter.

Verify the installation:

java -version

You should see output similar to the following:

openjdk version "21.0.3" 2024-04-16 LTS
OpenJDK Runtime Environment (Red_Hat-21.0.3.0.9-1) (build 21.0.3+9-LTS)
OpenJDK 64-Bit Server VM (Red_Hat-21.0.3.0.9-1) (build 21.0.3+9-LTS, mixed mode, sharing)

Step 3: Create Kafka User

For security reasons, it’s a good practice to create a dedicated user for Kafka.

sudo useradd -m -s /bin/bash kafka
sudo passwd kafka

Switch to the Kafka user:

sudo su - kafka

Step 4: Download and Extract Kafka

Download the latest stable version of Kafka from the official Apache Kafka download page. At the time of writing, the latest stable release is 3.7.0; adjust the version in the commands below if a newer release is available.

wget https://downloads.apache.org/kafka/3.7.0/kafka_2.13-3.7.0.tgz
tar -xzf kafka_2.13-3.7.0.tgz
mv kafka_2.13-3.7.0 kafka
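
(Optional) You can verify the integrity of the download against the SHA512 checksum that Apache publishes alongside each release. Fetch the checksum file and compare its contents with the hash of your download manually (the Apache .sha512 files are not in the format sha512sum -c expects):

wget https://downloads.apache.org/kafka/3.7.0/kafka_2.13-3.7.0.tgz.sha512
sha512sum kafka_2.13-3.7.0.tgz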

Step 5: Configure Kafka

Kafka requires Zookeeper. A Zookeeper server comes bundled with Kafka and is sufficient for development and testing; in a production environment, you should set up a dedicated Zookeeper cluster.

Configure Zookeeper

Create a data directory for Zookeeper:

mkdir -p ~/kafka/data/zookeeper

Edit the Zookeeper configuration file:

vim ~/kafka/config/zookeeper.properties

Update the dataDir property to point to the new data directory:

dataDir=/home/kafka/kafka/data/zookeeper
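
The remaining settings in the bundled file can stay at their defaults. After the edit, the relevant lines should look roughly like this (clientPort and maxClientCnxns are the stock values shipped with Kafka):

dataDir=/home/kafka/kafka/data/zookeeper
clientPort=2181
maxClientCnxns=0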

Configure Kafka Broker

Create a data directory for Kafka:

mkdir -p ~/kafka/data/kafka

Edit the Kafka configuration file:

vim ~/kafka/config/server.properties

Update the following properties:

log.dirs=/home/kafka/kafka/data/kafka
zookeeper.connect=localhost:2181
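
For a single-node setup, the remaining defaults are fine. If the broker should later be reachable from other machines, you would also uncomment and adjust the listener settings. A sketch, using a placeholder hostname you would replace with your server's address:

listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://your.server.example.com:9092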

Step 6: Start Zookeeper and Kafka

Open two terminal sessions: one for Zookeeper and another for Kafka. Ensure you are logged in as the Kafka user in both.

Start Zookeeper

~/kafka/bin/zookeeper-server-start.sh ~/kafka/config/zookeeper.properties

Start Kafka

In the second terminal session, start Kafka:

~/kafka/bin/kafka-server-start.sh ~/kafka/config/server.properties
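
Leave both processes running in the foreground. In a third terminal, you can confirm that Zookeeper and Kafka are listening on their default ports (2181 and 9092):

ss -ltn | grep -E '2181|9092'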

Step 7: Testing the Installation

Create a Topic

In a new terminal session, still logged in as the Kafka user, create a test topic:

~/kafka/bin/kafka-topics.sh --create --topic test --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
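
On success, the command prints a short confirmation (the exact wording may vary between Kafka versions):

Created topic test.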

List Topics

Verify the topic was created:

~/kafka/bin/kafka-topics.sh --list --bootstrap-server localhost:9092
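
You can also inspect the partition and replication details of the topic:

~/kafka/bin/kafka-topics.sh --describe --topic test --bootstrap-server localhost:9092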

Produce Messages

Start a Kafka producer:

~/kafka/bin/kafka-console-producer.sh --topic test --bootstrap-server localhost:9092

Type a few messages and hit Enter after each:

Hello Kafka
This is a test message

Consume Messages

Open another terminal session and start a Kafka consumer:

~/kafka/bin/kafka-console-consumer.sh --topic test --from-beginning --bootstrap-server localhost:9092

You should see the messages you typed in the producer terminal.
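
When you are done testing, you can remove the test topic:

~/kafka/bin/kafka-topics.sh --delete --topic test --bootstrap-server localhost:9092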

(Optional) Set SELinux to Permissive Mode

If you have SELinux enabled, follow this step. AlmaLinux 9 does not ship an SELinux policy for the Kafka and Zookeeper services, so starting them in the default Enforcing mode can fail with Permission Denied errors. As a workaround, switch SELinux to Permissive mode:

sudo setenforce 0
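
Note that setenforce 0 only lasts until the next reboot. To make Permissive mode persistent, update /etc/selinux/config as well (review the file first if you have customized it):

sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config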

Step 8: Setting Up Kafka as a Systemd Service

To ensure Kafka and Zookeeper start on boot, you can set them up as systemd services.

Create a new systemd service file for Zookeeper:

sudo vim /etc/systemd/system/zookeeper.service

Add the following content:

[Unit]
Description=Apache Zookeeper server
Documentation=http://zookeeper.apache.org
After=network.target

[Service]
Type=simple
User=kafka
ExecStart=/home/kafka/kafka/bin/zookeeper-server-start.sh /home/kafka/kafka/config/zookeeper.properties
ExecStop=/home/kafka/kafka/bin/zookeeper-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target

Create Kafka Systemd Service

Create a new systemd service file for Kafka:

sudo vim /etc/systemd/system/kafka.service

Add the following content:

[Unit]
Description=Apache Kafka server
Documentation=http://kafka.apache.org/documentation.html
After=network.target zookeeper.service

[Service]
Type=simple
User=kafka
ExecStart=/home/kafka/kafka/bin/kafka-server-start.sh /home/kafka/kafka/config/server.properties
ExecStop=/home/kafka/kafka/bin/kafka-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target

Start and Enable the Services

Reload systemd to apply the new service files:

sudo systemctl daemon-reload

Start and enable Zookeeper:

sudo systemctl start zookeeper
sudo systemctl enable zookeeper

Start and enable Kafka:

sudo systemctl start kafka
sudo systemctl enable kafka
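
Verify that both services are running:

sudo systemctl status zookeeper kafka --no-pager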

Conclusion

You have now successfully installed Apache Kafka on AlmaLinux 9. You can create topics, produce and consume messages, and manage Kafka and Zookeeper as systemd services. This setup provides a robust foundation for building real-time data pipelines and streaming applications.

For further configuration and tuning, refer to the official Kafka documentation.