Basic quick and dirty Kafka test cluster

Here is how to get a Kafka test cluster going. Not for production, of course. This is a 3-node cluster running the ZooKeeper that ships with Kafka. The whole thing was done on CentOS 7.

Start off by setting up JAVA_HOME. Create /etc/profile.d/java.sh:

export JAVA_HOME=$(dirname $(dirname $(readlink $(readlink $(which javac)))))
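The nested readlink calls are there because on CentOS javac is normally reached through a chain of alternatives symlinks. A minimal sketch of the same idea using readlink -f, which follows the whole chain no matter how many links it has. The function name java_home_from is made up for this example, and the snippet assumes GNU readlink:

```shell
# Resolve a JDK home from the path of a javac binary (or a symlink to it).
# readlink -f follows the whole symlink chain; the two dirname calls strip
# the trailing /bin/javac.
java_home_from() {
    dirname "$(dirname "$(readlink -f "$1")")"
}

# Only export JAVA_HOME when javac is actually on the PATH.
if command -v javac >/dev/null 2>&1; then
    JAVA_HOME="$(java_home_from "$(command -v javac)")"
    export JAVA_HOME
fi
```

After logging in again, echo $JAVA_HOME should print the JDK directory.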

Create a Kafka service account and lock it:

[root@kafka1 ~]# groupadd -r kafka
[root@kafka1 ~]# useradd -r -g kafka -d /opt/kafka -s /bin/bash -c "Apache Kafka" kafka
[root@kafka1 ~]# passwd -l kafka

Grab the Kafka binaries and unpack them in /opt:

[root@kafka1 opt]# tar -zxvf kafka_2.11-2.0.0.tar.gz
[root@kafka1 opt]# ln -s kafka_2.11-2.0.0 kafka
[root@kafka1 opt]# chown -R kafka:kafka kafka_2.11-2.0.0

In /usr/lib/systemd/system create kafka.zookeeper.service:

[Unit]
Description=Apache Zookeeper server (Kafka)
Documentation=http://zookeeper.apache.org
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=simple
Restart=always
RestartSec=3
User=kafka
Group=kafka
WorkingDirectory=/opt/kafka
ExecStart=/opt/kafka/bin/zookeeper-server-start.sh /opt/kafka/config/zookeeper.properties
ExecStop=/opt/kafka/bin/zookeeper-server-stop.sh

[Install]
WantedBy=multi-user.target

followed by kafka.broker.service:

[Unit]
Description=Apache Kafka server (broker)
Documentation=http://kafka.apache.org/documentation.html
Requires=network.target remote-fs.target
After=network.target remote-fs.target kafka.zookeeper.service

[Service]
Type=simple
Restart=always
RestartSec=3
User=kafka
Group=kafka
WorkingDirectory=/opt/kafka
ExecStart=/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties
ExecStop=/opt/kafka/bin/kafka-server-stop.sh

[Install]
WantedBy=multi-user.target

Now create directories for zookeeper data and kafka logs:

[root@kafka1 opt]# mkdir -p kafkadata/zookeeper
[root@kafka1 opt]# mkdir -p kafkadata/logs
[root@kafka1 opt]# chown -R kafka:kafka ./kafkadata/

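Next, set the following in /opt/kafka/config/zookeeper.properties. The stock file already contains dataDir, clientPort and maxClientCnxns, so change those in place and add the rest: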

dataDir=/opt/kafkadata/zookeeper
# the port at which the clients will connect
clientPort=9999
# disable the per-ip limit on the number of connections since this is a non-production config
maxClientCnxns=0
server.1=kafka1.example.com:4888:5888
server.2=kafka2.example.com:4888:5888
server.3=kafka3.example.com:4888:5888
initLimit=5
syncLimit=2

You will notice that the default ports were changed. The same machines also run Apache Pulsar along with KOP (Kafka on Pulsar), so Kafka had to use different ports to avoid conflicts.
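The server.N lines follow a fixed pattern, so they can be generated instead of typed by hand. A rough sketch, assuming the same 4888/5888 ports as above. The function name zk_server_lines is made up for this example:

```shell
# Print one server.N line per host, numbering from 1.
# 4888 is the peer port and 5888 the leader election port used in this post.
zk_server_lines() {
    local n=1 host
    for host in "$@"; do
        printf 'server.%d=%s:4888:5888\n' "$n" "$host"
        n=$((n + 1))
    done
}

zk_server_lines kafka1.example.com kafka2.example.com kafka3.example.com
```

The output can be appended to zookeeper.properties on every node. The list must be identical on all three machines.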

Next, create the myid file for ZooKeeper on each node. Remember, each node must have a different id in its myid file. So, for kafka1.example.com:

[root@kafka1 opt]# echo 1 > /opt/kafkadata/zookeeper/myid
[root@kafka1 opt]# chown kafka:kafka /opt/kafkadata/zookeeper/myid

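The value inside the myid file has to be unique for each node, and it has to match the number in that node's server.N line in zookeeper.properties. So kafka2.example.com gets 2 and kafka3.example.com gets 3.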
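ZooKeeper matches the number in myid against the server.N lines in zookeeper.properties, so one way to avoid a mismatch is to look the id up from that file. A minimal sketch. The function name myid_for_host is made up for this example, and it assumes the hostname you pass in is spelled exactly as in the server.N line:

```shell
# Print the N of the server.N line in properties file $1 whose host is $2.
# Prints nothing when the host is not listed.
myid_for_host() {
    awk -F= -v host="$2" '
        $1 ~ /^server\.[0-9]+$/ {
            split($2, parts, ":")          # parts[1] is the hostname
            if (parts[1] == host) {
                sub(/^server\./, "", $1)   # keep only the numeric id
                print $1
            }
        }' "$1"
}

# Example, to be run on each node (assumes hostname -f returns the name
# used in zookeeper.properties):
#   myid_for_host /opt/kafka/config/zookeeper.properties "$(hostname -f)" \
#       > /opt/kafkadata/zookeeper/myid
```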

Finally, make some basic changes to server.properties. The following lines are valid for kafka1.example.com. The broker.id parameter is a similar story: it has to be unique among the nodes.

broker.id=1
log.dirs=/opt/kafkadata/logs
zookeeper.connect=kafka1.example.com:9999,kafka2.example.com:9999,kafka3.example.com:9999
listeners=PLAINTEXT://:9095
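The same comma-separated host:port list shows up again and again (zookeeper.connect here, and the --zookeeper, --broker-list and --bootstrap-server options later), and a typo in one hostname is easy to miss. A small sketch for building it from one host list. The function name host_port_list is made up for this example:

```shell
# Join hosts into a comma-separated host:port list.
# Usage: host_port_list PORT HOST...
host_port_list() {
    local port="$1" out="" host
    shift
    for host in "$@"; do
        out="${out:+$out,}$host:$port"   # add a comma only between entries
    done
    printf '%s\n' "$out"
}

host_port_list 9999 kafka1.example.com kafka2.example.com kafka3.example.com
```

Calling it with 9999 gives the ZooKeeper connection string, and calling it with 9095 gives the broker list.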

That should be it. Now start up Kafka and test it out. Start ZooKeeper on all three nodes first so that it can form a quorum, then start the brokers:

[root@kafka1 opt]# systemctl enable kafka.zookeeper && systemctl start kafka.zookeeper
[root@kafka1 opt]# systemctl enable kafka.broker && systemctl start kafka.broker

Create a somedudetest test topic as follows:

[root@kafka1 bin]# ./kafka-topics.sh --create --zookeeper kafka1.example.com:9999,kafka2.example.com:9999,kafka3.example.com:9999 --replication-factor 3 --partitions 3 --topic somedudetest
Created topic "somedudetest".

Verify the test topic exists:

[root@kafka1 bin]# ./kafka-topics.sh --list --zookeeper kafka1.example.com:9999,kafka2.example.com:9999,kafka3.example.com:9999
somedudetest

Now start producing messages:

[root@kafka1 bin]# ./kafka-console-producer.sh --broker-list kafka1.example.com:9095,kafka2.example.com:9095,kafka3.example.com:9095 --topic somedudetest
>Some Dude is here!
>

In a second window, run the consumer and you should see the message you produced in the previous step:

[root@kafka1 bin]# ./kafka-console-consumer.sh --from-beginning --bootstrap-server kafka1.example.com:9095,kafka2.example.com:9095,kafka3.example.com:9095 --topic somedudetest
Some Dude is here!

That’s it.