Apache Pulsar is a distributed messaging system. I was doing a proof of concept, and here are instructions on how to get a basic Pulsar cluster going. This was done on CentOS 7.

The official documentation is pretty decent, and the instructions below are distilled from it.

The setup is a three-node ZooKeeper cluster and a three-node broker cluster. The nodes run CentOS 7, and the Pulsar version is 2.5.1. First, there should be some DNS records:

zoo1.example.net      IN A
zoo2.example.net      IN A
zoo3.example.net      IN A
broker1.example.net   IN A
broker2.example.net   IN A
broker3.example.net   IN A
pulsar-cl.example.net IN A
pulsar-cl.example.net IN A
pulsar-cl.example.net IN A
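
Before going further, it may be worth confirming that every name in the list resolves. A minimal sketch, assuming getent is available (it is on a stock CentOS 7); the function names cluster_hosts and unresolved_hosts are my own and not part of Pulsar:

```shell
# Print every hostname from the record list above, one per line.
cluster_hosts() {
  for i in 1 2 3; do
    echo "zoo$i.example.net"
    echo "broker$i.example.net"
  done
  echo "pulsar-cl.example.net"
}

# Read hostnames on stdin and print the ones that do not resolve.
unresolved_hosts() {
  while read -r host; do
    getent hosts "$host" >/dev/null 2>&1 || echo "$host"
  done
}
```

Running `cluster_hosts | unresolved_hosts` should print nothing once all records are in place.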

Perform the following steps on all six systems.

Install Java:

[root@zoo1 ~]# yum install java-devel

Next, set $JAVA_HOME for the whole system by creating /etc/profile.d/java.sh with the following content:

export JAVA_HOME=$(dirname $(dirname $(readlink $(readlink $(which javac)))))

Create a service account and lock it:

[root@zoo1 ~]# groupadd -r pulsar
[root@zoo1 ~]# useradd -r -g pulsar -d /opt/pulsar -s /bin/bash -c "Apache Pulsar" pulsar
[root@zoo1 ~]# passwd -l pulsar

Decompress the Pulsar tarball and create a symlink:

[root@zoo1 ~]# cd /opt
[root@zoo1 opt]# tar zxvf apache-pulsar-2.5.1-bin.tar.gz
[root@zoo1 opt]# ln -s apache-pulsar-2.5.1/ pulsar

Perform the steps below on the three ZooKeeper machines. First, create the data directory:

[root@zoo1 opt]# mkdir -p /opt/pulsar/pulsardata/zookeeper
[root@zoo1 opt]# chown -R pulsar:pulsar /opt/pulsar/pulsardata/

In /opt/pulsar/conf/zookeeper.conf make the following changes:
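
A minimal sketch of what those changes might look like, printed by a small helper so it can be reviewed before editing the file. The dataDir value follows the myid path used in this post; the peer and election ports 2888 and 3888 are an assumption (they are the usual ZooKeeper choices), and the function name is hypothetical:

```shell
# Print the zookeeper.conf lines for a three-node ensemble.
# Assumes peer port 2888 and leader-election port 3888.
zookeeper_conf_fragment() {
  echo "dataDir=/opt/pulsar/pulsardata/zookeeper"
  for id in 1 2 3; do
    # The number after "server." must match that node's myid file.
    echo "server.$id=zoo$id.example.net:2888:3888"
  done
}
```

The stock zookeeper.conf already has a dataDir line, so edit that one in place rather than adding a second; the server.N lines can simply be appended.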


Each ZooKeeper server needs a unique ID. The IDs do not have to be sequential; for simplicity I used the index from the hostname, written to /opt/pulsar/pulsardata/zookeeper/myid:

[root@zoo1 opt]# echo 1 > /opt/pulsar/pulsardata/zookeeper/myid
[root@zoo1 opt]# chown pulsar:pulsar /opt/pulsar/pulsardata/zookeeper/myid
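
The hostname-index convention can also be scripted instead of typed per node. A rough sketch, assuming hostnames end in a number as they do here; the function name zk_id_from_hostname is hypothetical:

```shell
# Print the trailing number of a hostname's short name (zoo2.example.net -> 2).
# Returns non-zero if the short name does not end in digits after a non-digit.
zk_id_from_hostname() {
  short=${1%%.*}   # drop the domain part
  id=$(printf '%s\n' "$short" | sed -n 's/^.*[^0-9]\([0-9][0-9]*\)$/\1/p')
  [ -n "$id" ] || return 1
  printf '%s\n' "$id"
}
```

On each node this could be used as `zk_id_from_hostname "$(hostname)" > /opt/pulsar/pulsardata/zookeeper/myid`.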

Similarly, on zoo2 I would echo 2 into the myid file, and so on. Next, start the ZooKeeper service. Note that no systemd units are included in the tarball, so you have to write those yourself.

[root@zoo1 opt]# systemctl enable pulsar.zookeeper
[root@zoo1 opt]# systemctl start pulsar.zookeeper
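
The commands above assume a unit called pulsar.zookeeper.service already exists. Here is a minimal sketch of one way to generate such units, not a hardened service definition. It assumes the pulsar user can write to the log and data directories under /opt/pulsar, and that java is on the default systemd PATH; the helper name pulsar_unit is my own:

```shell
# Print a systemd unit for one Pulsar component.
#   $1: subcommand of bin/pulsar (zookeeper, bookie or broker)
#   $2: description shown by systemctl status
pulsar_unit() {
  cat <<EOF
[Unit]
Description=$2
After=network.target

[Service]
User=pulsar
Group=pulsar
ExecStart=/opt/pulsar/bin/pulsar $1
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
}
```

For example, `pulsar_unit zookeeper "Apache Pulsar ZooKeeper" > /etc/systemd/system/pulsar.zookeeper.service` followed by `systemctl daemon-reload`. The same helper covers the other two services used later in this post: `pulsar_unit bookie ...` for pulsar.bookkeeper and `pulsar_unit broker ...` for pulsar.broker.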

Finally, initialize the cluster metadata in ZooKeeper. You only need to do this once, on one machine in the cluster:

[root@zoo1 opt]# /opt/pulsar/bin/pulsar initialize-cluster-metadata \
    --cluster pulsar-cl \
    --zookeeper zoo1.example.net:2181 \
    --configuration-store zoo1.example.net:2181 \
    --web-service-url http://pulsar-cl.example.net:8080 \
    --web-service-url-tls https://pulsar-cl.example.net:8443 \
    --broker-service-url pulsar://pulsar-cl.example.net:6650 \
    --broker-service-url-tls pulsar+ssl://pulsar-cl.example.net:6651

This concludes the basic ZooKeeper setup. Now, on to the remaining three broker nodes.

Make a data directory for BookKeeper:

[root@broker1 opt]# mkdir -p /opt/pulsar/pulsardata/bookkeeper
[root@broker1 opt]# chown -R pulsar:pulsar /opt/pulsar/pulsardata/

In /opt/pulsar/conf/bookkeeper.conf, specify the ZooKeeper servers, optionally enable stateful functions, and set custom directories:
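
A sketch of the relevant bookkeeper.conf lines, again printed by a helper for review. The journal and ledgers subdirectories are an assumption: I am placing them under /opt/pulsar/pulsardata/bookkeeper to mirror the ZooKeeper layout. The extraServerComponents line is the optional stateful-functions part, and the function name is hypothetical:

```shell
# Print the bookkeeper.conf settings for this cluster.
bookkeeper_conf_fragment() {
  data=/opt/pulsar/pulsardata/bookkeeper   # assumed data root
  cat <<EOF
zkServers=zoo1.example.net:2181,zoo2.example.net:2181,zoo3.example.net:2181
journalDirectory=$data/journal
ledgerDirectories=$data/ledgers
extraServerComponents=org.apache.bookkeeper.stream.server.StreamStorageLifecycleComponent
EOF
}
```

These keys already exist in the stock file, so change the existing lines instead of appending duplicates.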


Now you can start the bookies. Again, systemd units are not included in the Pulsar tarball:

[root@broker1 opt]# systemctl enable pulsar.bookkeeper
[root@broker1 opt]# systemctl start pulsar.bookkeeper

Perform a sanity check on the broker nodes:

[root@broker1 opt]# /opt/pulsar/bin/bookkeeper shell bookiesanity

Finally, configure brokers. Set the following parameters in /opt/pulsar/conf/broker.conf:
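
A sketch of the broker.conf settings that matter here. The cluster name has to match the --cluster value given at metadata initialization; listing all three ZooKeeper nodes for both server settings is my assumption, and functionsWorkerEnabled is only needed if you want Pulsar functions to run inside the brokers. The function name is hypothetical:

```shell
# Print the broker.conf settings for this cluster.
broker_conf_fragment() {
  zk=zoo1.example.net:2181,zoo2.example.net:2181,zoo3.example.net:2181
  cat <<EOF
zookeeperServers=$zk
configurationStoreServers=$zk
clusterName=pulsar-cl
functionsWorkerEnabled=true
EOF
}
```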


Next, verify that the ports match the ones used during metadata initialization:
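
One quick way to do that check is to pull the four port settings out of the file and compare them by eye with the URLs passed to initialize-cluster-metadata (6650, 6651, 8080 and 8443 in this post). A small sketch; the function name broker_ports is my own:

```shell
# Print the active (uncommented) port settings from a broker.conf file.
#   $1: path to broker.conf
broker_ports() {
  grep -E '^(brokerServicePort|brokerServicePortTls|webServicePort|webServicePortTls)=' "$1"
}
```

Usage: `broker_ports /opt/pulsar/conf/broker.conf`. With no TLS set up, the two Tls settings may be empty or absent, which is fine for this unencrypted setup.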


Enable Pulsar functions in /opt/pulsar/conf/functions_worker.yml:

pulsarFunctionsCluster: pulsar-cl

Finally, start brokers:

[root@broker1 opt]# systemctl enable pulsar.broker
[root@broker1 opt]# systemctl start pulsar.broker

One more thing: configure the client utilities by setting the following parameters in client.conf:
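
A sketch of the two client.conf settings, assuming clients reach the brokers through the shared pulsar-cl name on the non-TLS ports used at metadata initialization. The function name is hypothetical:

```shell
# Print the client.conf settings pointing at the cluster-wide DNS name.
client_conf_fragment() {
  cat <<'EOF'
webServiceUrl=http://pulsar-cl.example.net:8080
brokerServiceUrl=pulsar://pulsar-cl.example.net:6650
EOF
}
```

Once the brokers are up, something like `/opt/pulsar/bin/pulsar-client produce test-topic --messages "hello"` makes a quick smoke test; test-topic is just a placeholder name.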


This should result in a working Pulsar cluster. There is no security or encryption set up. Unfortunately, the official docs are not yet complete when it comes to securing the individual components with SSL certificates.