Exposing Kafka on two different networks

A Big Data setup usually requires multiple network interfaces. Let’s see how to set up Kafka on more than one of them. Kafka is an open-source stream processing platform that works as a distributed publish/subscribe messaging system. It is designed for high throughput with built-in partitioning, replication, and fault tolerance.

The configuration in this article was implemented on CDH 5.7.1 with Kafka 2.0.1.5 installed using parcels.

One of the clusters we are working on has the following network configuration:

  • A “data” network exposing our edge, Kafka and master nodes to the outside world
  • An “internal” network dedicated to the cluster for our worker nodes

We use Kafka for data ingestion and to send processed data to another system that exposes UIs to the analysts, so we have:

  • A Spark Streaming job consuming Kafka topics from YARN (our “internal” network)
  • The other system’s app consuming Kafka topics from the outside (our “data” network)

Thus, Kafka must be available on two different networks. To do so, apply the following configuration on each Kafka broker in the kafka.properties safety valve input. The Kafka nodes must also share the same hostname on both networks, so that clients on either network resolve the advertised hostname to an address they can reach:

listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://<hostname>:9092

That’s it!
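To confirm that the setup works from both sides, it can help to run a small check from a host on each network. The following Python sketch is only an illustration under stated assumptions: the function names `resolved_addresses` and `broker_reachable` are hypothetical helpers, not part of Kafka or Cloudera Manager, and the check only proves that the TCP port accepts connections, not that a Kafka client can actually produce or consume.

```python
import socket


def resolved_addresses(host: str) -> set:
    """Return the set of IP addresses that `host` resolves to from this machine."""
    infos = socket.getaddrinfo(host, None)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def broker_reachable(host: str, port: int = 9092, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts alike.
        return False
```

Calling `resolved_addresses("<hostname>")` on a worker node and on a machine in the “data” network should return a different address on each side, and `broker_reachable("<hostname>", 9092)` should return `True` on both if the shared hostname is set up as described.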

NB: With this configuration, Kafka listens on every interface instead of just the ones you need. Supposedly, Kafka accepts the following configuration to bind to specific IP addresses:

listeners=PLAINTEXT://<ip1>:9092,PLAINTEXT://<ip2>:9092
advertised.listeners=PLAINTEXT://<hostname>:9092

However, the broker throws this exception on startup:

java.lang.IllegalArgumentException: requirement failed: Each listener must have a different port
  at scala.Predef$.require(Predef.scala:219)
  at kafka.server.KafkaConfig.validateUniquePortAndProtocol(KafkaConfig.scala:905)
  at kafka.server.KafkaConfig.getListeners(KafkaConfig.scala:913)
  at kafka.server.KafkaConfig.<init>(KafkaConfig.scala:866)
  at kafka.server.KafkaConfig$.fromProps(KafkaConfig.scala:698)
  at kafka.server.KafkaConfig$.fromProps(KafkaConfig.scala:695)
  at kafka.server.KafkaServerStartable$.fromProps(KafkaServerStartable.scala:28)
  at kafka.Kafka$.main(Kafka.scala:58)
  at com.cloudera.kafka.wrap.Kafka$.main(Kafka.scala:76)
  at com.cloudera.kafka.wrap.Kafka.main(Kafka.scala)

and a variation, “Each listener must have a different protocol”, when the ports are changed.
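The two error messages suggest that, in this Kafka version, every entry in `listeners` needs both a unique port and a unique protocol. The Python sketch below mimics that rule so a candidate value can be checked before restarting a broker. It is not Kafka's validation code: `parse_listeners` and `check_listeners` are hypothetical helpers, the IP addresses in the usage note are made-up examples, and unlike the broker, which stops at the first failed requirement, this sketch reports every violated rule.

```python
def parse_listeners(value: str) -> list:
    """Split a `listeners` value into (protocol, host, port) tuples."""
    listeners = []
    for entry in value.split(","):
        protocol, sep, address = entry.strip().partition("://")
        host, colon, port = address.rpartition(":")
        if not sep or not colon or not port.isdigit():
            raise ValueError(f"malformed listener: {entry!r}")
        listeners.append((protocol, host, int(port)))
    return listeners


def check_listeners(value: str) -> list:
    """Return the uniqueness rules violated by a `listeners` value (empty if none)."""
    listeners = parse_listeners(value)
    ports = [port for _, _, port in listeners]
    protocols = [protocol for protocol, _, _ in listeners]
    problems = []
    if len(set(ports)) != len(ports):
        problems.append("Each listener must have a different port")
    if len(set(protocols)) != len(protocols):
        problems.append("Each listener must have a different protocol")
    return problems
```

For example, `check_listeners("PLAINTEXT://10.0.0.1:9092,PLAINTEXT://10.1.0.1:9093")` returns only the protocol message, which matches the behaviour described above when the ports are changed, while the single-listener value `PLAINTEXT://0.0.0.0:9092` passes both rules.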
