Advanced multi-tenant Hadoop and Zookeeper protection

By Pierre SAUVAGE

Jul 5, 2017

Zookeeper is a critical component of Hadoop's high availability operation. It protects itself by capping the number of simultaneous connections (maxClientCnxns, e.g. 400). However, Zookeeper does not protect itself intelligently: it simply refuses connections once the threshold is reached. When that happens, the core components (HBase RegionServers, HDFS ZKFC) can no longer initialize a connection and the service becomes degraded or unavailable!

It’s very easy to mount a DoS attack on Zookeeper. On the other hand, because most Zookeeper installations sit inside trusted or semi-trusted zones, these attacks are usually involuntary. It only takes a careless developer launching custom code that opens Zookeeper sessions in a loop without ever closing them. In that case, Zookeeper is compromised, and every component that depends on it.
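The failure mode is easy to reproduce locally without a real Zookeeper. The sketch below is a toy simulation (the local server, port, and the small MAX_CONNS limit are all stand-ins for Zookeeper and its maxClientCnxns setting): a server accepts up to MAX_CONNS connections and hangs up on any further client, while a careless client leaks sessions in a loop until legitimate connections are refused.

```python
import socket
import threading
import time

# Hypothetical small limit standing in for ZooKeeper's maxClientCnxns.
MAX_CONNS = 5

server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(32)
port = server.getsockname()[1]
held = []  # sessions the server is currently keeping open

def accept_loop():
    while True:
        try:
            conn, _ = server.accept()
        except OSError:
            return  # server socket closed
        if len(held) >= MAX_CONNS:
            conn.close()       # threshold reached: refuse the session
        else:
            held.append(conn)  # session stays open

threading.Thread(target=accept_loop, daemon=True).start()

# A careless client: opens sessions in a loop without ever closing them.
leaked = [socket.create_connection(("127.0.0.1", port))
          for _ in range(MAX_CONNS)]
time.sleep(0.3)  # let the server register the leaked sessions

# The next connection is accepted at the TCP level but immediately closed
# by the server; recv() returning b"" signals the hang-up.
extra = socket.create_connection(("127.0.0.1", port))
extra.settimeout(2)
extra_refused = extra.recv(1) == b""
print("extra connection refused:", extra_refused)
```

At this point any new client, including a critical service, is in the position of the `extra` connection: refused until the leaked sessions expire or are closed.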

Solutions

Several workarounds can be set up independently or jointly.

Using Observers

Observers are special Zookeeper nodes:

  • They do not participate in the quorum
  • They synchronize with the participating nodes
  • They forward write requests to the participating nodes

They therefore make it possible to increase the number of nodes without slowing down the election process.

Using iptables

It is possible to protect against external DoS with iptables. Indeed, we can limit the number of connections to the Zookeeper port (2181) per IP address. This makes it possible to set a per-IP limit lower than Zookeeper's maxClientCnxns, and thus block an abusive address without blocking access from other machines.

Example

Imagine the following cluster topology:

  • 3 edge nodes: edge1.adaltas.com, edge2.adaltas.com, edge3.adaltas.com
  • 3 master nodes: master1.adaltas.com, master2.adaltas.com, master3.adaltas.com
  • n worker nodes: not involved in this scenario
  • These machines are located in the subnet: 10.10.10.0/24

The master nodes form the elective quorum, and the edge nodes serve as observers in order to absorb the client load.

NB: the even total number of nodes (6) is not a problem, since only the 3 participants take part in elections.

Zookeeper configuration

We set up these configurations (/etc/zookeeper/conf/zoo.cfg):

On the master nodes:

clientPort=2181
maxClientCnxns=200
peerType=participant
server.1=master1.adaltas.com:2888:3888
server.2=master2.adaltas.com:2888:3888
server.3=master3.adaltas.com:2888:3888
server.4=edge1.adaltas.com:2888:3888:observer
server.5=edge2.adaltas.com:2888:3888:observer
server.6=edge3.adaltas.com:2888:3888:observer

On the edge nodes:

clientPort=2181
maxClientCnxns=200
peerType=observer
server.1=master1.adaltas.com:2888:3888
server.2=master2.adaltas.com:2888:3888
server.3=master3.adaltas.com:2888:3888
server.4=edge1.adaltas.com:2888:3888:observer
server.5=edge2.adaltas.com:2888:3888:observer
server.6=edge3.adaltas.com:2888:3888:observer

On the master nodes, communication on port 2181 is restricted to the local subnet; connections from external machines are rejected. The ACCEPT rule must be paired with a rule rejecting everything else on that port:

iptables -A INPUT -m state --state NEW -m tcp -p tcp -s 10.10.10.0/24 --dport 2181 -j ACCEPT
iptables -A INPUT -p tcp --dport 2181 -j REJECT

Thus these zookeeper instances are only accessed by our internal services and processes.

The edge nodes limit communication on port 2181 to 100 simultaneous connections per source IP (--connlimit-mask 32 groups connections by individual /32 address) via the rule:

iptables -A INPUT -p tcp --syn --dport 2181 -m connlimit --connlimit-above 100 --connlimit-mask 32 -j REJECT --reject-with tcp-reset

Hadoop configuration

For Hadoop services (HDFS ZKFC, HBase Master, etc.), the following connection string is specified:

master1.adaltas.com:2181,master2.adaltas.com:2181,master3.adaltas.com:2181
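For instance, for HDFS automatic failover (ZKFC), this quorum address is carried by the standard ha.zookeeper.quorum property in core-site.xml. A sketch with this article's hosts:

```xml
<!-- core-site.xml: HDFS ZKFC talks to the participant quorum only -->
<property>
  <name>ha.zookeeper.quorum</name>
  <value>master1.adaltas.com:2181,master2.adaltas.com:2181,master3.adaltas.com:2181</value>
</property>
```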

For “client” configurations (YARN containers, HBase clients, third-party applications, etc.), the following string is specified:

edge1.adaltas.com:2181,edge2.adaltas.com:2181,edge3.adaltas.com:2181
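For an HBase client application, for example, this would translate into the standard hbase.zookeeper.quorum property in the client-side hbase-site.xml (a sketch, with this article's hosts):

```xml
<!-- hbase-site.xml on client machines: point at the observers -->
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>edge1.adaltas.com,edge2.adaltas.com,edge3.adaltas.com</value>
</property>
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2181</value>
</property>
```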

Thus, when an external job or application launches, it cannot saturate the quorum and does not compromise the state of the cluster.

Going further: siloing the observer nodes

If an external “fraudulent” application uses the string edge1.adaltas.com:2181,edge2.adaltas.com:2181,edge3.adaltas.com:2181, it may saturate all 3 observer nodes. In that case, although the cluster remains stable, some services will be unavailable, since clients will no longer be able to reach Zookeeper.

We can limit the impact of observer saturation by splitting the connection string into several shorter strings, distributed across the client configurations. Example:

  • String 1: edge1.adaltas.com, edge2.adaltas.com
  • String 2: edge1.adaltas.com, edge3.adaltas.com
  • String 3: edge2.adaltas.com, edge3.adaltas.com

Thus, if string 1 is saturated, edge3 remains available: applications targeting strings 2 and 3 will not be blocked.
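The decomposition above is simply the set of 2-element combinations of the observer hosts. A small helper (hypothetical, not part of the article's tooling) can generate these strings and hand them out round-robin to client applications:

```python
from itertools import combinations

def observer_strings(hosts, size=2, port=2181):
    """Build one connection string per `size`-element combination of hosts."""
    return [
        ",".join(f"{h}:{port}" for h in combo)
        for combo in combinations(hosts, size)
    ]

def assign(app_index, strings):
    """Spread client applications across the strings round-robin."""
    return strings[app_index % len(strings)]

edges = ["edge1.adaltas.com", "edge2.adaltas.com", "edge3.adaltas.com"]
strings = observer_strings(edges)
for i, s in enumerate(strings, 1):
    print(f"String {i}: {s}")
```

With 3 observers and strings of size 2, each edge node appears in exactly 2 of the 3 strings, so saturating any one string still leaves every host reachable through another string.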
