How to decide the ZooKeeper heap size?

Recently I ran into a situation where I had to decide on the heap size for ZooKeeper. At the same time, ZooKeeper was failing with Out Of Memory errors.

At startup, the ZooKeeper process reads the last snapshot file and builds a DataTree, which holds a hash table of znodes. The process deserializes the snapshot file (which contains all znodes) and adds each znode to the hash table. Ephemeral znodes created in previous sessions are also present in the snapshot, so they are added to the hash table as well. Once all data nodes have been added, the ephemeral nodes left over from previous sessions are deleted from the hash table. So if there are a lot of ephemeral znodes from previous sessions, ZooKeeper needs extra heap at startup to hold all of them, even though they are removed right away.

To sum this up, ZooKeeper holds all znodes in heap, along with everything it loads from the latest snapshot.

I did some research on an ideal value or formula for the ZooKeeper server heap size. There is no such ideal value, but we can estimate the heap size from the factors below:

1. The number of znodes in ZooKeeper

2. The number of components using ZooKeeper (for example HBase, Kafka, Solr)

3. The size of all the znodes in ZooKeeper

4. The size of the latest snapshot in ZooKeeper

The command below reports the number of znodes (zk_znode_count) and their approximate total data size (zk_approximate_data_size) in bytes.

# echo mntr | nc localhost 2181
zk_version 3.4.6-76--1, built on 11/07/2017 01:38 GMT
zk_avg_latency 0
zk_max_latency 1422
zk_min_latency 0
zk_packets_received 31318142
zk_packets_sent 31318171
zk_num_alive_connections 3
zk_outstanding_requests 0
zk_server_state follower
zk_znode_count 550                        <----- Total number of znodes
zk_watch_count 27
zk_ephemerals_count 13
zk_approximate_data_size 1073741824       <---- Size in Bytes
zk_open_file_descriptor_count 62
zk_max_file_descriptor_count 4096
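
If you want to pull just these two values out of the mntr output instead of reading them by eye, a minimal sketch could look like the one below. The helper names mntr_value and bytes_to_mb are my own (hypothetical, not part of ZooKeeper), and the host and port in the usage comment are assumptions taken from the command above.

```shell
# Sketch only: mntr_value and bytes_to_mb are hypothetical helpers.

# Print the value of one mntr key; the mntr output is read from stdin.
mntr_value() {
  awk -v key="$1" '$1 == key { print $2; exit }'
}

# Convert a byte count to whole megabytes (MiB), rounding up.
bytes_to_mb() {
  echo $(( ($1 + 1048575) / 1048576 ))
}

# Usage against a live server (localhost:2181 is an assumption):
#   stats=$(echo mntr | nc localhost 2181)
#   count=$(printf '%s\n' "$stats" | mntr_value zk_znode_count)
#   bytes=$(printf '%s\n' "$stats" | mntr_value zk_approximate_data_size)
#   echo "znodes=$count data=$(bytes_to_mb "$bytes") MB"
```

With the sample output above, this would report 550 znodes and 1024 MB of data.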


We also need the size of the last snapshot, since its contents are loaded into heap as well. Run the commands below to find the ZooKeeper data dir and check the size of the files in it:

# grep dataDir /etc/zookeeper/conf/zoo.cfg
# du -sch /Path/to/dataDir/version-X/*
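
The two commands above can also be combined into a small script that picks out only the newest snapshot file. This is a rough sketch: the function names are hypothetical, and it assumes that zoo.cfg has a plain dataDir=... line and that snapshot files are named snapshot.* inside the version directory.

```shell
# Sketch only: all function names here are hypothetical helpers.

# Read the dataDir value from a zoo.cfg-style file.
data_dir_from_cfg() {
  sed -n 's/^[[:space:]]*dataDir[[:space:]]*=[[:space:]]*//p' "$1" | head -n 1
}

# Print the most recently modified snapshot.* file in a version directory.
latest_snapshot() {
  ls -t "$1"/snapshot.* 2>/dev/null | head -n 1
}

# Print the size of a file in bytes.
file_size_bytes() {
  wc -c < "$1" | tr -d ' '
}

# Usage (the version directory is left as a placeholder, as above):
#   data_dir=$(data_dir_from_cfg /etc/zookeeper/conf/zoo.cfg)
#   snap=$(latest_snapshot "$data_dir/version-X")
#   echo "$snap: $(file_size_bytes "$snap") bytes"
```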

In my case, the size of the znodes and the latest snapshot was approximately 1GB. With numbers like these, bumping the heap size to at least 4GB is a good option.
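
To make that arithmetic repeatable, here is a minimal sketch. The 4x multiplier is only a rule of thumb read off my own numbers (about 1GB of data, 4GB of heap), not an official ZooKeeper formula, and suggest_heap_mb is a hypothetical helper name.

```shell
# Sketch only: suggest_heap_mb is a hypothetical helper, and the default
# factor of 4 is an assumption taken from the 1GB -> 4GB example above.

# suggest_heap_mb FOOTPRINT_BYTES [FACTOR]
# Print a suggested heap size in MB: footprint times factor, rounded up.
suggest_heap_mb() {
  footprint_bytes=$1
  factor=${2:-4}
  echo $(( (footprint_bytes * factor + 1048575) / 1048576 ))
}

# Usage:
#   suggest_heap_mb 1073741824
#
# On a stock Apache ZooKeeper install, one common way to apply the result
# is a conf/java.env file, which zkEnv.sh sources if it exists, e.g.:
#   export JVMFLAGS="-Xmx4096m"
# Managed distributions usually expose their own heap setting instead.
```

For a 1GB footprint (1073741824 bytes) this prints 4096, which matches the 4GB figure above.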

Be careful when deciding the ZooKeeper heap size. The ZooKeeper server should never be allowed to swap, so the heap has to fit comfortably in physical memory. For that reason it is generally recommended that ZooKeeper servers be given a reasonably large amount of memory.
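
A quick sanity check before raising the heap is to compare it against the RAM on the host. The sketch below is Linux-only because it reads /proc/meminfo. The function names are hypothetical, and the 1024 MB reserve for the OS and other processes is an assumption you should adjust to your environment.

```shell
# Sketch only: mem_total_mb and heap_fits are hypothetical helpers.

# mem_total_mb [MEMINFO_FILE]
# Print total RAM in MB from a /proc/meminfo-style file (Linux).
mem_total_mb() {
  awk '$1 == "MemTotal:" { print int($2 / 1024); exit }' "${1:-/proc/meminfo}"
}

# heap_fits HEAP_MB TOTAL_MB [RESERVE_MB]
# Succeed if the heap plus a reserve (default 1024 MB, an assumption)
# fits in total RAM.
heap_fits() {
  [ $(( $1 + ${3:-1024} )) -le "$2" ]
}

# Usage:
#   if heap_fits 4096 "$(mem_total_mb)"; then
#     echo "4GB heap fits in RAM"
#   else
#     echo "4GB heap risks swapping on this host"
#   fi
```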


