Zookeeper
Astra uses Zookeeper as a metadata store accessible by all components, with Apache Curator recipes wrapping all major node operations.
Recommended architecture
Five nodes in quorum
Observers serving all direct client traffic
Quorum members excluded from DNS, only serving forwarded observer requests
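A minimal sketch of this layout, written as a shell snippet that generates an example server list for zoo.cfg. The hostnames, server IDs, observer count, and output file name are hypothetical; ports 2888 and 3888 are the conventional quorum and leader-election ports. Observers are declared with the :observer suffix in every server's configuration, and each observer host would additionally set peerType=observer in its own file.

```shell
# Write an example ensemble definition (hypothetical hosts and file name).
cat > zoo.cfg.example <<'EOF'
# Five voting quorum members (not published in client-facing DNS)
server.1=zk-quorum-1.example.internal:2888:3888
server.2=zk-quorum-2.example.internal:2888:3888
server.3=zk-quorum-3.example.internal:2888:3888
server.4=zk-quorum-4.example.internal:2888:3888
server.5=zk-quorum-5.example.internal:2888:3888
# Non-voting observers that serve direct client traffic
server.6=zk-observer-1.example.internal:2888:3888:observer
server.7=zk-observer-2.example.internal:2888:3888:observer
EOF

# On observer hosts only, the local peer is also marked as an observer:
# echo "peerType=observer" >> zoo.cfg.example
```

Clients would then connect only to the observer hostnames, which forward writes to the quorum.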
Recommended configs
znode.container.maxNeverUsedIntervalMs
This is the amount of time, in milliseconds, that a container znode that has never had children can exist before it is eligible for deletion. Empty containers can be left behind when a node crashes while attempting to create a znode, so that only the parent remains (as with partitioned metadata stores).
https://zookeeper.apache.org/doc/r3.6.1/zookeeperAdmin.html#sc_performance_options
Note that this is a Java system property, so it must be passed to the Zookeeper server JVM as a -D flag.
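For example, assuming the stock zkServer.sh startup scripts are used, the flag could be added through SERVER_JVMFLAGS in conf/java.env. The 10000 ms value is purely illustrative and not a recommendation.

```shell
# conf/java.env is sourced by zkEnv.sh when the file exists.
# The interval below (10 seconds) is an illustrative assumption.
export SERVER_JVMFLAGS="${SERVER_JVMFLAGS:-} -Dznode.container.maxNeverUsedIntervalMs=10000"
```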
Troubleshooting
jute.maxbuffer
Zookeeper is designed to store small values, and not a large number of them per path. This is enforced with a size limit, and an error is returned when attempting to read a value larger than the configured amount. The error typically occurs when listing the children of a path that holds many entries, because the response containing all of the child names can exceed the configured jute.maxbuffer.
The default jute.maxbuffer value for Zookeeper is 1MB. Changes to this limit should be made on both the server and clients. For additional documentation, Solr provides an excellent writeup about this: https://solr.apache.org/guide/7_4/setting-up-an-external-zookeeper-ensemble.html.
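As a rough sketch of raising the limit consistently on both sides, the snippet below sets the same value for the server and for the zkCli.sh client through the standard startup script variables. The 4 MiB value and the JUTE_MAXBUFFER variable name are assumptions for illustration only.

```shell
# Illustrative limit of 4 MiB; pick a value that fits the largest expected response.
JUTE_MAXBUFFER=4194304

# Server side: picked up by zkServer.sh (for example from conf/java.env).
export SERVER_JVMFLAGS="${SERVER_JVMFLAGS:-} -Djute.maxbuffer=${JUTE_MAXBUFFER}"

# Client side: zkCli.sh reads CLIENT_JVMFLAGS. Other JVM clients need the same
# -Djute.maxbuffer flag on their own command line.
export CLIENT_JVMFLAGS="${CLIENT_JVMFLAGS:-} -Djute.maxbuffer=${JUTE_MAXBUFFER}"
```

Keeping the value in a single variable helps avoid a mismatch where the server accepts a response size that the client then refuses to read.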