One of the main highlights of Apache Storm is that it is a fast, fault-tolerant distributed application with no single point of failure (SPOF). Apache Storm can be installed on as many systems as needed to increase the capacity of the application.

Let’s have a look at how the Apache Storm cluster is designed and its internal architecture. The following diagram depicts the cluster design.


Apache Storm has two types of nodes: Nimbus (master node) and Supervisor (worker node). Nimbus is the central component of Apache Storm. Its main job is to run the Storm topology. Nimbus analyzes the topology and gathers the tasks to be executed. It then distributes the tasks to an available supervisor.
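The distribution step can be pictured with a small toy model. The sketch below is not Storm's scheduler or API: the function name `assign_tasks`, the round-robin policy, and the task and supervisor names are all assumptions made purely for illustration.

```python
from itertools import cycle


def assign_tasks(tasks, supervisors):
    """Toy stand-in for Nimbus: hand each task to the next supervisor in turn.

    Round-robin is an assumption for illustration; Storm's real scheduler
    is more involved.
    """
    if not supervisors:
        raise ValueError("no supervisor available")
    assignment = {name: [] for name in supervisors}
    # cycle() repeats the supervisor list so tasks are spread evenly.
    for task, name in zip(tasks, cycle(supervisors)):
        assignment[name].append(task)
    return assignment


# Hypothetical topology with one spout task and three bolt tasks.
tasks = ["spout-1", "bolt-1", "bolt-2", "bolt-3"]
plan = assign_tasks(tasks, ["supervisor-a", "supervisor-b"])
print(plan)
```

In this model each supervisor ends up with two of the four tasks, which mirrors the idea that adding supervisors spreads the work over more machines.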

A supervisor has one or more worker processes and delegates tasks to them. A worker process spawns as many executors as needed and runs the tasks. Apache Storm uses an internal distributed messaging system for communication between Nimbus and the supervisors.
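The relationship between a worker process and its executors can be sketched with ordinary threads. This is only an analogy under stated assumptions: `run_worker` is a hypothetical helper, the tasks are plain Python callables rather than spouts or bolts, and a standard-library thread pool stands in for Storm's executors.

```python
from concurrent.futures import ThreadPoolExecutor


def run_worker(tasks, num_executors):
    """Toy worker process: run each task on a pool of executor threads."""
    with ThreadPoolExecutor(max_workers=num_executors) as pool:
        # map() returns results in the same order as the input tasks.
        return list(pool.map(lambda task: task(), tasks))


# Hypothetical tasks; in Storm a task would be a spout or bolt instance.
tasks = [lambda n=n: n * n for n in range(4)]
results = run_worker(tasks, num_executors=2)
print(results)
```

Here two executor threads share four tasks, so each thread runs more than one task, which is the same shape as an executor running one or more tasks inside a worker process.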

| Component | Description |
| --- | --- |
| Nimbus | Nimbus is the master node of a Storm cluster. All other nodes in the cluster are called worker nodes. The master node is responsible for distributing code among the worker nodes, assigning tasks to them, and monitoring for failures. |
| Supervisor | The nodes that follow the instructions given by Nimbus are called supervisors. A supervisor has multiple worker processes and governs them to complete the tasks assigned by Nimbus. |
| Worker process | A worker process executes tasks related to a specific topology. It does not run a task by itself; instead, it creates executors and asks them to perform a particular task. A worker process has multiple executors. |
| Executor | An executor is a single thread spawned by a worker process. An executor runs one or more tasks, but only for a specific spout or bolt. |
| Task | A task performs the actual data processing. It is an instance of either a spout or a bolt. |
| ZooKeeper framework | Apache ZooKeeper is a service used by a cluster (group of nodes) to coordinate among themselves and to maintain shared data with robust synchronization techniques. Nimbus is stateless, so it depends on ZooKeeper to monitor the status of the worker nodes. ZooKeeper helps the supervisors interact with Nimbus and is responsible for maintaining the state of Nimbus and the supervisors. |
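One common way to monitor node status through a shared store is to have each node record a periodic heartbeat there. The sketch below assumes that approach; `find_failed_nodes`, the timeout value, and the node names are hypothetical, and a plain dictionary stands in for data that would be kept in ZooKeeper.

```python
def find_failed_nodes(heartbeats, now, timeout):
    """Return the nodes whose last heartbeat is older than `timeout` seconds.

    `heartbeats` maps a node name to the time of its last heartbeat and
    stands in for data that would be kept in the coordination service.
    """
    return sorted(
        node for node, last_seen in heartbeats.items() if now - last_seen > timeout
    )


# Hypothetical heartbeat times, in seconds.
heartbeats = {"supervisor-a": 100.0, "supervisor-b": 65.0, "supervisor-c": 98.0}
failed = find_failed_nodes(heartbeats, now=101.0, timeout=30.0)
print(failed)
```

Because the check only reads from the shared store, the component doing the monitoring does not need to remember anything about the nodes itself.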

Storm is stateless in nature. Although statelessness has its own disadvantages, it helps Storm process real-time data as quickly as possible.

Storm is not entirely stateless, though. It stores its state in Apache ZooKeeper. Since the state is available in Apache ZooKeeper, a failed Nimbus can be restarted and resume from where it left off. Usually, service monitoring tools such as monit monitor Nimbus and restart it if it fails.
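The restart behaviour follows from keeping state outside the process. The following is a minimal sketch of that idea, not Storm code: `CoordinationStore` is an in-memory stand-in for ZooKeeper, and `ToyNimbus` and its methods are hypothetical names introduced for illustration.

```python
class CoordinationStore:
    """In-memory stand-in for ZooKeeper; it outlives any single Nimbus."""

    def __init__(self):
        self.data = {}


class ToyNimbus:
    """Keeps no state of its own; everything is read from and written to the store."""

    def __init__(self, store):
        self.store = store

    def assign(self, task, supervisor):
        self.store.data[task] = supervisor

    def assignments(self):
        return dict(self.store.data)


store = CoordinationStore()
nimbus = ToyNimbus(store)
nimbus.assign("bolt-1", "supervisor-a")

# Simulate a crash followed by a restart from a monitoring tool: the old
# object is discarded and a fresh one is created against the same store.
del nimbus
restarted = ToyNimbus(store)
restarted.assign("bolt-2", "supervisor-b")
print(restarted.assignments())
```

The restarted instance sees the assignment made before the crash because that information was never held in the process that failed.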

Apache Storm also has an advanced topology called Trident Topology, which maintains state and provides a high-level API similar to Pig. We will discuss all these features in the coming chapters.







