This is a rack-aware tool for assigning Kafka partitions to brokers that minimizes data movement. It also includes the ability to inspect the current live brokers in the cluster and the current partition assignment.
**Using this tool will greatly simplify operations like decommissioning a broker, adding a new broker, or replacing a broker.**
# Why is this necessary?
Kafka's built-in algorithm is easy to use and monitor, but it does not take existing partition-to-node assignments into account. The operator must instead either move entire topics across brokers or work out by hand which partitions of existing topics to move. Either approach is extremely disruptive.
This tool _minimizes_ the number of partitions already assigned that need to leave a given node, while ensuring that each broker is responsible for a similar number of partitions. This enables use cases like node replacement, in which we would like to bring up a broker that is responsible for the same data as a misbehaving broker that it is replacing.
# How does this work?
This tool uses a strategy that behaves similarly to [Apache Helix](http://helix.apache.org)'s auto-rebalancing algorithm. It first assigns as many already-assigned partitions back to nodes as it can (while ensuring that no node is overloaded), and then evenly assigns all other partitions such that every node eventually ends up responsible for roughly the same number of partitions.
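The two passes described above can be illustrated with a minimal sketch. This is not the tool's actual implementation: it assumes one replica per partition, ignores rack awareness, and caps each node at `ceil(partitions / nodes)`. The name `sticky_assign` and its arguments are hypothetical.

```python
import math


def sticky_assign(current, live_nodes):
    """Illustrative two-pass assignment (hypothetical helper, not the tool's code).

    current:    dict mapping partition id -> node id that owns it today
    live_nodes: list of node ids that may own partitions afterwards
    """
    partitions = sorted(current)
    # Assumed overload limit: no node may hold more than this many partitions.
    capacity = math.ceil(len(partitions) / len(live_nodes))
    load = {node: 0 for node in live_nodes}
    result = {}
    orphans = []

    # Pass 1: keep a partition where it is if its node is live and has room.
    for p in partitions:
        node = current[p]
        if node in load and load[node] < capacity:
            result[p] = node
            load[node] += 1
        else:
            orphans.append(p)

    # Pass 2: give each remaining partition to the least-loaded live node.
    for p in orphans:
        node = min(live_nodes, key=lambda n: (load[n], n))
        result[p] = node
        load[node] += 1

    return result


# Broker 3 is replaced by broker 4: only broker 3's partitions move.
before = {0: 1, 1: 1, 2: 2, 3: 2, 4: 3, 5: 3}
after = sticky_assign(before, [1, 2, 4])
moved = sorted(p for p in before if before[p] != after[p])
```

In this example only the partitions that lived on the departing broker end up in `moved`, which is the node-replacement behavior the text describes.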
The output JSON can then be fed into Kafka's partition reassignment command (`kafka-reassign-partitions.sh`). See the [Kafka operations documentation](http://kafka.apache.org/0100/ops.html#basic_ops_partitionassignment) for instructions.
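For reference, Kafka's reassignment command reads a JSON document with a `version` field and a `partitions` list of `topic` / `partition` / `replicas` entries. The sketch below builds such a document from an in-memory mapping; the function name `to_reassignment_json` and the input shape are assumptions made for illustration, not part of this tool.

```python
import json


def to_reassignment_json(assignment):
    """Render a mapping of (topic, partition) -> replica broker ids
    in the JSON layout that Kafka's reassignment command reads.
    (Hypothetical helper for illustration.)
    """
    entries = [
        {"topic": topic, "partition": partition, "replicas": list(replicas)}
        for (topic, partition), replicas in sorted(assignment.items())
    ]
    return json.dumps({"version": 1, "partitions": entries}, indent=2)


doc = to_reassignment_json({("events", 0): [1, 2], ("events", 1): [2, 4]})
parsed = json.loads(doc)
```

A file with this content is what gets passed to `kafka-reassign-partitions.sh` through its `--reassignment-json-file` option.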
### Example: reassign partitions to all but a few live hosts
This mode is useful for decommissioning or replacing a node. The partitions will be assigned to all live hosts, excluding the hosts that are specified.
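As a rough sketch of what exclusion means here: the target set is the live hosts minus the excluded ones, and any partition with a replica outside that set has to move. The helper names and data shapes below are hypothetical and only illustrate the idea.

```python
def target_hosts(live_hosts, excluded_hosts):
    """Live hosts that remain eligible to own partitions (hypothetical helper)."""
    excluded = set(excluded_hosts)
    return [h for h in live_hosts if h not in excluded]


def partitions_to_move(current, targets):
    """Partitions with at least one replica on a host outside the target set.

    current: dict mapping partition id -> list of replica host ids
    """
    allowed = set(targets)
    return sorted(
        p for p, replicas in current.items()
        if any(r not in allowed for r in replicas)
    )


# Decommissioning host 3 in a four-host cluster.
targets = target_hosts([1, 2, 3, 4], [3])
affected = partitions_to_move({0: [1, 2], 1: [2, 3], 2: [3, 4]}, targets)
```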
The output JSON can then be fed into Kafka's partition reassignment command (`kafka-reassign-partitions.sh`). See the [Kafka operations documentation](http://kafka.apache.org/0100/ops.html#basic_ops_partitionassignment) for instructions.
### Example: reassign partitions to specific hosts
Note that in this mode, every host that should own partitions must be specified, including hosts that already own partitions.
The output JSON can then be fed into Kafka's partition reassignment command (`kafka-reassign-partitions.sh`). See the [Kafka operations documentation](http://kafka.apache.org/0100/ops.html#basic_ops_partitionassignment) for instructions.