Skip to main content

Apache Kafka - Tools and Application

Apache Kafka - Tools

Apache kafka - tools


Kafka Tool packaged under “org.apache.kafka.tools.*. Tools are categorized into system tools and replication tools.

System Tools
System tools can be run from the command line using the run class script. The syntax is as follows −

bin/kafka-run-class.sh package.class - - options
Some of the system tools are mentioned below −

Kafka Migration Tool − This tool is used to migrate a broker from one version to an-other.

Mirror Maker − This tool is used to provide mirroring of one Kafka cluster to another.

Consumer Offset Checker − This tool displays Consumer Group, Topic, Partitions, Off-set, logSize, Owner for the specified set of Topics and Consumer Group.

Replication Tool

Kafka replication is a high level design tool. The purpose of adding replication tool is for stronger durability and higher availability. Some of the replication tools are mentioned below −

Create Topic Tool − This creates a topic with a default number of partitions, replication factor and uses Kafka's default scheme to do replica assignment.

List Topic Tool − This tool lists the information for a given list of topics. If no topics are provided in the command line, the tool queries Zookeeper to get all the topics and lists the information for them. The fields that the tool displays are topic name, partition, leader, replicas, isr.

Add Partition Tool − Creation of a topic, the number of partitions for topic has to be specified. Later on, more partitions may be needed for the topic, when the volume of the topic will increase. This tool helps to add more partitions for a specific topic and also allows manual replica assignment of the added partitions.

Apache Kafka - Applications

Kafka supports many of today's best industrial applications. We will provide a very brief overview of some of the most notable applications of Kafka in this chapter.

Twitter

Twitter is an online social networking service that provides a platform to send and receive user tweets. Registered users can read and post tweets, but unregistered users can only read tweets. Twitter uses Storm-Kafka as a part of their stream processing infrastructure.

LinkedIn

Apache Kafka is used at LinkedIn for activity stream data and operational metrics. Kafka messaging system helps LinkedIn with various products like LinkedIn Newsfeed, LinkedIn Today for online message consumption and in addition to offline analytics systems like Hadoop. Kafka’s strong durability is also one of the key factors in connection with LinkedIn.

Netflix

Netflix is an American multinational provider of on-demand Internet streaming media. Netflix uses Kafka for real-time monitoring and event processing.

Mozilla

Mozilla is a free-software community, created in 1998 by members of Netscape. Kafka will soon be replacing a part of Mozilla current production system to collect performance and usage data from the end-user’s browser for projects like Telemetry, Test Pilot, etc.

Oracle

Oracle provides native connectivity to Kafka from its Enterprise Service Bus product called OSB (Oracle Service Bus) which allows developers to leverage OSB built-in mediation capabilities to implement staged data pipelines.

Comments

Popular posts from this blog

JavaScript Array Methods

JavaScript Arrays JavaScript arrays are used to store multiple values in a single variable. Displaying Arrays In this tutorial we will use a script to display arrays inside a <p> element with id="demo": Example < p  id= "demo" > < /p > < script > var cars = ["Saab", "Volvo", "BMW"]; document.getElementById("demo").innerHTML = cars; < /script > The first line (in the script) creates an array named cars. The second line "finds" the element with id="demo", and "displays" the array in the "innerHTML" of it. Example var cars = ["Saab", "Volvo", "BMW"]; Spaces and line breaks are not important. A declaration can span multiple lines: Example var cars = [     "Saab",     "Volvo",     "BMW" ]; Never put a comma after the last element (like &