Apache Kafka is a distributed publish-subscribe messaging system that exposes many configurable parameters. It was originally developed at LinkedIn and later became part of the Apache project.
Kafka Broker Configuration
Kafka is fast, scalable, durable, and fault-tolerant by design. Here we will discuss some key Apache Kafka broker configuration parameters.
A producer can be any application that can publish messages on a topic.
A consumer can be any application that subscribes to a topic and consumes its messages.
Topics are broken up into ordered commit logs called partitions.
A Kafka cluster is a set of servers, each of which is called a broker.
A topic is a category or feed name to which records are published.
ZooKeeper manages and coordinates the Kafka brokers.
1. Zookeeper.Connect
This parameter takes the ZooKeeper connection string. The connection string is simply a hostname with a port number (or a comma-separated list of them). As we know, Kafka uses ZooKeeper for various coordination purposes, so every broker must know ZooKeeper's address. The zookeeper.connect parameter is also necessary to form a cluster.
When all brokers are running on different systems how do they know about each other?
If they don't know about each other, they are not part of the cluster. ZooKeeper is the connecting link that lets all the brokers form a cluster.
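As a sketch, this setting goes in each broker's server.properties file (the hostnames below are placeholders):

```properties
# server.properties — ZooKeeper connection string, a comma-separated list of host:port pairs
zookeeper.connect=zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181
```

Pointing all brokers at the same ZooKeeper ensemble is what makes them members of the same cluster.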
2. Delete.Topic.Enable
If users want to delete a topic, they can do so with a topic management tool. By default, however, topic deletion is not allowed: the default value of this parameter is false, which is reasonable protection for a production environment.
In a development or testing environment, deleting topics is often convenient, so users who want to delete a topic have to set this parameter to true.
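A minimal sketch of the setting in server.properties, as one might use it in a development environment:

```properties
# server.properties — permit topic deletion (the default is false)
delete.topic.enable=true
```

With this enabled, a topic can then be removed with the bundled topic management tool, for example `bin/kafka-topics.sh --bootstrap-server localhost:9092 --delete --topic my-topic` (the topic name here is a placeholder).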
3. Auto.Create.Topics.Enable
If a producer starts sending messages to a non-existent topic, Kafka will create the topic automatically and accept the data. This behavior is convenient in a development environment, but in production users may want a more controlled approach.
Users can set this parameter to false, and Kafka will stop creating topics automatically. They can still create topics manually using a topic management tool.
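In server.properties, the controlled production setup sketched above might look like this:

```properties
# server.properties — disable automatic topic creation
auto.create.topics.enable=false
```

Topics must then be created explicitly, for example with `bin/kafka-topics.sh --bootstrap-server localhost:9092 --create --topic my-topic --partitions 3 --replication-factor 3` (the topic name and counts are illustrative).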
4. Default.Replication.Factor & Num.Partitions
These two parameters are quite straightforward: the default value for both of them is one. They take effect when automatic topic creation is enabled. So, when Kafka creates a topic automatically, the new topic will have only one partition and a single replica. Users who want other values can change the defaults.
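For example, the defaults could be raised in server.properties as follows (the values shown are illustrative, not recommendations):

```properties
# server.properties — defaults applied to auto-created topics
num.partitions=3
default.replication.factor=3
```

Note that a replication factor cannot exceed the number of brokers in the cluster.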
5. Log.Retention.Ms & Log.Retention.Bytes
Kafka will not retain the data you send it forever. Kafka is not a database; we cannot send data to it for storage and query it later. It is a message broker: it should deliver the data to consumers and then clean it up. There is no reason to retain messages longer than needed. Kafka provides two options for configuring the retention period.
The default option is retention by time, and the default retention period is seven days. So, in this case, Kafka will clean up all the messages older than seven days.
To change the duration we have to specify a value for log.retention.ms configuration. Kafka also provides another option to define this retention period.
Users can also specify retention by size using the second parameter, log.retention.bytes; note that this size applies per partition. So, if a user sets log.retention.bytes to 1 GB, Kafka will trigger a cleanup when a partition reaches 1 GB. If both configurations are specified, cleanup starts when either criterion is met.
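Putting both retention options together in server.properties might look like this (the size limit is illustrative; both values are expressed in the units the parameters expect):

```properties
# server.properties — time-based and size-based retention
log.retention.ms=604800000      # 7 days, the default period, expressed in milliseconds
log.retention.bytes=1073741824  # 1 GB per partition
```

Since both are set here, whichever limit a partition hits first will trigger the cleanup.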
Author: TCF Editorial
Copyright The Cloudflare.