Kafka Consumer

Kafka Consumer

The Kafka Consumer is able to consume messages from one or more Kafka topics. Kafka topics are the main way in which Kafka organizes data. Each topic is chopped up in separate partitions which will contain part of the messages in the topic. Some of the major features of Kafka are its fault tolerance and throughput. This is achieved by replicating these topic partitions across multiple servers (called Kafka brokers).

Note that the term "consume" is used loosely here because the consumer will never actually remove messages from a Kafka cluster. Although a Kafka cluster can dispose of messages through settings related to log compaction and retention, this will not be triggered by a consumer. Instead the consumer keeps track of an offset per topic. "Consuming a message" then means raising the offset by one. The current offset state for this and all other consumers is tracked within the Kafka cluster's "__consumer_offsets" topic.

Kafka consumers can be organized in groups that are responsible for consuming one or more topics once. Each group gets an internal identifier, the Group ID. Since Kafka Consumers do not actually remove messages from topics a Kafka Consumer with a different Group ID can consume the same topic that might be consumed by another consumer with a different Group ID.

A Kafka message consists of a value, an optional key, a timestamp and a set of headers. Most important of these is the value of the message which is the payload. Kafka stores this payload (as well as the message key and header values) as binary data making no assumptions about the structure.

Configuring a Kafka Consumer requires a list of bootstrap servers. These are the addresses of Kafka brokers in the cluster which the Kafka Consumer will contact during startup.

There is also support for further customization of the Kafka Consumer through custom properties. See the documentation on the official Kafka site for more information, https://kafka.apache.org/documentation/#consumerapi. Note that custom properties should be used with care because they will overwrite other behavior.

Limitations of ConnectPlaza's Kafka Consumer:

  • The Kafka Consumer can only deal with textual data. Although we do support specifying encodings.
  • The Kafka Consumer uses an auto commit strategy for offsets. We currently do not support more advanced commit strategies.
  • There is no support for transport security, authentication or authorization.

In the table below, you will find an explanation of these properties. All attributes with a ‘*’ are mandatory.

Attribute Description
Consumer autostart Consumer will be started at startup of interface.
MessagePart Out Name of the MessagePart in the outgoing ConnectMessage.
Topic Specification Type The manner in which to define the set of topics this consumer will subscribe to.
Topics Only available if Topic Specification Type is set to LIST. A comma separated list of Kafka topics to listen to.
Topic Pattern Only available if Topic Specification Type is set to PATTERN. A Java regex specifying the topics to listen to.
Group ID* The Kafka consumer group id.
Kafka Bootstrap Servers* A comma-separated list of Kafka brokers.
Polling Rate The polling rate in milliseconds.
Source Encoding The source encoding used when reading the value, key and headers of a Kafka message.
Mapped Headers A comma-separated list of headers that are mapped from the Kafka message to the ConnectMessage.
Auto Offset Reset The offset to use if there exists no initial offset for this consumer group and a specific topic: EARLIEST, reset to the earliest offset; LATEST, reset to the latest offset; NONE, throw exception.
Allow Multi Fetch Allow the consumer to fetch multiple records at once.
Max Poll Records Only available if Allow Multi Fetch is set to true. The maximum number of records returned in a single poll.
Kafka Consumer Properties Filename The name of the Java properties file containing additional config for the Kafka Consumer.