Wednesday, July 24, 2024

Kafka Metrics to Monitor

Kafka metrics can be broken down into three categories:

  • Kafka server (broker) metrics
  • Producer metrics
  • Consumer metrics

Because Kafka relies on ZooKeeper to maintain state (unless the cluster runs in KRaft mode), it’s also important to monitor ZooKeeper.

Broker Metrics

Broker metrics can be broken down into three classes:
  • Kafka-emitted metrics
  • Host-level metrics
  • JVM garbage collection metrics

Kafka-Emitted Metrics

Type                    Description
Resource: Availability  Number of under-replicated partitions
Resource: Availability  Number of offline partitions
Performance             Total time (in ms) to serve the specified request (Produce/Fetch)
Throughput              Aggregate incoming/outgoing byte rate
Throughput              Number of (producer|consumer|follower) requests per second

  • Metric to watch: UnderReplicatedPartitions
  • Metric to alert on: OfflinePartitionsCount (controller only)
  • Metric to watch: TotalTimeMs
  • Metric to watch: RequestsPerSec
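
The watch/alert split above can be sketched as a small check over one broker's metric snapshot. This is a minimal sketch, not a definitive implementation: the function name `evaluate_broker`, the snapshot dictionary, and the message strings are all hypothetical, and the snapshot is assumed to have been collected elsewhere (for example over JMX).

```python
def evaluate_broker(snapshot, is_controller=False):
    """Split one broker's metrics into (alerts, watch) message lists.

    `snapshot` is a hypothetical dict of metric name -> current value.
    """
    alerts, watch = [], []

    # Under-replicated partitions are worth watching on every broker.
    urp = snapshot.get("UnderReplicatedPartitions", 0)
    if urp > 0:
        watch.append(f"{urp} under-replicated partition(s)")

    # OfflinePartitionsCount is only evaluated on the controller.
    if is_controller:
        offline = snapshot.get("OfflinePartitionsCount", 0)
        if offline > 0:
            alerts.append(f"{offline} offline partition(s)")

    return alerts, watch


sample = {"UnderReplicatedPartitions": 3, "OfflinePartitionsCount": 1}
print(evaluate_broker(sample, is_controller=True))
```

Keeping the two lists separate makes it straightforward to page someone on alerts while only charting the watch items.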

Host-Level Metrics

Type         Description
Utilization  Disk usage
Utilization  CPU usage
Utilization  Network bytes sent/received
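
As an illustration of the disk usage metric, here is a minimal sketch that uses only the Python standard library. The helper names and the 85% threshold are assumptions chosen for the example, and in practice the path would be a Kafka log directory rather than the current directory.

```python
import shutil


def disk_used_fraction(path):
    """Return the used fraction (0.0 to 1.0) of the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return usage.used / usage.total


def disk_is_tight(path, threshold=0.85):
    """True when usage reaches the (assumed) warning threshold."""
    return disk_used_fraction(path) >= threshold


print(f"{disk_used_fraction('.'):.1%} used, tight={disk_is_tight('.')}")
```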

JVM-Level Metrics

Type         Description
Utilization  Total number of GC runs
Utilization  Total time spent in GC
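
Both GC metrics are cumulative counters, so they are usually more useful as a delta between two samples. The sketch below assumes two samples taken some time apart; the `GcSample` type and its field names are hypothetical, and the numbers are made up for illustration.

```python
from typing import NamedTuple


class GcSample(NamedTuple):
    timestamp_ms: int   # wall-clock time when the sample was taken
    gc_count: int       # cumulative number of GC runs
    gc_time_ms: int     # cumulative time spent in GC


def gc_delta(prev, curr):
    """Summarize GC activity between two samples."""
    elapsed = curr.timestamp_ms - prev.timestamp_ms
    runs = curr.gc_count - prev.gc_count
    spent = curr.gc_time_ms - prev.gc_time_ms
    return {
        "runs": runs,
        # Share of wall-clock time the JVM spent collecting.
        "gc_fraction": spent / elapsed if elapsed > 0 else 0.0,
        # Average duration of one collection in this window.
        "avg_pause_ms": spent / runs if runs > 0 else 0.0,
    }


prev = GcSample(timestamp_ms=0, gc_count=100, gc_time_ms=2000)
curr = GcSample(timestamp_ms=60_000, gc_count=112, gc_time_ms=2600)
print(gc_delta(prev, curr))
```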

Producer Metrics

JMX attribute        Description                                             Type
response-rate        Average number of responses received per second         Throughput
request-rate         Average number of requests sent per second              Throughput
request-latency-avg  Average request latency (in ms)                         Throughput
outgoing-byte-rate   Average number of outgoing bytes per second             Throughput
batch-size-avg       Average number of bytes sent per partition per request  Throughput
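
The producer rates and latency can be combined into a couple of derived figures. The sketch below is only an approximation under stated assumptions: it applies Little's law (average in flight = arrival rate × average time in system) to estimate how many requests are in flight, and it divides byte rate by request rate to get an average request size. The function name and the sample values are hypothetical.

```python
def producer_derived(metrics):
    """Derive rough figures from a dict keyed by producer JMX attribute name."""
    request_rate = metrics["request-rate"]            # requests per second
    latency_s = metrics["request-latency-avg"] / 1000  # ms -> seconds
    byte_rate = metrics["outgoing-byte-rate"]          # bytes per second

    return {
        # Little's law estimate of requests awaiting a response.
        "avg_in_flight": request_rate * latency_s,
        # Average bytes carried by one request.
        "bytes_per_request": byte_rate / request_rate if request_rate > 0 else 0.0,
    }


sample = {
    "request-rate": 200.0,
    "request-latency-avg": 25.0,
    "outgoing-byte-rate": 1_000_000.0,
}
print(producer_derived(sample))
```

A rising in-flight estimate with a flat request rate suggests latency is growing on the broker side rather than load increasing on the producer side.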

Consumer Metrics

JMX attribute          Description                                                                             Type
bytes-consumed-rate    Average number of bytes consumed per second for a specific topic or across all topics    Throughput
records-consumed-rate  Average number of records consumed per second for a specific topic or across all topics  Throughput
fetch-rate             Number of fetch requests per second from the consumer                                    Throughput
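
Dividing the three consumer rates by one another gives two ratios that are easier to reason about than the raw rates: average record size and average records per fetch. This is a minimal sketch with a hypothetical function name and made-up sample values.

```python
def consumer_ratios(metrics):
    """Compute per-record and per-fetch averages from consumer rate metrics."""
    byte_rate = metrics["bytes-consumed-rate"]      # bytes per second
    record_rate = metrics["records-consumed-rate"]  # records per second
    fetch_rate = metrics["fetch-rate"]              # fetch requests per second

    return {
        "avg_record_bytes": byte_rate / record_rate if record_rate > 0 else 0.0,
        "records_per_fetch": record_rate / fetch_rate if fetch_rate > 0 else 0.0,
    }


sample = {
    "bytes-consumed-rate": 2_000_000.0,
    "records-consumed-rate": 4_000.0,
    "fetch-rate": 20.0,
}
print(consumer_ratios(sample))
```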

ZooKeeper Metrics

Name                   Description                                       Type
outstanding_requests   Number of requests queued                         Saturation
avg_latency            Average time taken to respond to client requests  Throughput
num_alive_connections  Number of clients connected to ZooKeeper          Availability
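
ZooKeeper reports these values through its `mntr` four-letter command, as tab-separated lines whose keys carry a `zk_` prefix. The sketch below parses such output and keeps only the three metrics listed above. The sample text is made up for illustration, and fetching the output from a live server is left out on purpose.

```python
WATCHED = ("outstanding_requests", "avg_latency", "num_alive_connections")

# Illustrative sample only; real output contains many more keys.
SAMPLE_MNTR = (
    "zk_server_state\tfollower\n"
    "zk_avg_latency\t1\n"
    "zk_outstanding_requests\t0\n"
    "zk_num_alive_connections\t12\n"
)


def parse_mntr(text):
    """Extract the watched metrics from `mntr` output as floats."""
    metrics = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        if not sep:
            continue  # skip lines that are not key<TAB>value
        name = key[3:] if key.startswith("zk_") else key
        if name in WATCHED:
            metrics[name] = float(value)
    return metrics


print(parse_mntr(SAMPLE_MNTR))
```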



