Distributed Systems · 12 lessons · 20 quiz questions

Apache Kafka

Master Apache Kafka for software engineering interviews.

What You Will Learn

  • Kafka Architecture — Brokers, Topics, Partitions, Offsets
  • Producers Deep Dive — Acks, Idempotence, Batching, Compression
  • Consumers & Consumer Groups — Rebalancing, Offsets, Poll Loop
  • Partitioning & Message Ordering — Keys, Custom Partitioners, Guarantees
  • Replication & Fault Tolerance — ISR, Leader Election, min.insync.replicas
  • Exactly-Once Semantics — Idempotent Producer, Transactions, Read-Committed Consumer
  • Kafka Streams
  • Kafka Connect & Schema Registry
  • Performance Tuning
  • Kafka Security
  • Kafka in Microservices
  • Production Operations

Overview

Session 1: Kafka Architecture

Apache Kafka is a distributed, fault-tolerant, high-throughput event streaming platform. Before writing a single line of producer or consumer code, you must internalize the physical and logical layout of a Kafka cluster, because every configuration decision flows from this mental model.

The Cluster

A Kafka cluster is a group of brokers: individual server processes, each with a unique integer ID. One broker is elected the Controller, responsible for administrative duties: partition leader elections, topic creation and deletion, and cluster metadata propagation. If the Controller broker dies, another is automatically elected.

Topics and Partitions

A topic is a named, durable log of records, conceptually similar to a database table, but append-only and replayable. Every topic is divided into one or more partitions. A partition is the fundamental unit of parallelism and ordering in Kafka.

Each record in a partition receives a monotonically increasing integer called an offset, starting at 0. Offsets uniquely identify a record within a partition. Consumers track their position by committing offsets; by convention the committed offset is that of the next record to read, i.e. the last processed offset plus one.

Kafka guarantees ordering within a partition but not across partitions. This is a critical design constraint: if you need all events for a given entity (e.g., a user) to be processed in order, all those events must land in the same partition.

Replication

Each partition has a configurable replication factor (typically 3). One replica is the Leader; by default it handles all producer writes and consumer reads. The other replicas are Followers, which fetch data from the leader asynchronously. The set of replicas that are fully caught up, the leader included, is called the ISR (In-Sync Replicas).

ZooKeeper vs KRaft

Historically, Kafka depended on Apache ZooKeeper for cluster coordination: storing broker registrations, topic metadata, and controller election state. ZooKeeper is a separate distributed system with its own operational overhead.

Kafka 2.8 introduced KRaft (Kafka Raft) as an early-access feature: a built-in consensus mechanism that eliminates the ZooKeeper dependency entirely. In KRaft mode, a quorum of controller nodes uses the Raft protocol to manage cluster metadata directly. KRaft was declared production-ready for new clusters in Kafka 3.3, ZooKeeper mode was deprecated in 3.5, and Kafka 4.0 removed it.

Key KRaft benefits:

  • Simpler operations (one system instead of two)
  • Faster controller failover (milliseconds vs. seconds)
  • Supports larger clusters (millions of partitions)
  • Unified security model

Storage Layout

Each partition is stored as a directory of segment files on the broker's disk. A segment is a log file (binary record data) paired with an offset index file (a sparse index for fast seeks) and a time index file (timestamp index). When the active segment reaches its configured maximum size (default 1 GB) or maximum age, a new segment is opened. Old segments are deleted or compacted based on the retention policy.
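
The partition, offset, and segment ideas above can be sketched as a toy in-memory model. This is only an illustration, not how a broker is implemented: the class names (Segment, Partition, Topic) and the max_segment_bytes parameter are hypothetical, the 32-byte segment limit is deliberately tiny next to the 1 GB default, and zlib.crc32 is a stand-in for the hash that Kafka's own partitioner uses.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class Segment:
    base_offset: int                          # offset of the first record in this segment
    records: list = field(default_factory=list)
    size_bytes: int = 0


class Partition:
    """Append-only log split into segments (toy model, hypothetical names)."""

    def __init__(self, max_segment_bytes: int):
        self.max_segment_bytes = max_segment_bytes
        self.next_offset = 0                  # offsets start at 0 and only grow
        self.segments = [Segment(base_offset=0)]

    def append(self, value: bytes) -> int:
        active = self.segments[-1]
        # Roll to a new segment when the active one would exceed the size limit.
        if active.records and active.size_bytes + len(value) > self.max_segment_bytes:
            active = Segment(base_offset=self.next_offset)
            self.segments.append(active)
        active.records.append(value)
        active.size_bytes += len(value)
        offset = self.next_offset
        self.next_offset += 1
        return offset

    def read(self, offset: int) -> bytes:
        if not 0 <= offset < self.next_offset:
            raise KeyError(offset)
        # The record lives in the segment with the largest base offset <= offset.
        for seg in reversed(self.segments):
            if seg.base_offset <= offset:
                return seg.records[offset - seg.base_offset]
        raise KeyError(offset)


class Topic:
    def __init__(self, num_partitions: int, max_segment_bytes: int = 32):
        self.partitions = [Partition(max_segment_bytes) for _ in range(num_partitions)]

    def partition_for(self, key: bytes) -> int:
        # Assumption: crc32 is a stand-in hash, chosen only because it is deterministic.
        return zlib.crc32(key) % len(self.partitions)

    def send(self, key: bytes, value: bytes) -> tuple:
        # Same key -> same partition, so per-key ordering is preserved.
        p = self.partition_for(key)
        return p, self.partitions[p].append(value)
```

Sending the five values b"event-0" through b"event-4" with the key b"user-42" to a Topic(num_partitions=3) places them all in one partition at offsets 0 to 4. Each value is 7 bytes, so the fifth record no longer fits under the 32-byte limit and opens a second segment whose base offset is 4.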


Sample Quiz Questions

1. What is the maximum number of consumers in a consumer group that can actively consume from a topic with 6 partitions?

Understand · Difficulty: 2/5

2. A producer sends a message with acks=all to a topic with replication.factor=3 and min.insync.replicas=2. Two brokers fail simultaneously. What happens?

Apply · Difficulty: 4/5

3. Which Kafka configuration ensures a consumer reads only messages from committed transactions?

Remember · Difficulty: 3/5

+ 17 more questions available in the full app.
