20 October 2020

#Apache_Kafka

Key Concepts


S.No | Topic | Sub-Topics
1 | Introduction to Kafka | Kafka overview, history, features, use cases, architecture basics
2 | Kafka Components | Broker, Topic, Partition, Producer, Consumer
3 | Kafka Cluster Setup | Single-node setup, multi-node setup, configuration, Zookeeper, directories
4 | Kafka Topics | Create topics, partitions, replication, retention policy, configurations
5 | Producers | Producer API, sending messages, key/value, partitioning, batching
6 | Consumers | Consumer API, subscribing, polling, offsets, consumer groups
7 | Kafka Partitions & Offsets | Partitioning strategy, offset management, auto commit, manual commit, rebalance
8 | Message Serialization | String, JSON, Avro, Protobuf, Schema Registry
9 | Kafka Configuration | Broker configs, producer configs, consumer configs, tuning, environment variables
10 | Kafka Logs & Storage | Log segments, retention, compaction, file structure, cleanup policy
11 | Kafka Reliability | Replication factor, ISR, acknowledgments, min.insync.replicas, failover
12 | Kafka Security Basics | SSL encryption, SASL authentication, ACLs, authentication mechanisms, authorization
13 | Kafka Monitoring | JMX metrics, Kafka Manager, Cruise Control, Grafana, Prometheus
14 | Kafka Streams Basics | KStream, KTable, topology, stream processing, stateless operations
15 | Kafka Streams Advanced | Stateful operations, windowing, joins, aggregations, materialized views
16 | Kafka Connect Basics | Source connectors, sink connectors, connector configuration, tasks, offsets
17 | Kafka Connect Advanced | Custom connectors, transformations, error handling, scaling, monitoring connectors
18 | Kafka Transactions | Idempotent producer, exactly-once semantics, transactional producer, transaction coordinator, fencing
19 | Kafka Consumer Groups | Group management, partition assignment, load balancing, rebalance listeners, offset commits
20 | Kafka High Availability | Cluster replication, leader election, fault tolerance, Zookeeper failover, broker recovery
21 | Kafka Performance Tuning | Batch size, linger.ms, compression, fetch.min.bytes, replication tuning
22 | Kafka Message Ordering | Partition ordering, key-based partitioning, idempotent producers, transactions, guarantees
23 | Kafka Advanced Security | OAuth, Kerberos, TLS configuration, SASL mechanisms, access control policies
24 | Kafka Deployment | Docker setup, Kubernetes deployment, Helm charts, cloud setup, multi-cluster
25 | Kafka Backup & Recovery | MirrorMaker, snapshots, log retention, disaster recovery planning, cross-cluster replication
26 | Kafka Schema Management | Schema registry, Avro schemas, Protobuf schemas, versioning, compatibility
27 | Kafka Integration | Spring Kafka, Kafka with Spark, Kafka with Flink, Python Kafka client, REST proxy
28 | Kafka Testing | Unit testing, embedded Kafka, integration testing, Testcontainers, mocks
29 | Kafka Use Cases | Event streaming, real-time analytics, log aggregation, metrics, messaging pipelines
30 | End-to-End Project | Cluster setup, producer & consumer apps, stream processing, monitoring, deployment

Interview Questions

Basic Level

  1. What is Apache Kafka?
  2. What are the main features of Kafka?
  3. What are the key use cases of Kafka?
  4. Explain Kafka's publish-subscribe messaging system.
  5. What is a Kafka topic?
  6. What are Kafka partitions?
  7. What is a Kafka producer?
  8. What is a Kafka consumer?
  9. What is a Kafka broker?
  10. What is a Kafka cluster?
  11. What is a Kafka message?
  12. What is an offset in Kafka?
  13. What is a consumer group in Kafka?
  14. What is the difference between a Kafka queue and a topic?
  15. How do producers send data to Kafka?
  16. How do consumers read data from Kafka?
  17. What is Zookeeper in Kafka (pre-KRaft)?
  18. What is Kafka KRaft mode?
  19. What is the difference between Zookeeper and KRaft?
  20. What is Kafka retention policy?
  21. How do you configure message retention in Kafka?
  22. What are Kafka replicas?
  23. What is the difference between leader and follower replicas?
  24. What happens if a broker fails in Kafka?
  25. How do you create and list Kafka topics?

Intermediate Level

  1. How does Kafka guarantee message ordering?
  2. What is log compaction in Kafka?
  3. What is the difference between log compaction and log retention?
  4. How do you configure acknowledgments (acks) in Kafka?
  5. What is the difference between acks=0, acks=1, and acks=all?
  6. What is an idempotent producer in Kafka?
  7. How does Kafka ensure message durability?
  8. What are Kafka consumer offsets?
  9. How does Kafka track consumer offsets?
  10. What is the difference between earliest and latest offset reset?
  11. How do you commit offsets in Kafka?
  12. What is the difference between automatic and manual offset commit?
  13. What are Kafka serializers and deserializers?
  14. What is a Kafka partition key?
  15. How do you achieve message key-based ordering in Kafka?
  16. What is Kafka Streams?
  17. What is KSQL (ksqlDB)?
  18. What are Kafka connectors?
  19. What is Kafka Connect?
  20. What are the different modes of Kafka Connect (standalone vs distributed)?
  21. How do you scale Kafka consumers?
  22. What is consumer rebalancing in Kafka?
  23. What are rebalancing strategies in Kafka?
  24. What is the difference between at-most-once, at-least-once, and exactly-once delivery semantics?
  25. How does Kafka achieve exactly-once semantics?
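
As a study aid for the ordering questions above (partition keys and key-based ordering), here is a minimal sketch of the idea: records with the same key are routed to the same partition, and each partition is an append-only log, so per-key order is preserved. This is a toy in-memory model, not the Kafka client. The names `choose_partition`, `append_all`, and `NUM_PARTITIONS` are hypothetical, and `zlib.crc32` is a stand-in hash chosen only because it ships with Python; Kafka's Java client uses a murmur2 hash for keyed records.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical partition count for an illustrative topic


def choose_partition(key: bytes, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition index.

    Stand-in hash: crc32 is used here only because it is deterministic and
    in the standard library. It is not the hash Kafka itself uses.
    """
    return zlib.crc32(key) % num_partitions


def append_all(records):
    """Append (key, value) records to per-partition logs, in send order."""
    logs = defaultdict(list)
    for key, value in records:
        logs[choose_partition(key)].append((key, value))
    return logs


records = [
    (b"order-1", "created"),
    (b"order-2", "created"),
    (b"order-1", "paid"),
    (b"order-2", "cancelled"),
    (b"order-1", "shipped"),
]
logs = append_all(records)

# Each partition log keeps the events of a given key in the order sent.
for partition, log in sorted(logs.items()):
    print(partition, log)
```

Order is only guaranteed within a partition. Events for different keys may land in different partitions and carry no relative ordering, which is why the choice of key matters.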

Advanced Level

  1. Explain Kafka's architecture in detail.
  2. How does Kafka handle high throughput?
  3. What is the role of page cache in Kafka performance?
  4. How does Kafka achieve fault tolerance?
  5. What is ISR (In-Sync Replicas) in Kafka?
  6. How do leader elections work in Kafka?
  7. What is unclean leader election?
  8. What is the difference between min.insync.replicas and replication factor?
  9. What is rack awareness in Kafka?
  10. How do you monitor Kafka performance?
  11. What are key Kafka metrics to monitor?
  12. What is the Kafka controller?
  13. What happens when the Kafka controller fails?
  14. How does Kafka handle backpressure?
  15. What is a dead letter queue (DLQ) in Kafka?
  16. How do you implement retries in Kafka?
  17. What is Kafka's log cleaner thread, and what role does it play in log compaction?
  18. What is throttling in Kafka?
  19. How do you secure Kafka with SSL?
  20. How do you secure Kafka with SASL?
  21. What is the difference between SASL/PLAIN, SASL/SCRAM, and SASL/GSSAPI?
  22. How do you enable ACLs in Kafka?
  23. How do you handle schema evolution in Kafka?
  24. What is Confluent Schema Registry?
  25. What is the role of Avro, Protobuf, and JSON schemas in Kafka?
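
For the questions above on ISR and on min.insync.replicas versus replication factor, the following sketch models one rule: min.insync.replicas is enforced only for producers using acks=all, and such a write is rejected when the in-sync replica set is smaller than that minimum. The function `write_accepted` and its parameters are hypothetical names for illustration. It is a simplified model that ignores timeouts, retries, and unclean leader election.

```python
def write_accepted(acks: str, isr_size: int, min_insync_replicas: int) -> bool:
    """Return True if a produce request would be accepted in this toy model.

    Assumptions (simplified):
    - isr_size counts the leader itself, so isr_size < 1 means no leader.
    - min.insync.replicas is checked only when acks is "all".
    - With acks "0" or "1" the write is accepted whenever a leader exists.
    """
    if isr_size < 1:
        return False  # no leader available in this simplified model
    if acks == "all":
        return isr_size >= min_insync_replicas
    return True


# Replication factor 3, min.insync.replicas 2: one replica may fall behind.
print(write_accepted("all", isr_size=3, min_insync_replicas=2))
print(write_accepted("all", isr_size=2, min_insync_replicas=2))
print(write_accepted("all", isr_size=1, min_insync_replicas=2))
print(write_accepted("1", isr_size=1, min_insync_replicas=2))
```

The replication factor sets how many copies of a partition exist, while min.insync.replicas sets how many of them must be in sync before an acks=all write is acknowledged. Keeping the minimum below the replication factor leaves room for a broker to fail without blocking such writes.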

Expert Level

  1. What is Kafka KRaft mode and how does it replace Zookeeper?
  2. Explain Kafka's Raft protocol.
  3. What are the benefits of KRaft over Zookeeper?
  4. How do you migrate from Zookeeper-based Kafka to KRaft mode?
  5. How do you tune Kafka for high throughput?
  6. How do you tune Kafka for low latency?
  7. How do you size partitions in Kafka?
  8. What are the trade-offs between fewer vs more partitions?
  9. What is Kafka tiered storage?
  10. How do you implement geo-replication in Kafka?
  11. What is MirrorMaker in Kafka?
  12. What is MirrorMaker 2.0 and how does it work?
  13. What are Kafka quotas? How do you use them?
  14. How do you implement multi-tenancy in Kafka?
  15. What are Kafka transactions?
  16. How does Kafka implement exactly-once processing with transactions?
  17. What is the difference between Kafka Streams and Flink?
  18. How does Kafka Streams handle stateful operations?
  19. What is a state store in Kafka Streams?
  20. How do you achieve high availability in Kafka Streams?
  21. Compare Kafka with RabbitMQ, ActiveMQ, and Pulsar.
  22. Compare Kafka with Azure Event Hubs and AWS Kinesis.
  23. What are common Kafka anti-patterns?
  24. What are real-world best practices for running Kafka in production?
  25. How do you design a large-scale Kafka deployment for millions of messages per second?

