20 October 2020

#Apache_Kafka

Apache Kafka
Define the role of Kafka Streams API and Kafka Connector API.
Define consumer lag in Apache Kafka.
What are topics in Apache Kafka?
What are consumers in Apache Kafka?
What are producers in Apache Kafka?
What is a broker in Apache Kafka?
What is a consumer group in Apache Kafka?
What is meant by ZooKeeper in Apache Kafka?
What are some differences between Apache Kafka and Flume?
What is the maximum size of a message that can be received by Apache Kafka?
What is the role of the Partitioning Key?
What is the role of replicas in Apache Kafka?
What is the purpose of ISR in Apache Kafka?
What is meant by partition offset in Apache Kafka?
What is the importance of replication in Kafka?
What is the best way to start the Kafka server?
What is meant by multi-tenancy in Apache Kafka?
What method does Apache Kafka use to connect with clients and servers?
What is meant by the Replication Tool?
What are the differences between Redis and Kafka?
What is the optimal number of partitions for a topic?
What is the Kafka MirrorMaker?
What is the role of the Kafka Migration Tool?
What are name restrictions for Kafka topics?
What is the Confluent Replicator?
What is the command to start ZooKeeper?
What is meant by Kafka Connect?
What is the need for message compression in Apache Kafka?
What are some disadvantages of message compression in Apache Kafka?
What guarantees does Kafka provide?
What do you know about log compaction in Kafka?
What do you understand about quotas in Kafka?
What is meant by cluster id in Kafka?
What are the responsibilities of a Controller Broker in Kafka?
What causes OutOfMemoryException?
What is meant by the Kafka schema registry?
What can Kafka Monitoring be used to do?
What role does the Kafka consumer API and Kafka producer API play?
What is the best method to determine the number of topics in a single Kafka broker?
What is the ZooKeeper ensemble?
What are Znodes?
What are the types of Znodes?
What are ZooKeeper watches?
What is ZooKeeper quorum?
What are the benefits of a distributed application?
What are some disadvantages of a distributed application?
What is meant by the ZooKeeper Atomic Broadcast (ZAB) Protocol?
What are ZooKeeper barriers?
What is the daemon name for ZooKeeper?
What is Kafka’s producer acknowledgment? What are the various types of acknowledgment settings that Kafka provides?
What does it indicate if a replica stays out of ISR for a long time?
What is the method to create Kafka producer API?
What is the retention policy for Kafka records in a Kafka cluster?
What are the core APIs provided in Kafka platform?
What is the difference between Apache Kafka and Apache Storm?
What do you know about a partition key?
What is a way to balance load in Kafka when one server fails?
What is multi-tenancy?
What do you mean by Stream Processing in Kafka?
What ensures load balancing of the server in Kafka?
What roles do Replicas and the ISR play?
What is the way to send large messages with Kafka?
What benefits does Kafka provide that other messaging services such as JMS and RabbitMQ do not?
What is the main difference between Kafka and Flume?
What are Kafka Topics?
What is a real-world use case of Kafka that sets it apart from other messaging frameworks?
What are the main features of Kafka that make it suitable for data integration and real-time processing?
What are the three main system tools within Apache Kafka?
What is the maximum message size that can be handled and received by Apache Kafka?
What does it indicate if replica stays out of ISR for a long time?
What are the key components of Kafka?
What is the role of the ZooKeeper in Kafka?
What are the key benefits of using Storm for real-time processing?
What is a broker, and how does Kafka use brokers for communication?
What is ZeroMQ?
What happens if the preferred replica is not in the ISR?
What is a replica? What does it do?
What is the maximum size of a message that can be received by Kafka?
What is the Streams API?
What is Apache Kafka?
What is a consumer group?
What is a partition?
What are the main APIs of Kafka?
What are consumers or users?
What is the process for starting a Kafka server?
What can you do with Kafka?
What is the purpose of retention period in Kafka cluster?
What are the types of traditional message transfer methods?
What does ISR stand for in a Kafka environment?
What is Geo-Replication in Kafka?
What is the role of the Consumer API?
What is the role of the Connector API?
What is a data log?
What are the types of system tools?
What is the Replication Tool, and what are its types?
What is the importance of Java in Apache Kafka?
What are the guarantees provided by Kafka?
What is Kafka?
What are the various components in Kafka?
What are consumers or users in Kafka?
What is the concept of leader and follower in Kafka?
What are the main APIs of Kafka?
What is the traditional method of message transfer?
What are the benefits of Apache Kafka over the traditional technique?
What are broker configuration files?
What is Log Compaction?
What are the key features of Kafka?
What is a topic? How does Kafka use topics to communicate from the producer to the consumer?
What is a Partition?
What is a Partition offset?
What is Dumb Broker/Smart Producer vs Smart Broker/Dumb Consumer? What model does Apache Kafka follow?
What is meant by fault tolerance?
What is an offset in Kafka?
What are the different ways to commit an offset?
What is meant by Kafka producer Acknowledgement?
What are the different types of acknowledgment settings provided by Kafka?
What is Kafka and what are other alternatives to Kafka?
What are the different components of Kafka?
What is ZooKeeper in Kafka? Can we use Kafka without ZooKeeper?
What are the leader and follower in a Kafka environment?
Why is replication critical in a Kafka environment?
What are the main advantages of using Kafka?
What ensures load balancing in Kafka?
What is a Kafka cluster, and what are the key benefits of creating one?
What is the role of a consumer in Kafka?
What is the working principle of Kafka?
What are the key advantages of using Kafka?
What is the use case where Kafka doesn’t fit?
What is meant by Consumer Lag? How can you monitor it?
What is a producer in Kafka?
What are the different types of Kafka producer APIs?
What is Kafka MirrorMaker?
What is the core API in Kafka?
What is partition key in Kafka?
What does series in Kafka?
What is the difference between a shared message queue and traditional publisher-subscriber message queue?
What is a consumer group in Kafka?
What are the main features of Kafka that make it suitable for data integration and data processing in real-time?
What is the poll loop in Kafka?
What is an offset?
What are the leader and follower?
What is log anatomy?
What is the topic replication factor?
Name the various types of Kafka producer API.
Name the configuration file to be used to set up ZooKeeper properties in Kafka.
Explain partitions in Apache Kafka.
Explain the retention period in an Apache Kafka cluster.
Explain the roles of leader and follower in Apache Kafka.
Explain fault tolerance in Apache Kafka.
Explain the topic replication factor.
Explain the scalability of Apache Kafka.
Explain how topics can be added and removed.
Explain how topic configurations can be modified in Apache Kafka.
Explain message compression in Apache Kafka.
Explain producer batch in Apache Kafka.
Explain how Apache Kafka provides security.
Explain the graceful shutdown in Kafka.
Explain custom serialization and deserialization in Kafka.
What are Cages in ZooKeeper?
What is the CLI in ZooKeeper?
What is the role of the Streams API?
What is “Log Anatomy”?
What is geo-replication within Apache Kafka?
Explain the role of the offset.
Explain the role of the Kafka Producer API.
Explain how you can reduce churn in the ISR. When does a broker leave the ISR?
Explain the concept of Leader and Follower.
Explain Apache Kafka use cases.
Explain some real-time use cases of Kafka Streams.
Explain the steps for Kafka installation.
Why is Apache Kafka preferred over traditional messaging techniques?
Why is the Kafka broker said to be “dumb”?
Why do you think replication is considered critical in Kafka?
Why is Kafka preferred over traditional message transfer techniques?
Why is Kafka a significant technology to use?
Why are Replications critical in Kafka?
Why should we use an Apache Kafka cluster?
Why is replication required in Kafka?
Why do we need Kafka rather than other messaging services?
Which components are used for stream flow of data?
How are partitions distributed in an Apache Kafka cluster?
How is load balancing maintained in Kafka?
How long are messages retained in Apache Kafka?
How can Kafka be tuned for optimal performance?
How does one view a Kafka message?
How can all brokers available in a cluster be listed?
How can Apache Kafka be used with Python?
How can you list the topics being used in Apache Kafka?
How can load balancing be ensured in Apache Kafka when one Kafka server fails?
How can large messages be sent in Apache Kafka?
How can Kafka retention time be changed at runtime?
How can a cluster be expanded in Kafka?
How does Kafka ensure minimal data modification when data passes from the producer to the broker to the consumer?
How can you write data from Kafka to a database?
How can we create Znodes?
How can we remove Znodes?
How does Kafka perform better than RabbitMQ?
How do you get Kafka to perform in a FIFO manner?
How is it possible for a Kafka producer to maintain exactly-once semantics?
How can you create a Kafka topic?
How do you start a single Kafka Broker?
How to balance loads in Kafka when one server fails?
How to start a Kafka server?
How is Kafka used for stream processing?
How are Kafka Topic partitions distributed in a Kafka cluster?
How is Kafka used as a storage system?
How do you send messages to a Kafka topic using Kafka command line client?
How are the messages consumed by a consumer in Kafka?
How can you justify the Kafka architecture?
How can you get exactly-once messaging from Kafka during data production?
How does Kafka provide fault tolerance?
How does Kafka producer write data to a topic containing multiple partitions?
How to tune Kafka for optimal performance?
How do you define a Partitioning Key?
How does the process of assigning partitions to brokers work?
How to configure Kafka to ensure that events are stored reliably?
How to rebalance the Kafka cluster?
How to build a Spark streaming application that consumes data from Kafka?
How does Kafka communicate with clients and servers?
How can you configure the Log Cleaner?
How can you create a topic in Kafka?
How is the Kafka messaging system different from other messaging frameworks?
How does a producer work in Kafka?
How can Kafka producer maintain exactly once semantics?
How is Apache Kafka different from RabbitMQ?
How do we start the Kafka server?
How do we achieve FIFO behaviour in Kafka?
How do we send large messages with Kafka?
How does Kafka fit into a microservices architecture?
How is multi-tenancy achieved in Kafka?
How do we design consumer groups in Kafka for high throughput?
When does QueueFullException occur in the Producer API?
When does Kafka throw a BufferExhaustedException?
When do you call the cleanup method?
When does the QueueFullException occur inside the producer?
When not to use Apache Kafka?
Where is the meta-information about topics stored in the Kafka cluster?
Where else is the ZooKeeper used?
Where is the meta-information about topics stored in a Kafka cluster?
Where does Kafka maintain offset?
Difference - Partition vs replica
In the Producer, when does QueueFullException occur?
In a consumer group, what is the process of assigning a partition to a particular consumer?
Topic Content
Messaging System
  • Point to Point Messaging System
    • In a point-to-point system, messages are persisted in a queue
    • One or more consumers can consume the messages in the queue, but a particular message can be consumed by a maximum of one consumer only
    • Once a consumer reads a message in the queue, it disappears from that queue
    • The typical example of this system is an Order Processing System, where each order will be processed by one Order Processor, but Multiple Order Processors can work as well at the same time.
  • Publish-Subscribe Messaging System
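The point-to-point behaviour described above can be sketched with a toy in-memory queue. This is only an illustration, not a real message broker: the class `PointToPointQueue` and the processor names are hypothetical.

```python
from collections import deque


class PointToPointQueue:
    """Toy in-memory queue: each message is delivered to at most one consumer."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        # Removing the message on read is what makes it "disappear" from the queue.
        return self._messages.popleft() if self._messages else None


orders = PointToPointQueue()
for order_id in ("order-1", "order-2", "order-3"):
    orders.send(order_id)

# Two hypothetical order processors take turns pulling from the same queue.
processed = {"processor-a": [], "processor-b": []}
for name in ("processor-a", "processor-b", "processor-a"):
    processed[name].append(orders.receive())

print(processed)
```

Each order ends up with exactly one processor, and the queue is empty once every order has been read.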
Apache Kafka
  • Apache Kafka is a distributed streaming platform.
  • It is a publish-subscribe messaging system that lets applications and servers exchange data.
  • It was originally developed by LinkedIn and later donated to the Apache Software Foundation.
  • It is capable of handling millions of messages per second.
  • It works as a mediator between the source system and the target system. The source system (producer) sends its data to Apache Kafka, which decouples the two systems, and the target system (consumer) consumes the data from Kafka.
  • Apache Kafka is fault-tolerant. Consider a consumer that receives a message delivered by the producer but fails to process it, because of a backend database failure or a bug in the consumer code. In a system that removes a message once it has been delivered, the consumer could not consume that message again. Kafka retains the data, so the consumer can read it again and reprocess it.
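The reprocessing idea in the last bullet can be sketched as follows. This is a toy model under stated assumptions, not the Kafka client API: `RetainedLog`, `consume`, and the simulated one-time failure are all hypothetical names made up for illustration.

```python
class RetainedLog:
    """Toy append-only log: reading a record does not remove it."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)

    def read(self, offset):
        return self._records[offset] if offset < len(self._records) else None


def consume(log, handler, start_offset=0):
    """Process records in order; on failure, return the offset to resume from."""
    offset = start_offset
    while (record := log.read(offset)) is not None:
        try:
            handler(record)
        except RuntimeError:
            return offset  # not advanced, so the same record can be read again
        offset += 1
    return offset


log = RetainedLog()
for record in ("a", "b", "c"):
    log.append(record)

handled = []
failures = {"b": 1}  # assumption: "b" fails once, e.g. during a database outage


def handler(record):
    if failures.get(record, 0) > 0:
        failures[record] -= 1
        raise RuntimeError("backend unavailable")
    handled.append(record)


resume_at = consume(log, handler)                # stops at the failing record
final_offset = consume(log, handler, resume_at)  # re-reads it, then continues
print(resume_at, final_offset, handled)
```

Because the log keeps the record after the failed attempt, the second call picks up at the same position and nothing is lost.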
Streaming process
  • A streaming process is the processing of data across systems connected in parallel.
  • It lets different applications process data in parallel, so one record is processed without waiting for the output of the previous record.
  • It processes streams of records as soon as they occur.
  • It stores the streams of records in a fault-tolerant, durable way.
Core APIs
  • Producer API - This API allows an application to publish streams of records to one or more topics.
  • Consumer API - This API allows an application to subscribe to one or more topics and process the stream of records produced to them.
  • Streams API - This API allows an application to transform input streams into output streams. It lets an application act as a stream processor that consumes an input stream from one or more topics and produces an output stream to one or more output topics.
  • Connector API - This API builds and runs reusable producers and consumers that connect Kafka topics to existing data systems or applications.
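To make the producer, consumer, and stream-processor roles concrete, here is a toy in-memory sketch. It does not use any real Kafka client: the `topics` dictionary and the functions `produce`, `consume`, and `stream_process` are hypothetical stand-ins for the roles the APIs play.

```python
from collections import defaultdict

# Toy stand-in for a cluster: topic name -> list of records.
topics = defaultdict(list)


def produce(topic, record):
    """Producer role: publish a record to a topic."""
    topics[topic].append(record)


def consume(topic, offset=0):
    """Consumer role: read the records of a topic, starting at an offset."""
    return topics[topic][offset:]


def stream_process(input_topic, output_topic, transform):
    """Stream-processor role: read an input topic, transform, write an output topic."""
    for record in consume(input_topic):
        produce(output_topic, transform(record))


for word in ("kafka", "streams"):
    produce("words", word)

stream_process("words", "words-upper", str.upper)
print(consume("words-upper"))
```

The input topic is left unchanged; the processor only adds records to the output topic.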
Topics
  • A stream of messages belonging to a particular category is called a topic. Data is stored in topics.
  • Topic replication factor
Partition
  • A topic may have many partitions, so it can handle an arbitrary amount of data.
  • Partition offset - Each message within a partition has a unique sequence id called an offset.
  • Replicas of partition - Replicas are backups of a partition. Clients never read from or write to them directly; they are used to prevent data loss.
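A minimal sketch of partitions and offsets, assuming a toy `Topic` class (hypothetical, not part of any Kafka library) in which each partition is a plain list and the offset is simply the position of the message in that list:

```python
class Topic:
    """Toy topic: a fixed number of partitions, each an append-only list."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, partition, message):
        """Append to one partition and return the offset given to the message."""
        log = self.partitions[partition]
        log.append(message)
        return len(log) - 1


topic = Topic("events", num_partitions=2)
offsets = [
    topic.append(0, "a"),
    topic.append(0, "b"),
    topic.append(1, "c"),
]
print(offsets)
```

Note that offsets are counted per partition: the first message of partition 1 gets offset 0 even though partition 0 already holds two messages.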
Brokers
  • Brokers are simple systems responsible for maintaining the published data.
  • Each broker may have zero or more partitions per topic.
  • For example, if a topic has N partitions and the cluster has N brokers, each broker will have one partition.
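The partition counts above can be illustrated with a simple round-robin spread. This is a simplification for illustration only; the function `assign_partitions` and the broker names are hypothetical, and real clusters also take replicas into account when placing partitions.

```python
def assign_partitions(num_partitions, brokers):
    """Spread partition ids over brokers round-robin (a simplification)."""
    assignment = {broker: [] for broker in brokers}
    for partition in range(num_partitions):
        assignment[brokers[partition % len(brokers)]].append(partition)
    return assignment


brokers = ["broker-0", "broker-1", "broker-2"]
equal = assign_partitions(3, brokers)    # N partitions, N brokers
fewer = assign_partitions(2, brokers)    # fewer partitions than brokers
print(equal)
print(fewer)
```

With three partitions and three brokers each broker holds one partition; with only two partitions, one broker holds none for this topic.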
Kafka Cluster
  • A Kafka deployment with more than one broker is called a Kafka cluster.
  • A Kafka cluster can be expanded without downtime.
  • These clusters are used to manage the persistence and replication of message data.
Producers
  • Producers publish messages to one or more Kafka topics.
  • They send data to Kafka brokers.
  • Every time a producer publishes a message to a broker, the broker simply appends the message to the last segment file.
  • Producers can also send messages to a partition of their choice.
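One way to picture how a producer picks a partition is the sketch below. It is an assumption-laden illustration: `choose_partition` is a hypothetical helper, and CRC32 is used only because it is a stable hash in the Python standard library, not because it is the hash that Kafka clients use.

```python
import zlib


def choose_partition(num_partitions, key=None, partition=None):
    """Return the target partition for a message (toy sketch)."""
    if partition is not None:
        # The producer names the partition explicitly.
        if not 0 <= partition < num_partitions:
            raise ValueError("partition out of range")
        return partition
    if key is None:
        raise ValueError("this sketch needs either a key or a partition")
    # Assumption: any stable hash works for illustration; CRC32 is used here.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


explicit = choose_partition(4, partition=2)
by_key_first = choose_partition(4, key="user-42")
by_key_again = choose_partition(4, key="user-42")
print(explicit, by_key_first == by_key_again)
```

Hashing the key means that messages with the same key always land in the same partition, which is what keeps them in order relative to each other.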
Consumer
  • Consumers read data from brokers.
  • They subscribe to one or more topics and consume published messages by pulling data from the brokers.
  • Consumer Group
Leader
  • The leader is the node responsible for all reads and writes for a given partition.
  • Every partition has one server acting as a leader.
Follower
  • A node that follows the leader's instructions is called a follower.
  • If the leader fails, one of the followers will automatically become the new leader.
  • A follower acts as a normal consumer: it pulls messages and updates its own data store.
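The leader and follower roles can be sketched with a toy failover model. This is a deliberate simplification: the `Partition` class is hypothetical, and promoting the first remaining replica is an assumption made for illustration rather than a description of how a real cluster elects a leader.

```python
class Partition:
    """Toy partition with one leader and any number of followers."""

    def __init__(self, replicas):
        self.replicas = list(replicas)   # broker ids holding a copy
        self.leader = self.replicas[0]   # assumption: first replica leads

    @property
    def followers(self):
        return [broker for broker in self.replicas if broker != self.leader]

    def fail(self, broker):
        """Remove a failed broker; promote a follower if it was the leader."""
        self.replicas.remove(broker)
        if broker == self.leader:
            if not self.replicas:
                raise RuntimeError("no replica left to take over")
            self.leader = self.replicas[0]


partition = Partition(["broker-1", "broker-2", "broker-3"])
partition.fail("broker-1")
print(partition.leader, partition.followers)
```

After the original leader fails, one of the former followers serves reads and writes and the remaining replica keeps following it.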
ZooKeeper
