| S.No | Topic | Sub-Topics |
|------|-------|------------|
| 1 | Spark Streaming | What is Spark Streaming, Real-time data, Micro-batch processing, Advantages, Use cases |
| 2 | Spark Streaming Architecture | Driver, Receiver, DStream, Scheduler, Executors |
| 3 | DStream Basics | Definition, Creation, Operations, RDDs, Transformations |
| 4 | Creating DStreams | From sources: Kafka, Flume, TCP sockets, File streams, Custom receivers |
| 5 | Transformations on DStreams | map(), flatMap(), filter(), reduceByKey(), window() |
| 6 | Window Operations | window(), slideDuration, reduceByKeyAndWindow(), countByValueAndWindow(), Examples |
| 7 | Stateful Transformations | updateStateByKey(), mapWithState(), Example, Use cases, Performance |
| 8 | Actions on DStreams | print(), count(), saveAsTextFiles(), foreachRDD(), Examples |
| 9 | Data Sources Integration | Kafka, Flume, HDFS, Socket, Custom sources |
| 10 | Sinks / Output Operations | print(), saveAsTextFiles(), saveAsObjectFiles(), foreachRDD(), write to DB |
| 11 | Checkpointing | Definition, Directory setup, Purpose, Examples, Fault tolerance |
| 12 | Receiver Types | Reliable receiver, Unreliable receiver, Custom receiver, Receiver lifecycle, Examples |
| 13 | Transformations: map vs flatMap | map(), flatMap(), Use cases, Examples, Differences |
| 14 | Transformations: reduceByKey | reduceByKey(), reduceByKeyAndWindow(), Examples, Use cases, Performance |
| 15 | Transformations: join in streaming | join(), leftOuterJoin(), rightOuterJoin(), fullOuterJoin(), Example |
| 16 | Transformations: union & transform | union(), transform(), Example, Use cases, Combining multiple streams |
| 17 | Handling Late Data | Watermarks, Window operations, State management, Dropping late records, Examples |
| 18 | Kafka Integration | DirectStream vs ReceiverStream, Kafka parameters, Offset management, Example, Best practices |
| 19 | Flume Integration | Spark Streaming + Flume, Push vs Pull, Receiver setup, Example, Best practices |
| 20 | File Stream Source | HDFS integration, Local files, Monitoring new files, Examples, Performance considerations |
| 21 | Structured Streaming Introduction | Differences from DStream, High-level API, DataFrames & Datasets, Fault-tolerance, Example |
| 22 | Structured Streaming Sources | Kafka, File, Socket, Rate source, Custom sources |
| 23 | Structured Streaming Sinks | Console, File, Kafka, foreachBatch(), Memory |
| 24 | Event Time & Watermarks | Definition, Handling late data, withWatermark(), Examples, Use cases |
| 25 | Window Operations in Structured Streaming | window(), slideDuration, groupBy window(), Examples, Performance tips |
| 26 | Stateful Operations in Structured Streaming | mapGroupsWithState(), flatMapGroupsWithState(), Examples, Use cases, Performance |
| 27 | Performance Tuning | Batch interval, Partitioning, Backpressure, Checkpointing, Resource tuning |
| 28 | Fault Tolerance & Reliability | Checkpointing, Write-ahead logs, Replay, Receiver reliability, Structured Streaming guarantees |
| 29 | Monitoring & Debugging | Spark UI, Streaming metrics, Logs, Executor monitoring, Performance tuning |
| 30 | Real-world Examples | Log analytics, IoT data processing, Real-time dashboards, Clickstream analysis, Recommendations |