26 January 2026
19 January 2026
#Amazon EC2
Last updated - V7 (19-Jan-2026)
Key Concepts
| S.No | Topic | Sub-Topics |
|---|---|---|
Interview question
Related Topics
#Amazon RDS
Last updated - V7 (19-Jan-2026)
Key Concepts
| S.No | Topic | Sub-Topics |
|---|---|---|
Interview question
Related Topics
#Amazon EMR
Last updated - V7 (19-Jan-2026)
Key Concepts
| S.No | Topic | Sub-Topics |
|---|---|---|
Interview question
Related Topics
#AWS Lambda
Last updated - V7 (19-Jan-2026)
Key Concepts
| S.No | Topic | Sub-Topics |
|---|---|---|
Interview question
Related Topics
#Amazon EKS / ECS
Last updated - V7 (19-Jan-2026)
Key Concepts
| S.No | Topic | Sub-Topics |
|---|---|---|
Interview question
Related Topics
#Amazon SageMaker
Last updated - V7 (19-Jan-2026)
Key Concepts
| S.No | Topic | Sub-Topics |
|---|---|---|
Interview question
Related Topics
#Spark Core
Last updated - V7 (19-Jan-2026)
Key Concepts
| S.No | Topic | Sub-Topics |
|---|---|---|
Interview question
Related Topics
| SparkContext |
| Components |
| DAG |
18 January 2026
#PySpark
Last updated - V7 (19-Jan-2026)
Key Concepts
| S.No | Topic | Sub-Topics |
|---|---|---|
| 1 | PySpark | What is PySpark, Spark ecosystem, PySpark vs Pandas, Use cases, Installation & setup |
| 2 | Spark Architecture | Driver, Executors, Cluster manager, Jobs/Stages/Tasks, Execution flow |
| 3 | SparkSession & Context | SparkSession, SparkContext, Configurations, Application lifecycle, Best practices |
| 4 | RDD Fundamentals | RDD creation, Transformations, Actions, Persistence, RDD vs DataFrame |
| 5 | RDD Advanced Operations | Narrow vs wide ops, shuffle, Accumulators, Broadcast variables, Performance tuning |
| 6 | DataFrame Introduction | DataFrame API, Creating DataFrames, Schema inference, show/select, DataFrame vs RDD |
| 7 | DataFrame Transformations | select, filter, withColumn, drop, cast & rename |
| 8 | Data Sources & Formats | CSV, JSON, Parquet, ORC, Avro |
| 9 | Schema Management | StructType, StructField, Explicit schema, Schema evolution, Corrupt records |
| 10 | Built-in Functions | String functions, Date functions, Math functions, Conditional logic, Null handling |
| 11 | Joins in PySpark | Inner join, Left/Right join, Full join, Broadcast join, Join optimization |
| 12 | Aggregations | groupBy, agg, count/sum/avg, rollup, cube |
| 13 | Window Functions | Window spec, row_number, rank/dense_rank, lead/lag, Running totals |
| 14 | Sorting & Partitioning | orderBy, sortWithinPartitions, repartition, coalesce, Data skew basics |
| 15 | Spark SQL | Temp views, Global views, SQL queries, CTEs, SQL vs DataFrame API |
| 16 | User Defined Functions | Python UDF, Pandas UDF, Serialization cost, When to avoid UDF, Alternatives |
| 17 | Performance Optimization | Caching, Persist levels, Broadcast joins, File sizing, Best practices |
| 18 | Partition & File Optimization | Partition pruning, Bucketing, Small file problem, Compression, Skew handling |
| 19 | PySpark with Hive | Hive metastore, Managed tables, External tables, Partitioned tables, Hive SQL |
| 20 | Structured Streaming Basics | Streaming concepts, Micro-batching, Sources, Sinks, Checkpointing |
| 21 | Streaming Operations | Triggers, Output modes, Watermarking, Late data, Fault tolerance |
| 22 | Streaming Aggregations | Windowed aggregation, Stateful ops, Stream joins, Exactly-once semantics, Recovery |
| 23 | MLlib Overview | Transformers, Estimators, Pipelines, Evaluators, Model lifecycle |
| 24 | Feature Engineering | StringIndexer, OneHotEncoder, VectorAssembler, Scaling, Feature selection |
| 25 | ML Algorithms | Regression, Classification, Clustering, Recommendation, Metrics |
| 26 | Hyperparameter Tuning | CrossValidator, Train-validation split, ParamGrid, Model selection, Optimization |
| 27 | PySpark with Delta Lake | Delta tables, ACID transactions, Time travel, MERGE, Optimize & Vacuum |
| 28 | Debugging & Monitoring | Spark UI, Logs, Common errors, Debug strategies, Job analysis |
| 29 | Job Scheduling & Deployment | spark-submit, Config tuning, Scheduling, Parameterization, Automation |
| 30 | Real-world Use Cases | ETL pipelines, Streaming analytics, ML pipelines, Optimization patterns, Interview prep |
Interview question
Related Topics
#Databricks
Last updated - V7 (19-Jan-2026)
Key Concepts
| S.No | Topic | Sub-Topics |
|---|---|---|
| 1 | Databricks | What is Databricks, Lakehouse concept, Databricks vs Hadoop, Use cases, Architecture overview |
| 2 | Databricks Workspace | Workspace UI, Notebooks, Clusters, Jobs, Repos |
| 3 | Databricks Architecture | Control plane, Data plane, Workspace components, Security layers, Execution flow |
| 4 | Clusters in Databricks | All-purpose clusters, Job clusters, Autoscaling, Cluster policies, Init scripts |
| 5 | Databricks Runtime | DBR versions, Photon engine, ML runtime, GPU runtime, Performance tuning |
| 6 | Notebooks | Languages supported, Notebook workflows, Magic commands, Versioning, Collaboration |
| 7 | Databricks Utilities (dbutils) | File system ops, Secrets, Widgets, Notebook workflows, FS mounts |
| 8 | Data Ingestion | Batch ingestion, Streaming ingestion, Auto Loader, File formats, Schema inference |
| 9 | Delta Lake Fundamentals | ACID transactions, Delta log, Schema enforcement, Time travel, File compaction |
| 10 | Delta Lake Advanced | OPTIMIZE, Z-ORDER, Vacuum, Delta constraints, Change Data Feed |
| 11 | Spark SQL in Databricks | SQL editor, ANSI SQL, Views, CTEs, Query optimization |
| 12 | DataFrames & Datasets | API overview, Transformations, Actions, Lazy evaluation, Performance tips |
| 13 | Databricks SQL Warehouses | Serverless SQL, Query execution, Dashboards, Alerts, Access control |
| 14 | Jobs & Workflows | Job types, Task dependencies, Scheduling, Retries, Monitoring |
| 15 | Databricks Repos | Git integration, Branching, CI/CD basics, Repo permissions, Best practices |
| 16 | Security & Access Control | Users & groups, IAM integration, Table ACLs, Cluster policies, Secrets |
| 17 | Unity Catalog | Metastore, Catalogs & schemas, Data lineage, Fine-grained access, Auditing |
| 18 | Streaming with Databricks | Structured Streaming, Triggers, Watermarking, Stateful ops, Fault tolerance |
| 19 | Auto Loader | CloudFiles, Incremental ingestion, Schema evolution, Notifications, Performance tuning |
| 20 | Databricks ML Overview | ML workspace, ML runtime, Experiment tracking, Feature store, Model registry |
| 21 | MLflow in Databricks | Tracking, Projects, Models, Model registry, Deployment |
| 22 | Feature Store | Feature tables, Offline features, Online features, Reusability, Governance |
| 23 | Model Training | Distributed training, Hyperparameter tuning, AutoML, GPUs, Evaluation metrics |
| 24 | Model Deployment | Batch inference, Real-time serving, Model endpoints, A/B testing, Monitoring |
| 25 | Performance Optimization | Partitioning, Caching, Broadcast joins, Skew handling, Photon usage |
| 26 | Monitoring & Logging | Spark UI, Ganglia, Job metrics, Logs, Alerts |
| 27 | Cost Optimization | Cluster sizing, Spot instances, Autoscaling, Job clusters, Usage reports |
| 28 | Databricks on Cloud | AWS architecture, Azure architecture, GCP basics, Networking, Storage integration |
| 29 | CI/CD & DevOps | Repos + pipelines, Databricks CLI, Asset bundles, Environment promotion, Automation |
| 30 | Real-world Use Cases | ETL pipelines, Streaming analytics, ML pipelines, Lakehouse design, Interview prep |
Interview question
Related Topics
14 January 2026
#Joins & Aggregations
Last updated - V7 (14-Jan-2026)
Key Concepts
| S.No | Topic | Sub-Topics |
|---|---|---|
| 1 | Joins | What is a join, Types of joins, Importance, Examples, Use cases |
| 2 | Inner Join | Definition, Syntax, Example with RDD, Example with DataFrame, Performance considerations |
| 3 | Left Outer Join | Definition, Syntax, Example RDD, Example DataFrame, Handling nulls |
| 4 | Right Outer Join | Definition, Syntax, Example RDD, Example DataFrame, Use cases |
| 5 | Full Outer Join | Definition, Syntax, Example RDD, Example DataFrame, Null handling |
| 6 | Cross Join / Cartesian | Definition, Syntax, Example, Performance considerations, Use cases |
| 7 | Self Join | Definition, Syntax, Example RDD, Example DataFrame, Use cases |
| 8 | Broadcast Join | Definition, When to use, Example, Performance benefits, Spark configuration |
| 9 | Skewed Joins | Definition, Problems caused, Solutions, Salting technique, Performance tips |
| 10 | Join on Multiple Columns | Syntax, Example DataFrame, Example SQL, Performance considerations, Best practices |
| 11 | Key Considerations in Joins | Partitioning, Shuffling, Data size, Broadcast, Caching |
| 12 | Aggregation Overview | What is aggregation, Types, Importance, Syntax, Use cases |
| 13 | GroupBy | Definition, Syntax, Example RDD, Example DataFrame, Performance considerations |
| 14 | GroupByKey vs ReduceByKey | Definition, Syntax, Performance difference, Example, When to use |
| 15 | AggregateByKey | Definition, Syntax, Example, Custom aggregation functions, Performance |
| 16 | CountByKey & CountByValue | Definition, Syntax, Example RDD, Example DataFrame, Use cases |
| 17 | Sum, Max, Min Aggregations | Syntax, Example DataFrame, Example SQL, Performance, Best practices |
| 18 | Average & Mean Aggregations | Syntax, Example RDD, Example DataFrame, Handling nulls, Performance |
| 19 | Multiple Aggregations | agg() function, Syntax, Example DataFrame, Example SQL, Performance tips |
| 20 | Window Functions for Aggregation | Definition, Syntax, PartitionBy, OrderBy, Example |
| 21 | Rollup & Cube | Definition, Syntax, Example DataFrame, Use cases, Performance tips |
| 22 | Pivot Aggregations | Definition, Syntax, Example DataFrame, Example SQL, Use cases |
| 23 | Approximate Aggregations | approx_count_distinct(), approxQuantile(), Use cases, Syntax, Performance benefits |
| 24 | Custom Aggregations | User-defined aggregate functions (UDAF), Syntax, Example, Use cases, Performance tips |
| 25 | Combining Joins & Aggregations | Join then aggregate, Aggregate then join, Example DataFrame, SQL example, Best practices |
| 26 | Handling Nulls in Joins & Aggregations | Null handling functions, coalesce(), fill(), drop(), Example, Best practices |
| 27 | Optimizing Joins | Broadcast join, Partitioning, Caching, Skew handling, Shuffle reduction |
| 28 | Optimizing Aggregations | Partitioning, ReduceByKey, AggregateByKey, Caching, Avoid groupByKey for large data |
| 29 | Advanced Aggregation Techniques | Window functions, Rollup, Cube, Pivot, Custom UDAFs |
| 30 | Real-world Examples | ETL pipelines, Log analytics, Sales aggregation, Customer behavior analysis, Recommendations |
Interview question
Related Topics
#Spark Streaming
Last updated - V7 (14-Jan-2026)
Key Concepts
| S.No | Topic | Sub-Topics |
|---|---|---|
| 1 | Spark Streaming | What is Spark Streaming, Real-time data, Micro-batch processing, Advantages, Use cases |
| 2 | Spark Streaming Architecture | Driver, Receiver, DStream, Scheduler, Executors |
| 3 | DStream Basics | Definition, Creation, Operations, RDDs, Transformations |
| 4 | Creating DStreams | From sources: Kafka, Flume, TCP sockets, File streams, Custom receivers |
| 5 | Transformations on DStreams | map(), flatMap(), filter(), reduceByKey(), window() |
| 6 | Window Operations | window(), slideDuration, reduceByKeyAndWindow(), groupByKeyAndWindow(), Examples |
| 7 | Stateful Transformations | updateStateByKey(), mapWithState(), Example, Use cases, Performance |
| 8 | Actions on DStreams | print(), count(), saveAsTextFiles(), foreachRDD(), Examples |
| 9 | Data Sources Integration | Kafka, Flume, HDFS, Socket, Custom sources |
| 10 | Sinks / Output Operations | print(), saveAsTextFiles(), saveAsObjectFiles(), foreachRDD(), write to DB |
| 11 | Checkpointing | Definition, Directory setup, Purpose, Examples, Fault tolerance |
| 12 | Receiver Types | Reliable receiver, Unreliable receiver, Custom receiver, Receiver lifecycle, Examples |
| 13 | Transformations: map vs flatMap | map(), flatMap(), Use cases, Examples, Differences |
| 14 | Transformations: reduceByKey | reduceByKey(), reduceByKeyAndWindow(), Examples, Use cases, Performance |
| 15 | Transformations: join in streaming | join(), leftOuterJoin(), rightOuterJoin(), fullOuterJoin(), Example |
| 16 | Transformations: union & transform | union(), transform(), Example, Use cases, Combining multiple streams |
| 17 | Handling Late Data | Watermarks, Window operations, State management, Dropping late data, Examples |
| 18 | Kafka Integration | DirectStream vs ReceiverStream, Kafka parameters, Offset management, Example, Best practices |
| 19 | Flume Integration | Spark Streaming + Flume, Push vs Pull, Receiver setup, Example, Best practices |
| 20 | File Stream Source | HDFS integration, Local files, Monitoring new files, Examples, Performance considerations |
| 21 | Structured Streaming Introduction | Differences from DStream, High-level API, DataFrames & Datasets, Fault-tolerance, Example |
| 22 | Structured Streaming Sources | Kafka, File, Socket, Rate source, Custom sources |
| 23 | Structured Streaming Sinks | Console, File, Kafka, ForeachBatch, Memory |
| 24 | Event Time & Watermarks | Definition, Handling late data, withWatermark(), Examples, Use cases |
| 25 | Window Operations in Structured Streaming | window(), slideDuration, groupBy window(), Examples, Performance tips |
| 26 | Stateful Operations in Structured Streaming | mapGroupsWithState(), flatMapGroupsWithState(), Examples, Use cases, Performance |
| 27 | Performance Tuning | Batch interval, Partitioning, Backpressure, Checkpointing, Resource tuning |
| 28 | Fault Tolerance & Reliability | Checkpointing, Write-ahead logs, Replay, Receiver reliability, Structured Streaming guarantees |
| 29 | Monitoring & Debugging | Spark UI, Streaming metrics, Logs, Executor monitoring, Performance tuning |
| 30 | Real-world Examples | Log analytics, IoT data processing, Real-time dashboards, Clickstream analysis, Recommendations |
Interview question
Related Topics
#Transformations & Actions
Last updated - V7 (14-Jan-2026)
Key Concepts
| S.No | Topic | Sub-Topics |
|---|---|---|
| 1 | Transformations & Actions | Definition, Lazy Evaluation, DAG concept, Execution flow, Why separation matters |
| 2 | Narrow vs Wide Transformations | Definition, Examples, Shuffle impact, Performance difference, Use cases |
| 3 | map() | Syntax, One-to-one mapping, Use cases, Performance, Examples |
| 4 | flatMap() | One-to-many mapping, Differences from map, Use cases, Examples, Performance |
| 5 | filter() | Predicate logic, Data reduction, Optimization tips, Examples, Use cases |
| 6 | select() / withColumn() | Column selection, Column creation, Expressions, Performance tips, Examples |
| 7 | union() & distinct() | Combining datasets, Removing duplicates, Shuffle behavior, Use cases, Examples |
| 8 | groupBy() | Grouping logic, Aggregation basics, Shuffle impact, Examples, Best practices |
| 9 | reduceByKey() | Key-based reduction, Map-side aggregation, Performance benefits, Examples, Comparison |
| 10 | groupByKey() | Working principle, Memory impact, Comparison with reduceByKey, Examples, When to avoid |
| 11 | sortBy() & orderBy() | Sorting logic, Asc/Desc order, Shuffle cost, Examples, Optimization tips |
| 12 | join() Basics | Inner join, Join condition, Execution flow, Examples, Common issues |
| 13 | Advanced Join Types | Left, Right, Full, Semi, Anti joins, Use cases, Examples |
| 14 | Broadcast Join | Concept, When to use, Memory impact, SQL hint, Examples |
| 15 | repartition() & coalesce() | Partition control, Shuffle behavior, Performance impact, Use cases, Examples |
| 16 | cache() & persist() | Storage levels, Memory vs disk, When to cache, Examples, Pitfalls |
| 17 | count() | Action trigger, Job creation, Performance considerations, Examples, Use cases |
| 18 | collect() | Driver memory risk, Small data usage, Examples, Best practices, Alternatives |
| 19 | show() & take() | Preview data, Execution behavior, Limit handling, Examples, Usage tips |
| 20 | save() & write() | Output formats, File systems, Partition output, Modes, Examples |
| 21 | foreach() & foreachPartition() | Side effects, External systems, Performance difference, Examples, Best practices |
| 22 | Window Functions | Over clause, Partition by, Order by, Use cases, Examples |
| 23 | Actions vs Transformations | Comparison, Execution timing, DAG role, Interview questions, Examples |
| 24 | Shuffle Internals | When shuffle occurs, Cost factors, Optimization, Examples, Debugging |
| 25 | Performance Optimization | Avoid wide ops, Partition sizing, Caching strategy, Examples, Tips |
| 26 | Error Handling | Bad records, Null handling, Try-catch logic, Data validation, Examples |
| 27 | Spark UI Analysis | Jobs tab, Stages tab, Task metrics, Shuffle read/write, Debugging |
| 28 | Real-world ETL Flow | Transform chain design, Action placement, Optimization, Examples, Best practices |
| 29 | Interview Scenarios | Common questions, Tricky cases, Performance questions, Sample answers, Tips |
| 30 | Hands-on Mini Project | End-to-end pipeline, Transformations usage, Actions usage, Optimization, Review |
Interview question
Related Topics
#DataFrames & Datasets
Last updated - V7 (14-Jan-2026)
Key Concepts
| S.No | Topic | Sub-Topics |
|---|---|---|
| 1 | Apache Spark & DataFrames | Spark overview, RDD vs DataFrame, Spark architecture, Lazy evaluation, Use cases |
| 2 | Spark Setup & Environment | Local mode, Cluster mode, SparkSession, spark-submit, Configuration basics |
| 3 | SparkSession & Entry Points | SparkSession creation, SQLContext, HiveContext, Config options, Best practices |
| 4 | Creating DataFrames | From files, From RDD, From collections, Schema inference, Explicit schema |
| 5 | DataFrame Schema & Data Types | StructType, StructField, Primitive types, Complex types, Schema evolution |
| 6 | Reading Data Sources | CSV, JSON, Parquet, ORC, Avro basics |
| 7 | Writing DataFrames | Save modes, Partitioning, Bucketing, File formats, Compression |
| 8 | DataFrame Basic Operations | select, withColumn, drop, filter, where |
| 9 | Column Operations | Column expressions, alias, cast, when/otherwise, lit |
| 10 | Row Operations & Actions | show, collect, take, count, first |
| 11 | DataFrame Functions | Built-in functions, String functions, Date functions, Math functions, Null handling |
| 12 | Filtering & Conditional Logic | filter vs where, isin, like, rlike, case when |
| 13 | Sorting & Deduplication | orderBy, sort, distinct, dropDuplicates, Sorting optimization |
| 14 | Aggregation & Grouping | groupBy, agg, count, sum, avg |
| 15 | Joins in DataFrames | Inner join, Left/Right join, Full join, Semi/Anti join |
| 16 | Join Optimization | Broadcast join, Shuffle join, Join hints, Skew handling, AQE |
| 17 | Handling Missing & Bad Data | dropna, fillna, replace, Null checks, Data validation |
| 18 | Window Functions | Window spec, row_number, rank, lead/lag, Running totals |
| 19 | UDF & UDAF | UDF creation, Performance impact, Pandas UDF, Serialization, Best practices |
| 20 | DataFrame Caching & Persistence | cache, persist, Storage levels, Memory vs disk, When to cache |
| 21 | Spark SQL with DataFrames | Temp views, Global views, SQL queries, Mixing SQL & DF, Optimization |
| 22 | Partitioning & Repartitioning | repartition, coalesce, Partition pruning, File partitioning, Performance tuning |
| 23 | Performance Optimization Basics | Catalyst optimizer, Tungsten, Predicate pushdown, Column pruning, AQE |
| 24 | DataFrame Execution Plan | Logical plan, Physical plan, explain(), DAG, Stage breakdown |
| 25 | Handling Large Datasets | Skew issues, Sampling, Checkpointing, Memory tuning, Spill handling |
| 26 | Integration with Hive | Hive tables, External tables, Metastore, Partitioned tables, Hive SQL |
| 27 | Streaming DataFrames (Structured Streaming) | Streaming sources, Sinks, Watermarking, Windowed aggregations, Triggers |
| 28 | Error Handling & Debugging | Common errors, Serialization issues, Logging, Debug tools, Retry strategies |
| 29 | Best Practices & Design Patterns | Code structure, Reusability, Performance patterns, Anti-patterns, Testing |
| 30 | Real-world Use Cases & Projects | ETL pipelines, Data lake processing, Analytics workloads, Reporting, Optimization review |
Interview question
Related Topics
12 January 2026
#JUnit
Last updated - V7 (12-Jan-2026)
Key Concepts
| S.No | Topic | Sub-Topics |
|---|---|---|
| 1 | Introduction to JUnit | What is JUnit?, Importance of Unit Testing, History of JUnit, Versions overview, Use cases |
| 2 | JUnit Architecture | Core classes, Test runners, Test lifecycle, Annotations overview, Test suites |
| 3 | JUnit 4 vs JUnit 5 | Key differences, Annotations, Assertions, Extension model, Migration strategies |
| 4 | JUnit Annotations | @Test, @Before, @After, @BeforeClass, @AfterClass |
| 5 | JUnit 5 Annotations | @Test, @BeforeEach, @AfterEach, @BeforeAll, @AfterAll |
| 6 | Assertions in JUnit | assertEquals, assertTrue, assertFalse, assertNotNull, assertThrows |
| 7 | Parameterized Tests | Introduction, @ParameterizedTest, @ValueSource, @CsvSource, Custom parameter providers |
| 8 | JUnit Test Suites | Purpose, Creating test suites, Including multiple classes, @Suite annotation, Running suites |
| 9 | Exception Testing | assertThrows, Expected exceptions, Handling exceptions in tests, Try-catch in tests, Best practices |
| 10 | Timeouts in Tests | Using @Test(timeout), assertTimeout, assertTimeoutPreemptively, Long-running tests, Best practices |
| 11 | Assumptions in JUnit | assumeTrue, assumeFalse, Conditional test execution, Environment-specific tests, Integration with CI |
| 12 | Test Lifecycle Methods | Setup and teardown, @BeforeEach/@AfterEach, @BeforeAll/@AfterAll, Resource management, Best practices |
| 13 | Nested Tests | Introduction, @Nested annotation, Structuring tests, Inner classes, Scope and lifecycle |
| 14 | Tagging Tests | @Tag annotation, Grouping tests, Running specific tags, Excluding tags, Integration with CI/CD |
| 15 | JUnit Extensions | Introduction, @ExtendWith annotation, Custom extensions, Parameter resolvers, Test lifecycle hooks |
| 16 | Mocking with Mockito | Mockito basics, @Mock, @InjectMocks, when-thenReturn, Verifying interactions |
| 17 | JUnit with Spring Boot | @SpringBootTest, @WebMvcTest, @MockBean, Context loading, Integration tests |
| 18 | Behavior Driven Testing | Introduction to BDD, JUnit + Cucumber, Feature files, Step definitions, Integration examples |
| 19 | Testing Exceptions and Edge Cases | Edge case identification, Boundary testing, assertThrows, Negative testing, Best practices |
| 20 | JUnit Test Reports | Generating reports, Maven Surefire plugin, Gradle reports, HTML reports, CI integration |
| 21 | Mocking Static Methods | Mockito inline, PowerMockito, Limitations, Use cases, Best practices |
| 22 | Parameterized and CSV Tests | @CsvSource, @CsvFileSource, @MethodSource, Dynamic tests, Practical examples |
| 23 | Dynamic Tests | @TestFactory, DynamicTest.stream, Custom dynamic tests, Use cases, Best practices |
| 24 | Integration Testing with JUnit | Introduction, Database tests, REST API testing, Spring integration, Environment setup |
| 25 | Code Coverage | JaCoCo integration, Measuring coverage, Analyzing reports, Coverage thresholds, Best practices |
| 26 | Continuous Integration | JUnit in CI/CD, Jenkins integration, GitHub Actions, Pipeline setup, Reporting |
| 27 | Best Practices in JUnit | Writing clean tests, DRY principle, Readable assertions, Test naming conventions, Test isolation |
| 28 | Debugging Unit Tests | Using IDE debugger, Common failures, Stack traces, Logging in tests, Fixing flaky tests |
| 29 | Advanced Assertions | assertAll, assertIterableEquals, assertLinesMatch, assertTimeout, Custom assertions |
| 30 | JUnit Projects & Labs | Hands-on projects, Full coverage examples, Spring Boot testing, CI/CD integration, Practice exercises |
Interview question
Related Topics
#MultiThread
Last updated - V7 (12-Jan-2026)
Key Concepts
| S.No | Topic | Sub-Topics |
|---|---|---|
| 1 | Multithreading | What is a Thread, Process vs Thread, Benefits of Multithreading, Applications, Thread Lifecycle Overview |
| 2 | Thread Class | Creating Thread by extending Thread, start(), run(), sleep(), join(), getName() |
| 3 | Runnable Interface | Implementing Runnable, Passing to Thread, Advantages, run() vs start(), Lambda Runnable |
| 4 | Thread Lifecycle | New, Runnable, Blocked, Waiting, Timed Waiting, Terminated, Thread State Transitions |
| 5 | Thread Methods | setName/getName, setPriority/getPriority, isAlive(), yield(), interrupt() |
| 6 | Thread Priority | Min/Max/Normal Priority, setPriority, Thread Scheduling, Preemption, Fairness |
| 7 | Thread Sleep & Join | sleep(), join(), wait vs sleep, timed join, practical examples |
| 8 | Thread Communication | wait(), notify(), notifyAll(), producer-consumer basics, synchronized block |
| 9 | Synchronized Methods | Method-level sync, block-level sync, object lock, class-level lock, best practices |
| 10 | Inter-thread Communication | Producer-Consumer Problem, BlockingQueue, wait/notify, ReentrantLock with Condition, deadlock prevention |
| 11 | Reentrant Locks | Lock interface, ReentrantLock, tryLock(), lockInterruptibly(), fairness, conditions |
| 12 | Deadlock | What is Deadlock, Conditions, Prevention, Avoidance, Detection, Recovery |
| 13 | Starvation & Livelock | Starvation, Livelock, Examples, Priority Inversion, Solutions |
| 14 | Thread Safety | Definition, Thread-safe classes, Immutable Objects, Synchronization, Atomic variables |
| 15 | Atomic Classes | AtomicInteger, AtomicLong, AtomicReference, compareAndSet, use cases |
| 16 | Volatile Keyword | What is volatile, visibility, happens-before, example usage, memory consistency |
| 17 | Concurrent Collections | ConcurrentHashMap, CopyOnWriteArrayList, BlockingQueue, ConcurrentSkipListMap, benefits |
| 18 | Executor Framework | Executor, ExecutorService, ThreadPoolExecutor, ScheduledExecutorService, shutdown |
| 19 | Thread Pools | FixedPool, CachedPool, SingleThreadPool, ScheduledPool, Advantages |
| 20 | Callable & Future | Callable Interface, Future, submit(), get(), timeout handling, cancelling tasks |
| 21 | ForkJoin Framework | ForkJoinPool, RecursiveTask, RecursiveAction, work-stealing, parallel computation |
| 22 | Parallel Streams | Stream API, parallel(), ForkJoin usage, performance tips, pitfalls |
| 23 | ThreadLocal | ThreadLocal variables, usage, memory leak, InheritableThreadLocal, examples |
| 24 | Synchronization Utilities | CountDownLatch, CyclicBarrier, Semaphore, Phaser, Exchanger |
| 25 | Deadlock Prevention Patterns | Lock Ordering, TryLock, Timeout, Avoid Nested Locks, Resource hierarchy |
| 26 | Best Practices | Minimize synchronized code, prefer high-level concurrency, immutable objects, use executor, avoid busy wait |
| 27 | Performance Tuning | Thread pool sizing, contention reduction, CPU-bound vs IO-bound, measuring, profiling |
| 28 | Common Concurrency Bugs | Race conditions, deadlocks, livelocks, visibility issues, fixes |
| 29 | Real-world Examples | Producer-Consumer app, Web server handling requests, parallel processing, async tasks, thread-safe cache |
| 30 | Interview & Revision | Key methods, concurrency concepts, common pitfalls, multithreading Q&A, mini projects |
Interview question
Related Topics
11 January 2026
#Scikit
Last updated - V7 (11-Jan-2026)
Key Concepts
| S.No | Topic | Sub-Topics |
|---|---|---|
| 1 | Scikit-learn | What is scikit-learn, Installation, Key features, ML workflow, Supported algorithms |
| 2 | Scikit-learn API Basics | Estimators, fit(), predict(), transform(), Pipelines, Model persistence |
| 3 | Data Loading & Inspection | Built-in datasets, load_*, fetch_*, Data shapes, Feature names, Target variables |
| 4 | Data Preprocessing | Scaling, Normalization, Encoding categorical data, Missing values, Feature transformation |
| 5 | Feature Scaling Techniques | StandardScaler, MinMaxScaler, RobustScaler, Normalizer, When to scale |
| 6 | Handling Missing Data | SimpleImputer, Strategies, Missing indicators, Pipeline usage, Best practices |
| 7 | Encoding Categorical Variables | LabelEncoder, OneHotEncoder, OrdinalEncoder, Handling unknowns, Sparse output |
| 8 | Train-Test Split | train_test_split, Stratification, Random state, Data leakage, Validation sets |
| 9 | Linear Regression | LinearRegression, Assumptions, Coefficients, Evaluation metrics, Use cases |
| 10 | Logistic Regression | Binary vs multiclass, Regularization, Solver options, Class weights, Evaluation |
| 11 | Model Evaluation Metrics | Accuracy, Precision, Recall, F1-score, Confusion matrix |
| 12 | Cross-Validation | K-Fold, StratifiedKFold, cross_val_score, cross_validate, Bias-variance tradeoff |
| 13 | k-Nearest Neighbors | KNN classifier, KNN regressor, Distance metrics, Choosing K, Performance |
| 14 | Support Vector Machines | SVC, SVR, Kernels, Hyperparameters, Margin maximization |
| 15 | Decision Trees | Tree structure, Gini vs entropy, Overfitting, Pruning, Feature importance |
| 16 | Ensemble Learning | Bagging, Boosting, Random Forest, Extra Trees, Voting classifiers |
| 17 | Random Forest | RandomForestClassifier, Hyperparameters, Feature importance, OOB score, Use cases |
| 18 | Gradient Boosting | GradientBoosting, XGBoost intro, LightGBM intro, Learning rate, Tree depth |
| 19 | Naive Bayes | GaussianNB, MultinomialNB, BernoulliNB, Assumptions, Applications |
| 20 | Clustering Algorithms | KMeans, Hierarchical clustering, DBSCAN, Silhouette score, Use cases |
| 21 | Dimensionality Reduction | PCA, Kernel PCA, Explained variance, Feature compression, Visualization |
| 22 | Anomaly Detection | Isolation Forest, One-Class SVM, LOF, Use cases, Evaluation challenges |
| 23 | Model Selection & Tuning | GridSearchCV, RandomizedSearchCV, Hyperparameters, Scoring, Best estimators |
| 24 | Pipelines & ColumnTransformer | Pipeline, Feature unions, ColumnTransformer, End-to-end ML, Avoid leakage |
| 25 | Imbalanced Datasets | Class imbalance, SMOTE, Class weights, Evaluation metrics, Best practices |
| 26 | Text Feature Extraction | CountVectorizer, TF-IDF, N-grams, Stop words, Sparse matrices |
| 27 | Model Persistence | joblib, pickle, Saving models, Loading models, Versioning |
| 28 | Model Interpretation | Coefficients, Feature importance, Permutation importance, Partial dependence, SHAP intro |
| 29 | Scikit-learn with Pipelines in Production | Reproducibility, Monitoring, Data drift, Model updates, Best practices |
| 30 | Scikit-learn Best Practices | Code structure, Experiment tracking, Documentation, Common pitfalls, Next steps |
Interview question
Related Topics
#Pandas
Last updated - V7 (11-Jan-2026)
Key Concepts
| S.No | Topic | Sub-Topics |
|---|---|---|
| 1 | Pandas | Overview, Installation, Series, DataFrame, Basic operations |
| 2 | Series Basics | Creating Series, Indexing, Slicing, Series methods, Data types |
| 3 | DataFrame Basics | Create DataFrame, Index/Columns, Shape, dtypes, head/tail |
| 4 | Data Selection | loc, iloc, ix (removed in pandas 1.0), column selection, row selection |
| 5 | Data Filtering | Boolean indexing, conditions, isin, between, query() |
| 6 | Missing Data | isnull, notnull, fillna, dropna, interpolation |
| 7 | Data Cleaning | Duplicates, rename, replace, strip whitespace, type conversion |
| 8 | Data Transformation | apply, map, applymap (DataFrame.map since pandas 2.1), lambda functions, vectorized operations |
| 9 | Aggregation & Grouping | groupby, aggregate, transform, filter, pivot tables |
| 10 | Sorting & Ranking | sort_values, sort_index, rank, ascending/descending, multi-level sorting |
| 11 | Indexing & MultiIndex | set_index, reset_index, hierarchical index, slicing, cross-section |
| 12 | Concatenation & Merging | concat, append (removed in pandas 2.0; use concat), merge, join, indicator |
| 13 | Reshaping Data | melt, pivot, stack, unstack, wide to long format |
| 14 | Time Series Basics | Datetime conversion, date_range, indexing, resampling, frequency |
| 15 | Time Series Advanced | rolling, expanding, shifting, lag/lead, moving average |
| 16 | String Operations | str methods, contains, replace, split, regex |
| 17 | Visualization with Pandas | plot, line, bar, histogram, scatter |
| 18 | Reading/Writing Data | read_csv, read_excel, read_json, to_csv, to_excel |
| 19 | Advanced I/O | read_sql, read_parquet, read_hdf, read_pickle, compression |
| 20 | Exploratory Data Analysis | describe, info, value_counts, correlation, unique |
| 21 | Multi-Column Operations | arithmetic, apply, assign, lambda, broadcasting |
| 22 | Window Functions | rolling, expanding, ewm, groupby with window, custom functions |
| 23 | Categorical Data | category dtype, conversion, codes, sorting, filtering |
| 24 | Sampling & Subsetting | sample, head/tail, nth, slicing, random sampling |
| 25 | Performance Optimization | vectorization, eval/query, categorical, chunking, memory usage |
| 26 | MultiIndex Advanced | stack/unstack, xs, swaplevel, sortlevel, indexing tricks |
| 27 | Custom Functions | apply, pipe, lambda, function chaining, reusable utilities |
| 28 | Integration with NumPy & SciPy | array operations, broadcasting, linear algebra, statistical functions, interoperability |
| 29 | Real World Data Projects | EDA, cleaning, aggregation, visualization, export results |
| 30 | End-to-End Project | Data collection, cleaning, analysis, feature engineering, visualization |
Interview question
Related Topics