| S.No | Topic | Sub-Topics |
|------|-------|------------|
| 1 | Databricks | What is Databricks, Lakehouse concept, Databricks vs Hadoop, Use cases, Architecture overview |
| 2 | Databricks Workspace | Workspace UI, Notebooks, Clusters, Jobs, Repos |
| 3 | Databricks Architecture | Control plane, Data plane, Workspace components, Security layers, Execution flow |
| 4 | Clusters in Databricks | All-purpose clusters, Job clusters, Autoscaling, Cluster policies, Init scripts |
| 5 | Databricks Runtime | DBR versions, Photon engine, ML runtime, GPU runtime, Performance tuning |
| 6 | Notebooks | Languages supported, Notebook workflows, Magic commands, Versioning, Collaboration |
| 7 | Databricks Utilities (dbutils) | File system ops, Secrets, Widgets, Notebook workflows, FS mounts |
| 8 | Data Ingestion | Batch ingestion, Streaming ingestion, Auto Loader, File formats, Schema inference |
| 9 | Delta Lake Fundamentals | ACID transactions, Delta log, Schema enforcement, Time travel, File compaction |
| 10 | Delta Lake Advanced | OPTIMIZE, Z-ORDER, VACUUM, Delta constraints, Change Data Feed |
| 11 | Spark SQL in Databricks | SQL editor, ANSI SQL, Views, CTEs, Query optimization |
| 12 | DataFrames & Datasets | API overview, Transformations, Actions, Lazy evaluation, Performance tips |
| 13 | Databricks SQL Warehouses | Serverless SQL, Query execution, Dashboards, Alerts, Access control |
| 14 | Jobs & Workflows | Job types, Task dependencies, Scheduling, Retries, Monitoring |
| 15 | Databricks Repos | Git integration, Branching, CI/CD basics, Repo permissions, Best practices |
| 16 | Security & Access Control | Users & groups, IAM integration, Table ACLs, Cluster policies, Secrets |
| 17 | Unity Catalog | Metastore, Catalogs & schemas, Data lineage, Fine-grained access, Auditing |
| 18 | Streaming with Databricks | Structured Streaming, Triggers, Watermarking, Stateful ops, Fault tolerance |
| 19 | Auto Loader | cloudFiles source, Incremental ingestion, Schema evolution, Notifications, Performance tuning |
| 20 | Databricks ML Overview | ML workspace, ML runtime, Experiment tracking, Feature store, Model registry |
| 21 | MLflow in Databricks | Tracking, Projects, Models, Model registry, Deployment |
| 22 | Feature Store | Feature tables, Offline features, Online features, Reusability, Governance |
| 23 | Model Training | Distributed training, Hyperparameter tuning, AutoML, GPUs, Evaluation metrics |
| 24 | Model Deployment | Batch inference, Real-time serving, Model endpoints, A/B testing, Monitoring |
| 25 | Performance Optimization | Partitioning, Caching, Broadcast joins, Skew handling, Photon usage |
| 26 | Monitoring & Logging | Spark UI, Ganglia, Job metrics, Logs, Alerts |
| 27 | Cost Optimization | Cluster sizing, Spot instances, Autoscaling, Job clusters, Usage reports |
| 28 | Databricks on Cloud | AWS architecture, Azure architecture, GCP basics, Networking, Storage integration |
| 29 | CI/CD & DevOps | Repos + pipelines, Databricks CLI, Asset bundles, Environment promotion, Automation |
| 30 | Real-world Use Cases | ETL pipelines, Streaming analytics, ML pipelines, Lakehouse design, Interview prep |