| Python for Data Science | NumPy, Pandas, Matplotlib, Seaborn, SciPy | ✔️ | ✔️ | ✔️ | |
| Statistics & Probability | Descriptive Stats, Inferential Stats, Probability Dist., Hypothesis Testing, Sampling | ✔️ | ✔️ | ✔️ | ✔️ |
| Machine Learning | Supervised, Unsupervised, Model Evaluation, Regularization, Feature Engineering | ✔️ | ✔️ | ✔️ | ✔️ |
| Deep Learning | Neural Networks, CNN, RNN, LSTM, Attention | | ✔️ | ✔️ | ✔️ |
| Data Wrangling | Cleaning, Missing Values, Outliers, Encoding, Transformation | ✔️ | ✔️ | ✔️ | |
| Data Visualization | Matplotlib, Seaborn, Plotly, Dash, Storytelling | ✔️ | ✔️ | ✔️ | ✔️ |
| Big Data Technologies | Hadoop, Spark, Hive, HDFS, Streaming | | ✔️ | ✔️ | ✔️ |
| SQL & NoSQL | Joins, Window Functions, Aggregations, MongoDB, Indexing | ✔️ | ✔️ | ✔️ | ✔️ |
| Cloud for Data Science | AWS S3, Lambda, SageMaker, GCP BigQuery, Azure ML | | ✔️ | ✔️ | ✔️ |
| MLOps | CI/CD, Model Deployment, Monitoring, ML Pipelines, Docker & Kubernetes | | ✔️ | ✔️ | ✔️ |
| Feature Engineering | Scaling, Encoding, PCA, Feature Selection, Time-Series Features | ✔️ | ✔️ | ✔️ | ✔️ |
| Time Series Analysis | ARIMA, SARIMA, ETS, Prophet, LSTM, Forecasting | | ✔️ | ✔️ | ✔️ |
| NLP | Tokenization, Embeddings, Transformers, LLMs, Text Classification | | ✔️ | ✔️ | ✔️ |
| Data Engineering | ETL, Data Pipelines, Airflow, DB Design, Data Lakes | | ✔️ | ✔️ | ✔️ |
| Model Optimization | Hyperparameter Tuning, Grid/Random Search, Bayesian Opt., Pruning | | ✔️ | ✔️ | ✔️ |
| Deployment & Scaling | REST APIs, Flask/FastAPI, Batch vs Real-Time, Scaling Models | | ✔️ | ✔️ | ✔️ |
| Data Ethics & Governance | Fairness, Bias, Explainability (XAI), GDPR, Security | ✔️ | ✔️ | ✔️ | ✔️ |
| Mathematics for DS | Linear Algebra, Calculus, Optimization, Vectorization | ✔️ | ✔️ | ✔️ | ✔️ |
| Business Analytics | KPI Analysis, Dashboards, A/B Testing, Insights | ✔️ | ✔️ | ✔️ | |
| Research & Experimentation | Experiment Design, A/B Testing, Simulation, Causal Inference | | ✔️ | ✔️ | ✔️ |