Data Science & Analytics (2026)
Roadmap from analytics foundations to AI integration—Git, SQL, Python (Pandas/Polars), EDA, ML with proper evaluation, pipelines (Airflow/Dagster), and GenAI. Includes milestone projects.
Python · SQL · Last updated 3 Feb 2026
Level 1: Data Foundations & Analytics
Focus on the most in-demand industry skills: version control, pulling data, exploratory analysis, and building reports. EDA is introduced early so you don’t jump into ML without insight.
Session 1: The Data Mindset & Environment Setup
- Set up Python (Anaconda/VS Code) and intro to Jupyter Notebook.
- Git basics: version control for scripts and notebooks (commit, branch, push); why it matters for reproducible work.
- Prompt engineering intro: GenAI mindset—how to ask models for code, explanations, and checks; sets the stage for later AI-assisted analysis.
- Why Data Science in 2026 is different (AI vs Human roles).
Session 2: SQL Masterclass (Part 1)
- Basic queries: SELECT, FROM, WHERE, ORDER BY.
- Filtering & aggregation: GROUP BY, HAVING, COUNT/SUM/AVG.
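A quick sketch of these basics using Python's stdlib sqlite3 module, so you can try SQL without installing a database server (the orders table and its values are invented for illustration):

```python
import sqlite3

# Toy "orders" table to practice SELECT, WHERE, ORDER BY, GROUP BY, HAVING.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL);
    INSERT INTO orders VALUES
        (1, 'ana', 120.0), (2, 'budi', 80.0),
        (3, 'ana', 200.0), (4, 'citra', 50.0);
""")

# Filtering and ordering: orders over 60, largest first.
rows = conn.execute(
    "SELECT customer, amount FROM orders WHERE amount > 60 ORDER BY amount DESC"
).fetchall()

# Aggregation: total spend per customer, keeping only totals above 100.
totals = conn.execute("""
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
    HAVING SUM(amount) > 100
""").fetchall()
print(totals)  # → [('ana', 320.0)]
```

The same queries run unchanged against Postgres or MySQL; only the connection line differs.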
Session 3: SQL Masterclass (Part 2)
- Relational databases: JOIN types (INNER, LEFT, RIGHT).
- Advanced SQL: Common Table Expressions (CTE) and subqueries.
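A minimal sketch of joins and a CTE, again via sqlite3 with invented tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'ana'), (2, 'budi'), (3, 'citra');
    INSERT INTO orders VALUES (1, 1, 120.0), (2, 1, 200.0), (3, 2, 80.0);
""")

# LEFT JOIN keeps customers with no orders (citra appears with NULL).
joined = conn.execute("""
    SELECT c.name, o.amount
    FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
    ORDER BY c.id, o.amount
""").fetchall()

# CTE: name an intermediate result, then query it like a table.
big_spenders = conn.execute("""
    WITH totals AS (
        SELECT customer_id, SUM(amount) AS total
        FROM orders
        GROUP BY customer_id
    )
    SELECT c.name, t.total
    FROM totals t
    JOIN customers c ON c.id = t.customer_id
    WHERE t.total > 100
""").fetchall()
print(big_spenders)  # → [('ana', 320.0)]
```

Reading the CTE top-down (build `totals`, then filter it) is usually clearer than nesting the same logic as a subquery.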
Session 4: Python for Data (Pandas & Polars)
- Reading various file types (CSV, Excel, JSON).
- Table manipulation: filtering, sorting, and creating new columns.
- Polars vs Pandas: when to use which; quick benchmark (speed, memory) so you can choose the right tool for scale.
Session 5: Data Cleaning & Wrangling
- Handling missing values and duplicate data.
- Data transformation: date, category, and string formatting.
- EDA preview: spot correlations and outliers during cleaning so you are not blind when you reach ML; simple checks (e.g. value ranges, duplicates by key) and guidance on when to dig deeper.
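The cleaning steps above, sketched on an invented messy table (one duplicated row, one missing city, one missing amount):

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 2, 3],
    "city": ["Jakarta", "Bandung", "Bandung", None],
    "amount": [100.0, 55.0, 55.0, None],
})

clean = (
    df.drop_duplicates()  # remove exact duplicate rows
      .assign(
          city=lambda d: d["city"].fillna("unknown"),
          amount=lambda d: d["amount"].fillna(d["amount"].median()),
      )
)

# Quick sanity checks to run before any modelling.
assert clean["id"].is_unique
assert clean.isna().sum().sum() == 0
```

Whether to fill with a median, a sentinel, or drop the row entirely depends on what the column means; that is a domain decision, not a Pandas one.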
Session 6: Data Storytelling, Visualization & EDA
- EDA principles: distribution, outliers, and “what does this variable tell us?” before you build anything.
- Visual design principles: choosing the right chart (Bar, Line, Scatter).
- Business context & domain: why this data exists, what decisions it supports—don’t ignore the “why” behind the numbers.
- Tools: intro to Tableau or Power BI to build your first dashboard.
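Before reaching for Tableau or Power BI, the same chart-choice principles apply in plain matplotlib; a minimal sketch with invented sales figures (bar for categories, as opposed to line for trends or scatter for relationships):

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display needed
import matplotlib.pyplot as plt

regions = ["North", "South", "East"]
sales = [120, 95, 140]

fig, ax = plt.subplots()
ax.bar(regions, sales)                 # categorical comparison → bar chart
ax.set_title("Sales by region")        # title states what to conclude
ax.set_ylabel("Units sold")
# fig.savefig("sales_by_region.png") would export it for a report
```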
Level 2: Machine Learning & Statistical Thinking
Focus on prediction logic and rigorous model evaluation so you can defend your work in practice.
Session 7: Statistics for Practical Analysts
- Data distribution, outliers, and correlation (why A relates to B).
- Ties back to EDA from Level 1; formalize the intuition you built.
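Correlation in one line of NumPy, on an invented hours-studied vs exam-score pair with an obvious positive relationship:

```python
import numpy as np

hours = np.array([1, 2, 3, 4, 5, 6])
score = np.array([52, 55, 61, 64, 70, 75])

# Pearson r near +1 means the two move together; it is still not causation.
r = np.corrcoef(hours, score)[0, 1]
```

This is the number to compute during EDA before trusting any "A drives B" story.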
Session 8: Supervised Learning (Regression)
- Predicting numbers: e.g. rental price or monthly income.
- Model evaluation: metrics (RMSE, MAE, R²), train/validation split, overfitting checks.
- Simple hyperparameter tuning: e.g. grid search on one or two knobs so you don’t ship default-only models.
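The full regression workflow above in one sketch, using synthetic "rental price" data (price driven by size plus noise; all numbers invented) and a one-knob grid search over Ridge's alpha:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic data: price = 50 + 12 * size + noise.
rng = np.random.default_rng(0)
size = rng.uniform(20, 100, 200)
price = 50 + 12 * size + rng.normal(0, 30, 200)
X, y = size.reshape(-1, 1), price

# Hold out a test set so evaluation is on data the model never saw.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Grid search on a single knob (alpha), cross-validated on the training set.
search = GridSearchCV(Ridge(), {"alpha": [0.1, 1.0, 10.0]}, cv=5)
search.fit(X_train, y_train)

pred = search.predict(X_test)
rmse = np.sqrt(mean_squared_error(y_test, pred))
mae = mean_absolute_error(y_test, pred)
r2 = r2_score(y_test, pred)
```

A large gap between training and test scores here is your overfitting check; on this clean synthetic data the gap should be small.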
Session 9: Supervised Learning (Classification)
- Predicting categories: e.g. spam classification or fraud detection.
- Metrics: accuracy, precision, recall, F1; when to optimize for which (e.g. recall in fraud).
- Cross-validation and overfitting checks; simple hyperparameter tuning where it matters.
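A classification sketch on synthetic, imbalanced "fraud" data (about 10% positives, like real fraud; all generated, not real transactions), scoring cross-validation on recall because missing a fraud case is the costly error:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, precision_score, recall_score
from sklearn.model_selection import cross_val_score, train_test_split

# Imbalanced toy data: ~90% legitimate, ~10% fraud.
X, y = make_classification(
    n_samples=1000, n_features=8, weights=[0.9, 0.1],
    class_sep=2.0, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

clf = LogisticRegression(max_iter=1000)

# Cross-validate on recall, the metric that matters for fraud.
cv_recall = cross_val_score(clf, X_train, y_train, cv=5, scoring="recall")

clf.fit(X_train, y_train)
pred = clf.predict(X_test)
precision = precision_score(y_test, pred)
recall = recall_score(y_test, pred)
f1 = f1_score(y_test, pred)
```

Note that accuracy alone would look great here even for a model that never flags fraud, which is exactly why the metric choice comes first.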
Session 10: Unsupervised Learning (Clustering)
- Segmentation: grouping customers by behavior.
- Evaluation: silhouette, inertia; how to sanity-check clusters and avoid over-interpreting.
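A clustering sketch on synthetic blobs standing in for customer segments, with the two sanity metrics from the bullet above:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Three well-separated synthetic "segments".
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.8, random_state=0)

km = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = km.fit_predict(X)

# Silhouette near 1 = tight, well-separated clusters; near 0 = arbitrary cuts.
sil = silhouette_score(X, labels)
inertia = km.inertia_  # within-cluster spread; used in the elbow method
```

On real customer data the silhouette is rarely this clean; if it hovers near zero for every k, the honest conclusion may be that there are no natural segments.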
Level 3: The "Engineer" Edge (2026 Special)
The part that sets this roadmap apart: pipelines, cloud, AI, and deployment.
Session 11: Automated Data Pipeline (The Mini ETL)
- Build Python scripts to pull data from APIs or databases on a schedule.
- Scheduling: Airflow or Dagster intro—DAGs, tasks, and running pipelines reliably.
- Cloud basics: GCP or AWS free tier—run a small pipeline in the cloud so you understand scalability beyond your laptop.
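The extract-transform-load shape of the session in one self-contained sketch; the JSON payload is a canned stand-in for an API response, and each step would become one task in an Airflow or Dagster DAG:

```python
import json
import sqlite3

# Extract: canned payload standing in for requests.get(url).json().
payload = json.loads(
    '[{"user": "ana", "amount": "120.5"},'
    ' {"user": "budi", "amount": "80.0"}]'
)

# Transform: fix types (the API returns amounts as strings).
rows = [(r["user"], float(r["amount"])) for r in payload]

# Load: append into a local warehouse table.
conn = sqlite3.connect(":memory:")  # use a file path or cloud DB in practice
conn.execute("CREATE TABLE IF NOT EXISTS payments (user TEXT, amount REAL)")
conn.executemany("INSERT INTO payments VALUES (?, ?)", rows)
conn.commit()

total = conn.execute("SELECT SUM(amount) FROM payments").fetchone()[0]
print(total)  # → 200.5
```

The scheduler's job is to run exactly this script reliably, retry it on failure, and record when it last succeeded.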
Session 12: AI Agents for Data Analysis
- Integrate Gemini/OpenAI APIs for automated text analysis (sentiment, summarization).
- Prompt patterns for data tasks (generating code, explaining results).
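One reusable prompt pattern for a data task: constrain the model's output so it is machine-parseable. The template below is an illustrative sketch; the string it builds would be sent through whichever SDK you use (OpenAI or Gemini Python client):

```python
def sentiment_prompt(text: str) -> str:
    """Build a constrained sentiment prompt with one-word answers."""
    return (
        "You are a careful data analyst.\n"
        "Classify the sentiment of the review below as exactly one word: "
        "positive, negative, or neutral.\n"
        f"Review: {text!r}\n"
        "Answer:"
    )

prompt = sentiment_prompt("Shipping was slow but the product is great.")
```

Forcing a one-word answer from a fixed label set is what lets you run this over thousands of reviews and aggregate the results like any other column.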
Session 13: Model Deployment (From Local to API)
- Wrap your ML model as a simple API with FastAPI so others can use it.
- Versioning and basic monitoring (e.g. logging inputs/outputs).
Session 14: Final Project Review & Portfolio Building
- Milestone projects: 2–3 end-to-end examples (e.g. fraud detection from data → EDA → model → evaluation → simple API or report) so you can show “I can ship it.”
- Tips to showcase your work on GitHub (repos, README, Git history) and LinkedIn so recruiters notice.
- How to talk about domain and “why” in interviews.
Interested in this course? I offer mentoring and structured learning—get in touch to discuss your goals.