Learning Plan: Data Science Mathematics

Author

Oscar Cardec

Published

February 10, 2025

Structured Learning Plan for Data Science Mathematics

This 12-week learning plan breaks down each topic into digestible weekly goals with recommended exercises and applications in data science.


Week 1: Fundamentals of Linear Algebra

📌 Topics
- Introduction to Vectors and Matrices
- Matrix Operations: Addition, Multiplication, Transpose
- Identity, Inverse, and Special Matrices

📚 Resources
- Gilbert Strang's Linear Algebra and Its Applications
- Khan Academy: Linear Algebra Fundamentals

📝 Exercises
- Solve basic matrix operations and vector manipulations
- Compute determinants and inverses manually and using NumPy

🔍 Applications in Data Science
- Representing datasets as matrices
- Feature scaling and transformations
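
The Week 1 exercises and applications can be sketched with NumPy as follows. The matrix `A` and the toy dataset `X` are hypothetical values, chosen only so the results are easy to check by hand; this is one possible illustration, not a prescribed solution.

```python
import numpy as np

# Hypothetical 2x2 matrix, small enough to verify by hand.
A = np.array([[4.0, 7.0],
              [2.0, 6.0]])

det_A = np.linalg.det(A)   # by hand: 4*6 - 7*2 = 10
inv_A = np.linalg.inv(A)   # by hand: (1/10) * [[6, -7], [-2, 4]]

# A matrix times its inverse should give the identity (up to rounding).
identity_check = A @ inv_A

# Toy dataset as a matrix: rows are samples, columns are features.
X = np.array([[1.0, 200.0],
              [2.0, 300.0],
              [3.0, 400.0]])

# Feature scaling: standardize each column to zero mean and unit variance.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)
```

Comparing `det_A` and `inv_A` against the hand-computed values in the comments is a quick way to confirm the manual work.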


Week 2: Advanced Linear Algebra for Data Science

📌 Topics
- Eigenvalues and Eigenvectors
- Singular Value Decomposition (SVD)
- Principal Component Analysis (PCA)

📚 Resources
- Deep Learning by Ian Goodfellow (Chapter on Linear Algebra)
- 3Blue1Brown's Essence of Linear Algebra series

📝 Exercises
- Compute eigenvalues and eigenvectors using Python
- Perform PCA on a dataset using sklearn.decomposition.PCA

🔍 Applications in Data Science
- Dimensionality reduction in high-dimensional datasets
- Image compression using SVD
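
To see how the Week 2 topics connect, the sketch below computes eigenvalues of a covariance matrix and then performs PCA through an SVD, using NumPy only. It is a from-scratch stand-in for `sklearn.decomposition.PCA`, not that class itself, and the synthetic data and variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumed): 200 samples, 3 features,
# with the second feature strongly correlated with the first.
X = rng.normal(size=(200, 3))
X[:, 1] = 2.0 * X[:, 0] + 0.1 * X[:, 1]

# Center the data and form the sample covariance matrix.
Xc = X - X.mean(axis=0)
cov = Xc.T @ Xc / (len(Xc) - 1)

# eigh is meant for symmetric matrices; eigenvalues come back in ascending order.
eigvals, eigvecs = np.linalg.eigh(cov)

# SVD of the centered data: rows of Vt are the principal directions,
# and s**2 / (n - 1) equals the covariance eigenvalues.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained_variance = s**2 / (len(Xc) - 1)

# Dimensionality reduction: project onto the first two principal components.
X_reduced = Xc @ Vt[:2].T
```

The match between `explained_variance` and `eigvals` (after sorting) is the link between eigendecomposition, SVD, and PCA that this week builds toward.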


Week 3: Differential Calculus

📌 Topics
- Limits and Continuity
- Derivatives and Partial Derivatives
- Chain Rule and Gradient Calculation

📚 Resources
- Calculus by James Stewart
- MIT OpenCourseWare: Single Variable Calculus
- 3Blue1Brown's Essence of Calculus series

📝 Exercises
- Manually compute derivatives of polynomial and exponential functions
- Implement gradient computation in Python

🔍 Applications in Data Science
- Compute loss function gradients in machine learning
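
One way to approach the Week 3 exercises is to differentiate by hand and then check the result numerically. The sketch below assumes a central-difference approximation and a small mean-squared-error loss; the function names, the data, and the step size `h` are all illustrative choices.

```python
import numpy as np

def f(x):
    # f(x) = x^3 + exp(2x), a polynomial plus an exponential term.
    return x**3 + np.exp(2.0 * x)

def f_prime(x):
    # Derived by hand with the power rule and the chain rule.
    return 3.0 * x**2 + 2.0 * np.exp(2.0 * x)

def numerical_gradient(func, w, h=1e-6):
    """Central-difference approximation of the gradient of func at w."""
    w = np.asarray(w, dtype=float)
    grad = np.zeros_like(w)
    for i in range(w.size):
        step = np.zeros_like(w)
        step[i] = h
        grad[i] = (func(w + step) - func(w - step)) / (2.0 * h)
    return grad

# Hypothetical data for a linear model with a mean-squared-error loss.
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
y = np.array([1.0, 2.0, 3.0])

def mse_loss(w):
    return np.mean((X @ w - y) ** 2)

def mse_gradient(w):
    # Analytic gradient: (2/n) * X^T (Xw - y).
    return 2.0 * X.T @ (X @ w - y) / len(y)

w0 = np.array([0.5, -0.5])
analytic = mse_gradient(w0)
numeric = numerical_gradient(mse_loss, w0)
```

If `analytic` and `numeric` disagree beyond a small tolerance, the hand derivation is the first place to look.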


Week 4: Integral & Vector Calculus

📌 Topics
- Definite & Indefinite Integrals
- Gradients, Jacobian Matrices, and Hessian Matrices
- Applications in Probability

📚 Resources
- Mathematics for Machine Learning by Deisenroth et al.

📝 Exercises
- Compute multiple integrals manually and with Python (sympy.integrate())

🔍 Applications in Data Science
- Computing expectations in probability distributions
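
As a possible starting point for Week 4, the sketch below uses `sympy.integrate()` for a double integral and for the expectation of an exponential distribution. The integrand, the limits, and the choice of distribution are assumptions picked because they are easy to verify by hand.

```python
import sympy as sp

x, y = sp.symbols("x y")
lam = sp.symbols("lambda", positive=True)  # rate parameter, assumed positive

# Double integral of x*y over the rectangle 0 <= x <= 1, 0 <= y <= 2.
# By hand: (1/2) * (2**2 / 2) = 1.
double_integral = sp.integrate(x * y, (x, 0, 1), (y, 0, 2))

# Density of an exponential distribution with rate lambda.
pdf = lam * sp.exp(-lam * x)

# A valid density integrates to 1 over its support.
total_probability = sp.integrate(pdf, (x, 0, sp.oo))

# Expectation E[X] as the integral of x * pdf(x); by hand this is 1/lambda.
expectation = sp.integrate(x * pdf, (x, 0, sp.oo))
```

Working the expectation by hand with integration by parts and comparing it to `expectation` covers both halves of the exercise.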


Week 5: Probability Theory

📌 Topics
- Basics of Probability
- Bayes' Theorem & Conditional Probability
- Discrete & Continuous Distributions

📚 Resources
- Probability and Statistics for Engineering and the Sciences by Jay Devore
- Khan Academy: Probability & Statistics

📝 Exercises
- Solve probability problems manually
- Simulate probability distributions using NumPy

🔍 Applications in Data Science
- Naïve Bayes classifier for text classification
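
A small worked example may help tie Bayes' theorem to simulation. The sketch below computes a posterior probability by hand and then checks it with a NumPy simulation; the diagnostic-test scenario and all three probabilities are hypothetical numbers chosen for illustration.

```python
import numpy as np

# Hypothetical diagnostic test.
p_disease = 0.01             # prior P(D)
p_pos_given_disease = 0.95   # sensitivity P(+ | D)
p_pos_given_healthy = 0.05   # false-positive rate P(+ | not D)

# Law of total probability, then Bayes' theorem: P(D | +) = P(+ | D) P(D) / P(+).
p_pos = p_pos_given_disease * p_disease + p_pos_given_healthy * (1 - p_disease)
posterior = p_pos_given_disease * p_disease / p_pos

# Simulation check: draw a large population and test everyone.
rng = np.random.default_rng(42)
n = 1_000_000
has_disease = rng.random(n) < p_disease
tests_positive = np.where(
    has_disease,
    rng.random(n) < p_pos_given_disease,
    rng.random(n) < p_pos_given_healthy,
)

# Fraction of positive tests that belong to people who have the disease.
simulated_posterior = has_disease[tests_positive].mean()
```

Even with a 95% sensitive test, the posterior stays low because the prior is small, which is the same reasoning a Naïve Bayes classifier applies to each class.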


Week 6: Statistics & Inference

📌 Topics
- Descriptive Statistics
- Hypothesis Testing (T-tests, Chi-Square)
- Maximum Likelihood Estimation

📚 Resources
- The Elements of Statistical Learning by Hastie, Tibshirani, Friedman
- Think Stats by Allen B. Downey

📝 Exercises
- Perform hypothesis testing using scipy.stats
- Compute confidence intervals on sample datasets

🔍 Applications in Data Science
- Feature selection and A/B testing
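
The Week 6 exercises might look like the following sketch, which runs a Welch t-test with `scipy.stats` on simulated A/B groups and builds a 95% confidence interval for one group's mean. The group sizes, means, and standard deviations are made-up values, not results from a real experiment.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Simulated A/B test (assumed): group B has a slightly higher true mean.
group_a = rng.normal(loc=10.0, scale=2.0, size=200)
group_b = rng.normal(loc=11.0, scale=2.0, size=200)

# Welch's t-test does not assume equal variances in the two groups.
result = stats.ttest_ind(group_a, group_b, equal_var=False)
t_stat, p_value = result.statistic, result.pvalue

# 95% confidence interval for the mean of group A, based on the t distribution.
mean_a = group_a.mean()
ci_low, ci_high = stats.t.interval(
    0.95, len(group_a) - 1, loc=mean_a, scale=stats.sem(group_a)
)
```

A small `p_value` here suggests the difference in means is unlikely under the null hypothesis of equal means.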


Week 7: Optimization Techniques

📌 Topics
- Gradient Descent & Variants (SGD, Adam, RMSprop)
- Lagrange Multipliers
- Convex vs. Non-Convex Optimization

📚 Resources
- Convex Optimization by Boyd & Vandenberghe

📝 Exercises
- Implement gradient descent from scratch in Python
- Optimize logistic regression parameters using gradient descent

🔍 Applications in Data Science
- Training machine learning models efficiently
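
A minimal sketch of the Week 7 exercises follows: plain batch gradient descent fitting a logistic regression on synthetic data. The learning rate, step count, true weights, and data are assumptions for illustration; variants such as SGD, Adam, or RMSprop are not shown.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(w, X, y):
    # Average negative log-likelihood; eps guards against log(0).
    p = sigmoid(X @ w)
    eps = 1e-12
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

def gradient_descent(X, y, lr=0.1, n_steps=500):
    """Batch gradient descent on the logistic-regression log loss."""
    w = np.zeros(X.shape[1])
    for _ in range(n_steps):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)  # gradient of the log loss
        w -= lr * grad
    return w

# Synthetic binary classification data (assumed).
rng = np.random.default_rng(0)
n = 500
features = rng.normal(size=(n, 2))
X = np.column_stack([np.ones(n), features])  # first column is the bias term
true_w = np.array([-0.5, 2.0, -1.0])
y = (rng.random(n) < sigmoid(X @ true_w)).astype(float)

w_fit = gradient_descent(X, y)
initial_loss = log_loss(np.zeros(3), X, y)
final_loss = log_loss(w_fit, X, y)
```

Because the log loss is convex, a sufficiently small fixed learning rate decreases the loss at every step, which makes this a convenient first test case before moving to non-convex problems.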


Week 8: Numerical Methods

📌 Topics
- Newton's Method
- Matrix Factorization (LU, QR)
- Iterative Methods (Jacobi, Gauss-Seidel)

📚 Resources
- Numerical Methods for Scientists and Engineers by R.W. Hamming

📝 Exercises
- Solve non-linear equations using Newton's method in Python

🔍 Applications in Data Science
- Efficient computations in large-scale datasets
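
For the Week 8 exercise, a bare-bones Newton iteration could look like the sketch below. The function name `newton`, the tolerance, the iteration cap, and the test equation are illustrative assumptions; the sketch does not guard against a zero derivative.

```python
def newton(f, f_prime, x0, tol=1e-10, max_iter=50):
    """Find a root of f with the update x <- x - f(x) / f'(x)."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / f_prime(x)
        x -= step
        if abs(step) < tol:  # stop once the update is negligible
            return x
    raise RuntimeError("Newton's method did not converge")

# Solve x^3 - 2x - 5 = 0 starting from x0 = 2; the real root is close to 2.0946.
root = newton(lambda x: x**3 - 2 * x - 5, lambda x: 3 * x**2 - 2, x0=2.0)
```

Trying different starting points `x0` is a useful follow-up, since convergence depends on starting reasonably close to a root.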


Week 9: Information Theory & Entropy

📌 Topics
- Entropy and Mutual Information
- KL Divergence & Cross-Entropy
- Shannon's Theorem

📚 Resources
- Information Theory, Inference, and Learning Algorithms by David MacKay

📝 Exercises
- Compute entropy of a dataset using Python

🔍 Applications in Data Science
- Feature selection in decision trees
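
The entropy exercise and its decision-tree application can be sketched together: compute the Shannon entropy of a label column, then the information gain from splitting on a feature. The helper names and the tiny dataset below are hypothetical.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (in bits) of a 1-D array of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(labels, feature):
    """Entropy reduction from splitting labels by the values of feature."""
    weighted = 0.0
    for value in np.unique(feature):
        mask = feature == value
        weighted += mask.mean() * entropy(labels[mask])
    return entropy(labels) - weighted

# Toy data (assumed): one feature separates the classes perfectly, one not at all.
labels = np.array([1, 1, 1, 1, 0, 0, 0, 0])
informative = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])
uninformative = np.array(["a", "b", "a", "b", "a", "b", "a", "b"])

gain_informative = information_gain(labels, informative)      # 1 bit
gain_uninformative = information_gain(labels, uninformative)  # 0 bits
```

A decision tree that selects splits by information gain would choose the `informative` feature here.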


Week 10: Graph Theory

📌 Topics
- Graph Representations & Adjacency Matrices
- Graph Traversal (DFS, BFS)
- PageRank Algorithm

📚 Resources
- Networks, Crowds, and Markets by Easley & Kleinberg

📝 Exercises
- Implement DFS and BFS in Python
- Compute PageRank on a sample graph

🔍 Applications in Data Science
- Knowledge graph creation and analysis
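
A compact sketch of the Week 10 exercises is shown below: BFS, DFS, and PageRank by power iteration on a four-node directed graph. The graph, the damping factor of 0.85, and the iteration count are assumptions; the PageRank function also assumes every node has at least one outgoing edge, so dangling nodes are not handled.

```python
from collections import deque

import numpy as np

# Hypothetical directed graph as an adjacency list.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["A"]}

def bfs(graph, start):
    """Breadth-first traversal; returns nodes in visiting order."""
    visited, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return order

def dfs(graph, start, visited=None):
    """Recursive depth-first traversal; returns nodes in visiting order."""
    if visited is None:
        visited = []
    visited.append(start)
    for neighbor in graph[start]:
        if neighbor not in visited:
            dfs(graph, neighbor, visited)
    return visited

def pagerank(graph, damping=0.85, n_iter=100):
    """Power iteration; assumes no dangling nodes (every node has out-links)."""
    nodes = sorted(graph)
    index = {node: i for i, node in enumerate(nodes)}
    n = len(nodes)
    # Column-stochastic matrix: M[j, i] is the probability of moving from i to j.
    M = np.zeros((n, n))
    for node, neighbors in graph.items():
        for neighbor in neighbors:
            M[index[neighbor], index[node]] = 1.0 / len(neighbors)
    rank = np.full(n, 1.0 / n)
    for _ in range(n_iter):
        rank = (1.0 - damping) / n + damping * (M @ rank)
    return dict(zip(nodes, rank))

bfs_order = bfs(graph, "A")   # ['A', 'B', 'C', 'D']
dfs_order = dfs(graph, "A")   # ['A', 'B', 'D', 'C']
ranks = pagerank(graph)
```

The matrix `M` is the column-normalized adjacency matrix, which links this week back to the graph-representation topic.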


Week 11: Time Series Analysis

📌 Topics
- Stationarity & Differencing
- Autoregressive and Moving-Average Models (AR, MA, ARIMA)
- Fourier Transforms

📚 Resources
- Time Series Analysis and Its Applications by Shumway & Stoffer

📝 Exercises
- Implement ARIMA models using statsmodels

🔍 Applications in Data Science
- Financial and economic forecasting
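
Before reaching for `statsmodels`, it can help to see the two core ideas of this week in plain NumPy. The sketch below is a simplified stand-in, not an ARIMA implementation: it fits an AR(1) coefficient by least squares and shows how differencing turns a random walk back into its stationary steps. The series length, the true coefficient, and the seed are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
phi_true = 0.7  # assumed AR(1) coefficient

# Simulate an AR(1) process: x[t] = phi * x[t-1] + noise[t].
noise = rng.normal(size=n)
ar1 = np.zeros(n)
for t in range(1, n):
    ar1[t] = phi_true * ar1[t - 1] + noise[t]

# Least-squares estimate of phi from regressing x[t] on x[t-1].
phi_hat = np.dot(ar1[1:], ar1[:-1]) / np.dot(ar1[:-1], ar1[:-1])

# One-step-ahead forecast from the fitted model.
forecast = phi_hat * ar1[-1]

# A random walk is non-stationary; first differencing recovers its steps.
steps = rng.normal(size=n)
random_walk = np.cumsum(steps)
differenced = np.diff(random_walk)
```

The differencing step is the "I" in ARIMA: the model is fitted to the differenced series rather than the raw one.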


Week 12: Final Project & Consolidation

🎯 Objective: Apply all learned concepts in a capstone project

📝 Project Ideas
- Build a predictive model using PCA + Regression
- Implement an ML model and optimize it using gradient descent
- Perform statistical hypothesis testing on a real-world dataset


© Copyright 2025 Cardec Solutions

Created in

Mar 2025