Must-know stats for ML practitioners:
- What a p-value is, and its limitations for decision-making.
- Linear regression and its assumptions.
- When to use different statistical distributions.
- How an effect size impacts results/decisions.
- Mean and variance of the Normal, Uniform, and Poisson distributions.
- Sampling techniques and common designs (e.g. A/B).
- Bayes’ theorem (applied calculations).
- Confidence intervals: how to compute and interpret them.
- Logistic regression and ROC curves.
- Resampling (cross-validation + bootstrapping).
- Dimensionality reduction.
- Tree-based models (particularly how to prune).
- Ridge and Lasso for regression.
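
To make the "applied calculations" part of Bayes' theorem concrete, here is a minimal sketch of the classic diagnostic-test calculation. The function name `posterior` and the numbers (1% prevalence, 95% sensitivity, 90% specificity) are made up for illustration and do not come from any real test.

```python
def posterior(prior: float, sensitivity: float, specificity: float) -> float:
    """Return P(condition | positive test) using Bayes' theorem.

    prior       = P(condition)
    sensitivity = P(positive | condition)
    specificity = P(negative | no condition)
    """
    true_pos = sensitivity * prior                    # P(positive and condition)
    false_pos = (1.0 - specificity) * (1.0 - prior)   # P(positive and no condition)
    return true_pos / (true_pos + false_pos)          # divide by P(positive)


# Hypothetical numbers, chosen only to illustrate the calculation.
p = posterior(prior=0.01, sensitivity=0.95, specificity=0.90)
print(f"P(condition | positive) = {p:.3f}")  # prints 0.088
```

Even with a fairly accurate test, the posterior stays below 9% here because the condition is rare, which is the kind of base-rate effect worth being able to compute by hand.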
🤓 Time to brush up on my statistics…
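
As a quick warm-up for the bootstrapping and confidence-interval items, here is a rough sketch of a percentile bootstrap interval for a sample mean, using only the Python standard library. The helper name `bootstrap_ci`, its defaults, and the sample values are assumptions made up for this example. The percentile method is only one of several bootstrap interval constructions.

```python
import random
import statistics


def bootstrap_ci(data, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for `stat` at level 1 - alpha."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    # Resample with replacement, recompute the statistic each time, then sort.
    estimates = sorted(stat(rng.choices(data, k=n)) for _ in range(n_resamples))
    lo_idx = int(n_resamples * alpha / 2)   # index of the lower percentile
    hi_idx = n_resamples - lo_idx - 1       # symmetric index for the upper percentile
    return estimates[lo_idx], estimates[hi_idx]


# Made-up sample, for illustration only.
sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7, 3.9, 3.1]
low, high = bootstrap_ci(sample)
print(f"mean = {statistics.mean(sample):.2f}, 95% bootstrap CI = ({low:.2f}, {high:.2f})")
```

On interpretation: the 95% describes how often the procedure would cover the true mean across repeated samples, not the probability that this particular interval contains it.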