Notebooks and materials for DDL/CEB training.
-
Day 1: Visual Data Exploration
- Intro to matplotlib for visualization (the pyplot API)
- DataFrames for visual exploration
- Pandas plotting API
- Seaborn for visual statistical analysis
-
Day 2: Interactive Visualization with Bokeh
- Interactive analysis
- Clustering for high-dimensional data reduction
- Visual Analysis: Overview first, zoom and filter, details on demand
-
Day 3: Regression Analysis
- General Linear Models
- Collinearity
- Regularization
- Ridge, Lasso, Elastic Net
- Regression Evaluation
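
As a rough illustration of the regularization topic above, here is a minimal sketch of how a ridge penalty shrinks a coefficient. It assumes the simplest possible case (one feature, no intercept), where the ridge estimate has the closed form sum(x*y) / (sum(x*x) + alpha). The function name `ridge_slope` and the toy data are hypothetical and not taken from the course notebooks.

```python
def ridge_slope(x, y, alpha=0.0):
    """Closed-form ridge estimate for a single feature with no intercept.

    alpha=0 gives ordinary least squares; larger alpha shrinks the slope
    toward zero.
    """
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + alpha)


# Toy data (made up for illustration): y is exactly 2 * x.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]

ols = ridge_slope(x, y, alpha=0.0)     # unpenalized fit
ridge = ridge_slope(x, y, alpha=10.0)  # penalized fit, shrunk toward zero
print(ols, ridge)
```

With these toy values the unpenalized slope recovers 2.0, while the penalized slope is smaller, which is the shrinkage effect that Ridge, Lasso, and Elastic Net trade against fit quality.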
-
Day 4: Classification Models
- Binary Classification vs. Multi-Class
- Decision Trees and Random Forests
- kNN classification
- Bayesian Classifiers
- Logistic Regression
- SVMs
-
Day 5: Visual Diagnostics
- Evaluating classifiers: F1, Precision, Recall
- Confusion Matrices and visual confusion matrices
- ROC/AUC
- Residual analysis and prediction error
- t-SNE and dimensionality reduction
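
For the classifier evaluation topics above, the following is a minimal pure-Python sketch of how precision, recall, and F1 follow from the four cells of a binary confusion matrix. The function names and the toy labels are hypothetical; in practice a library such as scikit-learn would likely be used instead.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1, returning 0.0 when undefined."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels (made up for illustration).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print(confusion_counts(y_true, y_pred))
print(precision_recall_f1(y_true, y_pred))
```

Precision is the share of predicted positives that are correct, recall is the share of actual positives that are found, and F1 is their harmonic mean.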
-
Day 6: Hands-on project that ties everything together