This repository collects machine learning notebooks that apply a range of algorithms and data-driven techniques, from classical models to deep learning and unsupervised methods.
Each notebook focuses on a distinct area of ML application and combines data preprocessing, model development, and evaluation.
| # | Notebook | Focus Area |
|---|---|---|
| 1 | 01_intro_to_ml.ipynb | Foundational machine learning models — exploring supervised classification and evaluation metrics. |
| 2 | 02_model_evaluation_and_tuning.ipynb | Model optimization through preprocessing, scaling, and hyperparameter tuning with GridSearchCV. |
| 3 | 03_svm_kddcup99_network_intrusion.ipynb | SVM-based network intrusion detection using multiple kernels and feature selection via RFE. |
| 4 | 04_ensemble_learning_methods.ipynb | Ensemble learning techniques such as Random Forest, AdaBoost, and Gradient Boosting for improved prediction accuracy. |
| 5 | 05_neural_networks_basics.ipynb | Implementation of neural network architectures using TensorFlow/Keras, including MLPs and CNNs. |
| 6 | 06_unsupervised_learning_pca_kmeans.ipynb | Dimensionality reduction and clustering using PCA, K-Means, and t-SNE visualizations. |
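As a rough illustration of the workflow behind notebook 6, the sketch below scales a dataset, reduces it with PCA, and clusters the projection with K-Means. It is not taken from the notebook: the data comes from `make_blobs`, and the sample count, feature count, and number of clusters are assumptions chosen only for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: 300 points in 10 dimensions drawn around 3 centers.
X, _ = make_blobs(n_samples=300, n_features=10, centers=3, random_state=0)

# Standardize so every feature contributes on the same scale.
X_scaled = StandardScaler().fit_transform(X)

# Project onto the two directions of highest variance.
pca = PCA(n_components=2, random_state=0)
X_2d = pca.fit_transform(X_scaled)

# Cluster in the reduced space (3 clusters is an assumption of this sketch).
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_2d)

print("Explained variance ratio:", pca.explained_variance_ratio_)
print("Cluster sizes:", np.bincount(labels))
```

The two-dimensional array `X_2d` can be passed to a matplotlib scatter plot colored by `labels` to inspect the clusters visually.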
- Data Preprocessing: Handling missing values, encoding, normalization, and class balancing.
- Supervised Learning: Logistic Regression, Decision Trees, SVMs, and Ensemble Models.
- Model Evaluation: Cross-validation, precision, recall, F1-score, and confusion matrices.
- Feature Selection: Recursive Feature Elimination (RFE) and feature importance analysis.
- Neural Networks: Model design and training using Keras/TensorFlow.
- Unsupervised Learning: PCA, K-Means, and cluster visualization.
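The preprocessing, tuning, and evaluation concepts listed above can be combined roughly as follows. This is a minimal sketch under stated assumptions, not code from the notebooks: the data is synthetic (`make_classification`), Logistic Regression stands in for any supervised model, and the grid of `C` values is arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, mildly imbalanced binary problem (placeholder for a real dataset).
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=5,
    weights=[0.7, 0.3], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scaling lives inside the pipeline so it is refit on each CV training fold.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000, class_weight="balanced")),
])

# 5-fold cross-validated search over the regularization strength C.
param_grid = {"clf__C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(pipe, param_grid, cv=5, scoring="f1")
search.fit(X_train, y_train)

# Evaluate the refit best model on the held-out split.
y_pred = search.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
print("Best params:", search.best_params_)
print(cm)
print(classification_report(y_test, y_pred))
```

Keeping the scaler inside the `Pipeline` prevents statistics from the validation folds leaking into training, which is the main reason to tune the pipeline as a whole and not the classifier alone.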
- Intrusion Detection with SVMs: Compared linear, polynomial, RBF, and sigmoid kernels on the KDDCup99 dataset, achieving ~99–100% accuracy. Identified key predictive features (`count` and `dst_host_diff_srv_rate`) using recursive feature elimination.
- Model Optimization: Applied grid search and cross-validation to enhance generalization and stability across models.
- Deep Learning: Built feedforward and convolutional architectures for image recognition and pattern detection.
- Unsupervised Exploration: Used PCA and t-SNE to visualize clusters and interpret latent data structure.
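The kernel comparison and recursive feature elimination mentioned for the intrusion detection notebook might look roughly like the sketch below. It does not load KDDCup99 and does not reproduce the reported accuracy: the data is synthetic, and the sample size, fold count, and choice of two selected features are assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the intrusion data; KDDCup99 itself is not loaded here.
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=4, n_redundant=2, random_state=0
)

# Mean 5-fold accuracy for each kernel, with scaling refit on every fold.
scores = {}
for kernel in ["linear", "poly", "rbf", "sigmoid"]:
    model = make_pipeline(StandardScaler(), SVC(kernel=kernel))
    scores[kernel] = cross_val_score(model, X, y, cv=5).mean()
    print(f"{kernel:>8}: {scores[kernel]:.3f}")

# RFE needs per-feature weights (coef_), which SVC exposes only for the linear kernel.
X_scaled = StandardScaler().fit_transform(X)
rfe = RFE(SVC(kernel="linear"), n_features_to_select=2)
rfe.fit(X_scaled, y)

# Indices of the features that survived elimination.
selected = [i for i, keep in enumerate(rfe.support_) if keep]
print("Selected feature indices:", selected)
```

With a pandas DataFrame, the boolean mask `rfe.support_` can index the column names directly to recover feature names such as those reported above.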
- Python 3.8+
- pandas, numpy, matplotlib, seaborn
- scikit-learn
- keras / tensorflow
- tqdm