GitHub - farshidhesami/Machine_learning_algorithm

Training Machine Learning Algorithm

CO2 Emissions Canada 2023 Prediction

This project works on a regression problem, using various machine learning models to predict CO2 emissions (Canada, 2023) from features such as engine size, fuel consumption, and number of cylinders. The code breaks down as follows:

Importing libraries:
- Pandas: for data manipulation and analysis
- NumPy: for numerical operations
- Matplotlib: for data visualization
- Seaborn: for statistical data visualization
- Various modules from scikit-learn: for machine learning algorithms and evaluation metrics
Loading the dataset:
- The dataset is loaded from a CSV file called 'FuelConsumption2023.csv' using pd.read_csv().
- The columns of interest are selected and stored in the 'df' DataFrame.
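
The loading step above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the column names (`ENGINESIZE`, `CYLINDERS`, `FUELCONSUMPTION_COMB`, `CO2EMISSIONS`) and the helper `load_dataset` are assumptions, and a small inline CSV stands in for 'FuelConsumption2023.csv' so the snippet runs on its own.

```python
import io

import pandas as pd

# Tiny inline stand-in for 'FuelConsumption2023.csv'.
# The column names are hypothetical and may differ from the real file.
SAMPLE_CSV = """MAKE,ENGINESIZE,CYLINDERS,FUELCONSUMPTION_COMB,CO2EMISSIONS
Acura,2.0,4,8.5,196
Acura,3.5,6,11.1,255
BMW,3.0,6,10.2,239
Ford,5.0,8,13.8,324
"""

# Assumed columns of interest: three features plus the target.
COLUMNS_OF_INTEREST = [
    "ENGINESIZE",
    "CYLINDERS",
    "FUELCONSUMPTION_COMB",
    "CO2EMISSIONS",
]


def load_dataset(source):
    """Read a CSV (path or file-like object) and keep the columns of interest."""
    raw = pd.read_csv(source)
    return raw[COLUMNS_OF_INTEREST].copy()


# With the real file this would be load_dataset("FuelConsumption2023.csv").
df = load_dataset(io.StringIO(SAMPLE_CSV))
print(df)
```
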
Data exploration:
- Dataset dimensions are printed using df.shape.
- The first few rows of the dataset are displayed using df.head().
- Data types of each column are printed using df.dtypes.
- The number of missing values in each column is displayed using df.isnull().sum().
- Summary statistics of numerical columns are printed using df.describe().
- Unique values in categorical columns (if any) are displayed using a loop.
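
The exploration calls listed above might look like this in practice. The small DataFrame and its column names are made up for illustration; only the pandas calls themselves come from the description.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with one categorical column and one missing value.
df = pd.DataFrame(
    {
        "MAKE": ["Acura", "BMW", "Ford", "Ford"],
        "ENGINESIZE": [2.0, 3.0, 5.0, np.nan],
        "CYLINDERS": [4, 6, 8, 4],
        "CO2EMISSIONS": [196, 239, 324, 210],
    }
)

print(df.shape)        # dataset dimensions as (rows, columns)
print(df.head())       # first few rows
print(df.dtypes)       # data type of each column

missing = df.isnull().sum()  # number of missing values per column
print(missing)

print(df.describe())   # summary statistics of numerical columns

# Loop over the non-numeric columns (if any) and show their unique values.
categorical_cols = df.select_dtypes(exclude="number").columns
for col in categorical_cols:
    print(col, df[col].unique())
```
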
Data preprocessing:
- Missing values are dropped from the dataset using df.dropna().
- One-hot encoding is performed on categorical columns (if any) using pd.get_dummies().
- Feature scaling or normalization is performed on the encoded dataset using MinMaxScaler().
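
A minimal sketch of the three preprocessing steps, under assumed column names (`FUELTYPE`, `ENGINESIZE`, `CO2EMISSIONS`) and made-up values. It follows the order described above: drop missing rows, one-hot encode, then scale to the [0, 1] range.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical sample: one categorical column, one row with a missing value.
df = pd.DataFrame(
    {
        "FUELTYPE": ["X", "Z", "X", "D"],
        "ENGINESIZE": [2.0, 3.0, np.nan, 5.0],
        "CO2EMISSIONS": [196, 239, 210, 324],
    }
)

# 1. Drop rows that contain missing values.
clean = df.dropna()

# 2. One-hot encode the categorical column; dtype=float keeps 0.0/1.0 values.
encoded = pd.get_dummies(clean, columns=["FUELTYPE"], dtype=float)

# 3. Rescale every column to the [0, 1] range.
scaler = MinMaxScaler()
scaled = pd.DataFrame(
    scaler.fit_transform(encoded),
    columns=encoded.columns,
    index=encoded.index,
)
print(scaled)
```

The sketch fits the scaler on the whole table because that is what the description states. Fitting it on the training split only, and then applying it to the test split, is a common way to keep test-set information out of training.
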
Data visualization:
- The distribution of numeric columns is visualized using histograms and KDE plots with the help of sns.histplot().
- Relationships between variables are visualized using scatter plots with regression lines using sns.regplot().
Model training and evaluation:
- Selected features and target variable are assigned to 'X' and 'y', respectively.
- The data is split into training and testing sets using train_test_split().
- Several regression models are trained on the training set and evaluated on the testing set:
  - Linear Regression (LR)
  - Support Vector Regression (SVR)
  - Multilayer Perceptron (MLP)
  - Decision Tree (Regression)
  - Random Forest
  - Gradient Boosting (GB)
  - K-Nearest Neighbors (KNN)
- Evaluation metrics (MSE and R-squared) are calculated using mean_squared_error() and r2_score().
- Scatter plots of actual vs predicted values are plotted using plt.scatter().
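
The training and evaluation loop described above can be sketched like this. The features are synthetic (three columns standing in for scaled engine size, fuel consumption, and cylinders), and the model settings such as `max_iter=1000`, `n_estimators=50`, and the 80/20 split are illustrative assumptions rather than the repository's values.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the scaled features and target.
rng = np.random.default_rng(42)
n = 300
X = rng.uniform(0, 1, size=(n, 3))  # engine size, fuel consumption, cylinders
y = 0.3 * X[:, 0] + 0.6 * X[:, 1] + 0.1 * X[:, 2] + rng.normal(0, 0.02, n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# The seven regressors named in the list above.
models = {
    "Linear Regression": LinearRegression(),
    "SVR": SVR(),
    "MLP": MLPRegressor(max_iter=1000, random_state=42),
    "Decision Tree": DecisionTreeRegressor(random_state=42),
    "Random Forest": RandomForestRegressor(n_estimators=50, random_state=42),
    "Gradient Boosting": GradientBoostingRegressor(random_state=42),
    "KNN": KNeighborsRegressor(),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)          # train on the training set
    y_pred = model.predict(X_test)       # predict on the held-out test set
    results[name] = {
        "mse": mean_squared_error(y_test, y_pred),
        "r2": r2_score(y_test, y_pred),
    }
    print(f"{name}: MSE={results[name]['mse']:.4f}, R2={results[name]['r2']:.3f}")
```

An actual-vs-predicted plot for any one model is then a call such as `plt.scatter(y_test, y_pred)` on that model's predictions.
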
Model tuning:
- Hyperparameter grids for each model are defined.
- Grid search is performed for each model using GridSearchCV() to find the best hyperparameters.
- Tuned models are evaluated and scatter plots of actual vs predicted values are plotted.
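
A minimal sketch of the tuning step, shown for two of the models to keep it short. The hyperparameter grids, the 3-fold cross-validation, and the `r2` scoring are assumptions chosen for illustration; the source does not list the grids actually used.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the scaled features and target.
rng = np.random.default_rng(42)
n = 300
X = rng.uniform(0, 1, size=(n, 3))
y = 0.3 * X[:, 0] + 0.6 * X[:, 1] + 0.1 * X[:, 2] + rng.normal(0, 0.02, n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Illustrative grids: each entry pairs an estimator with its parameter grid.
search_space = {
    "Decision Tree": (
        DecisionTreeRegressor(random_state=42),
        {"max_depth": [3, 5, None], "min_samples_leaf": [1, 5]},
    ),
    "KNN": (
        KNeighborsRegressor(),
        {"n_neighbors": [3, 5, 9], "weights": ["uniform", "distance"]},
    ),
}

tuned = {}
for name, (estimator, grid) in search_space.items():
    # Exhaustive search over the grid with cross-validation on the training set.
    search = GridSearchCV(estimator, grid, cv=3, scoring="r2")
    search.fit(X_train, y_train)

    # Evaluate the refitted best model on the held-out test set.
    y_pred = search.best_estimator_.predict(X_test)
    tuned[name] = {
        "best_params": search.best_params_,
        "mse": mean_squared_error(y_test, y_pred),
        "r2": r2_score(y_test, y_pred),
    }
    print(name, tuned[name])
```
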

Folders and files (21 commits):
- Scripts
- templates
- training_course
- .gitignore
- Co2_emmision.ipynb
- FuelConsumption.ipynb
- FuelConsumption2023.csv
- LICENSE
- Machine_learning_algorithem.ipynb
- Machine_learning_algorithem_01.ipynb
- Machine_learning_algorithem_02.ipynb
- Prediction_california_house-price.csv
- README.md
- app.py
- predict_pipeline.py
- pyvenv.cfg
