DARREN-2000/made-template

 
 


Methods of Advanced Data Engineering Template Project

This template project provides some structure for your open data project in the MADE module at FAU. This repository contains (a) a data science project that is developed by the student over the course of the semester, and (b) the exercises that are submitted over the course of the semester. Before you begin, make sure you have Python and Jayvee installed. We will work with Jupyter notebooks. The easiest way to do so is to set up VSCode with the Jupyter extension.

Project Work

Your data engineering project will run alongside lectures during the semester. We will ask you to regularly submit project work as milestones so you can reasonably pace your work. All project work submissions must be placed in the project folder.

Project Title:

Unveiling the Relationship between GDP and Inflation Rate and Predicting GDP using Regression in Germany

Author: Morris Darren Babu

Date: January 2024

Description:

This project investigates the relationship between gross domestic product (GDP) and inflation rate in Germany and utilizes regression analysis to forecast future GDP values based on inflation rate data.

Data Sources:

GDP data:

link to World Bank GDP data for Germany: https://data.worldbank.org/indicator/NY.GDP.MKTP.CD?locations=DE

Inflation rate data:

link to World Bank inflation data for Germany: https://data.worldbank.org/indicator/FP.CPI.TOTL.ZG?locations=DE
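Both indicators can be downloaded from the World Bank pages as CSV files in a wide layout, with one row per country and one column per year. A minimal sketch of reshaping such a table into one row per year and joining the two series is shown here. The helper name `wide_to_long`, the column names `gdp` and `inflation`, and the tiny inline tables are assumptions for illustration; the numbers are placeholders, not real World Bank values.

```python
import pandas as pd


def wide_to_long(wide: pd.DataFrame, value_name: str, country: str = "DEU") -> pd.DataFrame:
    """Reshape a wide World Bank-style table into (year, value) rows for one country."""
    rows = wide[wide["Country Code"] == country]
    # Year columns are the ones whose header consists only of digits.
    year_cols = [c for c in rows.columns if str(c).isdigit()]
    long = rows.melt(
        id_vars=["Country Code"],
        value_vars=year_cols,
        var_name="year",
        value_name=value_name,
    )
    long["year"] = long["year"].astype(int)
    # Drop years without a reported value.
    long = long.dropna(subset=[value_name])
    return long[["year", value_name]].reset_index(drop=True)


# Placeholder numbers for illustration only, NOT real World Bank values.
gdp_wide = pd.DataFrame(
    {"Country Code": ["DEU"], "2000": [1.0], "2001": [2.0], "2002": [float("nan")]}
)
cpi_wide = pd.DataFrame(
    {"Country Code": ["DEU"], "2000": [1.5], "2001": [2.5], "2002": [3.5]}
)

# Inner join keeps only the years present in both series.
merged = wide_to_long(gdp_wide, "gdp").merge(wide_to_long(cpi_wide, "inflation"), on="year")
print(merged)
```

With the real downloads, the wide tables would come from `pd.read_csv` on the extracted files; the downloaded CSVs usually carry a few metadata lines before the header, so a `skiprows` argument is likely needed.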

Data Preprocessing:

Exploratory Data Analysis (EDA):

  • Scatter plots: Visualize the relationship between GDP and inflation rate over time.
  • Correlation matrix: Measure the correlation between GDP and inflation rate.
  • Time series plots: Examine trends in GDP and inflation rate over time.
  • Box plots: Identify outliers and assess the distribution of GDP and inflation rate data.
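The numeric side of these EDA steps can be sketched with pandas. The example below computes a correlation matrix and flags outliers with the 1.5 × IQR rule that box plots conventionally use for their whiskers. The data frame is synthetic and generated only for illustration; the column names `gdp` and `inflation` and the helper `iqr_outliers` are assumptions, not part of the project code.

```python
import numpy as np
import pandas as pd

# Synthetic series for illustration only (not World Bank data).
rng = np.random.default_rng(0)
years = np.arange(2000, 2020)
df = pd.DataFrame(
    {
        "year": years,
        "inflation": rng.normal(1.5, 0.5, size=years.size),
        "gdp": np.linspace(2.0e12, 4.0e12, years.size),
    }
)

# Pearson correlation between the two indicators.
corr = df[["gdp", "inflation"]].corr()
print(corr)


def iqr_outliers(series: pd.Series) -> pd.Series:
    """Return the values lying more than 1.5 * IQR outside the quartiles."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    mask = (series < q1 - 1.5 * iqr) | (series > q3 + 1.5 * iqr)
    return series[mask]


print(iqr_outliers(df["inflation"]))
```

The scatter, time series, and box plots themselves could then be drawn from the same data frame with a plotting library of choice.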

Regression Model Development:

  • Linear regression model: Employ a linear regression model to assess the linear relationship between GDP and inflation rate.
  • Polynomial regression model: Investigate a polynomial regression model to capture potential nonlinear relationships.
  • Logarithmic regression model: Consider a logarithmic regression model to account for exponential growth or decay.
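As a rough sketch of the three model families, each can be fitted by least squares with `numpy.polyfit`. The function names are hypothetical, and the logarithmic variant shown here is one possible reading (GDP regressed on the logarithm of the predictor), which only works when every predictor value is strictly positive; the project notebooks may define it differently.

```python
import numpy as np


def fit_linear(x, y):
    """Least-squares fit of y = a*x + b; returns [a, b]."""
    return np.polyfit(x, y, deg=1)


def fit_polynomial(x, y, degree=2):
    """Least-squares polynomial fit; coefficients are ordered highest power first."""
    return np.polyfit(x, y, deg=degree)


def fit_logarithmic(x, y):
    """Least-squares fit of y = a*ln(x) + b; assumes every x is strictly positive."""
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0):
        raise ValueError("logarithmic fit requires strictly positive x")
    return np.polyfit(np.log(x), y, deg=1)


# Synthetic, noise-free inputs for illustration only (not World Bank data).
x = np.linspace(1.0, 5.0, 20)
lin_coef = fit_linear(x, 1.0 + 2.0 * x)
poly_coef = fit_polynomial(x, x**2)
log_coef = fit_logarithmic(x, 3.0 + 2.0 * np.log(x))
print(lin_coef, poly_coef, log_coef)
```

Predictions for new inputs follow from `np.polyval(coef, x_new)` for the first two models and `np.polyval(coef, np.log(x_new))` for the logarithmic one.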

Model Evaluation:

  • Mean Squared Error (MSE): Calculated the mean squared error to assess the model's overall performance.
  • Root Mean Squared Error (RMSE): Measured the root mean squared error, which expresses the typical prediction error in the same units as GDP.
  • R-squared: Measured the proportion of variance in GDP explained by the inflation rate.
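For reference, the three metrics can be computed directly from the residuals. This is a minimal sketch with a hypothetical helper name, `regression_metrics`, and made-up numbers; it is not taken from the project notebooks.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE and R-squared for a set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    mse = float(np.mean(residuals**2))
    rmse = float(np.sqrt(mse))
    # R-squared compares the residual sum of squares with the
    # total sum of squares around the mean of the observed values.
    ss_res = float(np.sum(residuals**2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    return {"mse": mse, "rmse": rmse, "r2": r2}


# Made-up values for illustration only.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0])
print(metrics)
```

Because MSE is in squared units of the target, RMSE is usually the easier number to interpret next to the GDP values themselves.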

GDP Prediction:

  • Model Application: Applied the trained regression model to predict future GDP values based on inflation rate data.
  • Prediction Visualization: Presented predicted GDP values for a specified timeframe.

Model Validation:

Evaluated the model's performance on unseen data to assess its generalizability.
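One common way to obtain unseen data for yearly series is to hold out the most recent observations rather than a random sample, so the model is never tested on years earlier than those it was trained on. The sketch below assumes that approach; the helper `chronological_split`, the 20% test fraction, and the synthetic series are illustrative assumptions and may differ from how the project actually split its data.

```python
import numpy as np


def chronological_split(x, y, test_fraction=0.2):
    """Hold out the most recent observations; inputs must be ordered by year."""
    n_test = max(1, int(round(len(x) * test_fraction)))
    split = len(x) - n_test
    return x[:split], x[split:], y[:split], y[split:]


# Synthetic, noise-free series for illustration only (not World Bank data).
inflation = np.linspace(0.5, 5.0, 10)
gdp = 3.0 + 2.0 * inflation

x_train, x_test, y_train, y_test = chronological_split(inflation, gdp)

# Fit on the earlier years only, then score on the held-out years.
slope, intercept = np.polyfit(x_train, y_train, deg=1)
test_rmse = float(np.sqrt(np.mean((y_test - (slope * x_test + intercept)) ** 2)))
print(f"held-out RMSE: {test_rmse:.3g}")
```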

Implications:

  • Understanding Inflation Rate Impact: The findings suggest that the inflation rate plays a significant role in influencing GDP growth.
  • Policymaking Guidance: Regression analysis can inform economic policies aimed at stabilizing inflation and promoting economic growth.
  • Business Forecasting: Businesses can use the model to anticipate future economic conditions and plan accordingly.

Limitations and Future Directions:

The model performs well on historical data but may not be accurate in forecasting unforeseen events. Other economic factors, such as interest rates, unemployment rates, and government spending, may also influence GDP growth. Expanding the analysis to a broader range of countries could provide more generalizable insights. Incorporating advanced machine learning techniques could improve the forecasting accuracy.

Exporting a Jupyter Notebook

Jupyter Notebooks can be exported using nbconvert (pip install nbconvert). For example, to export the example notebook to HTML:

jupyter nbconvert --to html examples/final-report-example.ipynb --embed-images --output final-report.html

Exercises

During the semester you will need to complete exercises, sometimes using Python, sometimes using Jayvee. You must place your submission in the exercises folder in your repository and name them according to their number from one to five: exercise<number from 1-5>.<jv or py>.

At regular intervals, exercises will be given as homework to complete during the semester. We will divide you into two groups, one completing an exercise in Jayvee and the other in Python, switching with each exercise. Details and deadlines will be discussed in the lecture; see also the course schedule. At the end of the semester, you will therefore have the following files in your repository:

  1. ./exercises/exercise1.jv or ./exercises/exercise1.py
  2. ./exercises/exercise2.jv or ./exercises/exercise2.py
  3. ./exercises/exercise3.jv or ./exercises/exercise3.py
  4. ./exercises/exercise4.jv or ./exercises/exercise4.py
  5. ./exercises/exercise5.jv or ./exercises/exercise5.py
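To check the naming convention before pushing, a small script along these lines could list which exercise numbers still lack a correctly named file. The helper `missing_exercises` is hypothetical and not part of the template or the automated feedback.

```python
import re

# Accepts exactly exercise1 to exercise5 with a .jv or .py extension.
EXERCISE_PATTERN = re.compile(r"exercise([1-5])\.(jv|py)")


def missing_exercises(filenames):
    """Return the exercise numbers (1-5) that have no correctly named file."""
    found = set()
    for name in filenames:
        match = EXERCISE_PATTERN.fullmatch(name)
        if match:
            found.add(int(match.group(1)))
    return sorted(set(range(1, 6)) - found)


# "Exercise3.py" is rejected because the convention is lowercase.
print(missing_exercises(["exercise1.jv", "exercise2.py", "Exercise3.py"]))  # prints [3, 4, 5]
```

In a repository checkout, the list of names could come from `os.listdir("exercises")`.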

Exercise Feedback

We provide automated exercise feedback using a GitHub action (that is defined in .github/workflows/exercise-feedback.yml).

To view your exercise feedback, navigate to Actions -> Exercise Feedback in your repository.

The exercise feedback runs whenever you change files in the exercises folder and push your local changes to the repository on GitHub. To see the feedback, open the latest GitHub Action run, then open the exercise-feedback job and the Exercise Feedback step. You should see command line output like this:

Found exercises/exercise1.jv, executing model...
Found output file airports.sqlite, grading...
Grading Exercise 1
	Overall points 17 of 17
	---
	By category:
		Shape: 4 of 4
		Types: 13 of 13
