cjc77/nlp-projects

nlp-projects

Personal projects using NLP techniques.

Table of Contents

Prerequisites

Project: Pitchfork Rating Prediction


Prerequisites

This project requires Python 3.10 and R 4.3.

Dependencies (Python)

Before running the Python code in this project, install its dependencies by executing the following command from the project root:

pip install -r requirements.txt

This project also includes a custom utilities library, myutilpy, which must be installed into your environment in editable mode. To do this, run the following command from the project root:

pip install -e myutilpy
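After both install commands have run, a quick check can confirm that the environment matches the prerequisites. The following is a minimal sketch, not part of the repository: the function name check_environment is hypothetical, and treating any interpreter other than 3.10 as a problem is an assumption based on the version stated above.

```python
import importlib.util
import sys

# Version stated in the Prerequisites section; treating any other
# version as a problem is an assumption of this sketch.
REQUIRED_PYTHON = (3, 10)


def check_environment(package="myutilpy"):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    found = sys.version_info[:2]
    if found != REQUIRED_PYTHON:
        problems.append(
            "Python %d.%d found, but %d.%d is expected"
            % (found + REQUIRED_PYTHON)
        )
    # find_spec returns None when a top-level package cannot be located.
    if importlib.util.find_spec(package) is None:
        problems.append("package '%s' is not installed" % package)
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Running the script from the project root prints one warning per detected problem and nothing if the environment looks correct.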

Dependencies (R)

This project contains a few R Jupyter notebooks. To run them, your R environment must have the dependencies listed in requirements_R.txt installed. Each listed dependency can be installed manually.
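To save some typing, the manual step can be partly scripted. The sketch below is a hypothetical helper, not something shipped with the repository. It assumes requirements_R.txt lists one package per line, optionally followed by a version (the actual file format is not documented here), and it prints a single R install.packages() call for you to paste into an R session.

```python
import re
from pathlib import Path


def r_install_command(requirements_text):
    """Build an R install.packages() call from a requirements listing.

    Assumes one package per line, optionally followed by a version,
    e.g. "dplyr", "dplyr==1.1.4" or "dplyr 1.1.4". Versions are ignored.
    """
    packages = []
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        # Keep only the package name that precedes any version specifier.
        name = re.split(r"[\s=<>~]+", line, maxsplit=1)[0]
        packages.append(name)
    quoted = ", ".join('"%s"' % name for name in packages)
    return "install.packages(c(%s))" % quoted


if __name__ == "__main__":
    path = Path("requirements_R.txt")
    if path.exists():
        print(r_install_command(path.read_text()))
```

Note that install.packages() installs the current CRAN release of each package, so any pinned versions in the file are not honored by this approach.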

Pitchfork Rating Prediction

Directory: notebooks_pitchfork_ratings

This sequence of notebooks uses a Pitchfork dataset of approximately 20K album reviews (mattismegevand/pitchfork). The notebooks cover the following steps:

  1. Data preprocessing (01_initial_data_prep). Loads, cleans, and preprocesses the data.
  2. Exploratory data analysis (02_data_explore). Visualizes the processed dataset and computes summary statistics.
  3. Model fitting and prediction (03_rating_pred). Fits the model and saves its parameters, then collects and saves performance metrics and test set predictions.
  4. Results analysis (04_fit_analysis). Investigates model performance on the test data after fitting.
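As a rough illustration of the metric collection in step 3, the sketch below computes two common regression metrics from paired true and predicted ratings. It is not taken from the notebooks: the choice of MAE and RMSE, the function name regression_metrics, and the example ratings are all assumptions made for illustration.

```python
import json
import math


def regression_metrics(y_true, y_pred):
    """Return MAE and RMSE for paired true and predicted ratings."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [pred - true for true, pred in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return {"mae": mae, "rmse": rmse}


if __name__ == "__main__":
    # Made-up ratings for illustration only, not results from the project.
    true_ratings = [7.0, 8.0, 6.0, 9.0]
    predicted = [6.0, 8.0, 7.0, 7.0]
    metrics = regression_metrics(true_ratings, predicted)
    # Serializing to JSON is one simple way to save metrics for later analysis.
    print(json.dumps(metrics, indent=2))
```

Saving the metrics alongside the raw test set predictions keeps the later analysis step independent of the fitting code.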
