- Using Reddit's API, I'll collect posts from two subreddits.
- I'll then use NLP to train a classifier that predicts which subreddit a given post came from. This is a binary classification problem.
- For this experiment, I pulled posts from r/space and r/food.
- The code can be rerun with any two subreddits by changing the source URL.
- Random Forest
- Gradient Boosting
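
A minimal sketch of how the steps above might fit together, assuming post titles are the text being classified and TF-IDF is the feature representation (the list does not say which features were used). The `listing_url`, `fetch_titles`, and `compare_models` helpers, the `/new.json` listing URL, the User-Agent string, and the toy titles are all illustrative assumptions, not the project's actual code. `fetch_titles` needs network access and is defined but not called here.

```python
import json
import urllib.request

from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline


def listing_url(subreddit, limit=100):
    """Build a public JSON listing URL for a subreddit (assumed URL pattern)."""
    return f"https://www.reddit.com/r/{subreddit}/new.json?limit={limit}"


def fetch_titles(subreddit, limit=100):
    """Fetch post titles from one listing page (hypothetical helper, needs network)."""
    request = urllib.request.Request(
        listing_url(subreddit, limit),
        # Reddit tends to throttle requests that use a default User-Agent.
        headers={"User-Agent": "subreddit-classifier-sketch/0.1"},
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return [child["data"]["title"] for child in payload["data"]["children"]]


def compare_models(texts, labels, seed=42):
    """Fit both tree ensembles on TF-IDF features and return held-out accuracy."""
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=0.25, stratify=labels, random_state=seed
    )
    models = {
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "Gradient Boosting": GradientBoostingClassifier(random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        # The vectorizer is fit on the training split only, inside the pipeline.
        pipeline = make_pipeline(TfidfVectorizer(stop_words="english"), model)
        pipeline.fit(X_train, y_train)
        scores[name] = pipeline.score(X_test, y_test)
    return scores


# Toy stand-in data (made up for illustration); replace with fetch_titles(...) output.
space_titles = [
    "New telescope images show a distant galaxy",
    "Rocket launch scheduled for next week",
    "Astronauts return from the space station",
    "Mars rover finds unusual rock formation",
    "How do black holes form in a galaxy",
    "Saturn rings photographed by a space probe",
    "Rocket booster lands after orbit launch",
    "Telescope detects water on a distant planet",
]
food_titles = [
    "Homemade pizza with fresh basil and cheese",
    "My first attempt at sourdough bread",
    "Chocolate cake recipe with cream frosting",
    "Grilled chicken tacos with homemade salsa",
    "Best pasta sauce recipe I have made",
    "Fresh bread and cheese for dinner",
    "Spicy chicken curry with rice",
    "Homemade chocolate chip cookies recipe",
]
texts = space_titles + food_titles
labels = [1] * len(space_titles) + [0] * len(food_titles)  # 1 = r/space, 0 = r/food

scores = compare_models(texts, labels)
for name, accuracy in scores.items():
    print(f"{name}: {accuracy:.2f}")
```

To try a different pair of subreddits under these assumptions, build the two title lists with `fetch_titles("first_subreddit")` and `fetch_titles("second_subreddit")` and pass them through the same `compare_models` call. Accuracy on a handful of toy titles says little; a real comparison needs many more posts per subreddit.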