This project seeks to answer the following questions:
- Is it possible to effectively classify posts on online forums as hate speech or not hate speech?
- Is it possible to procedurally generate interventionary responses to such instances of online hate speech?
For all the fun details about our process and results, check out our project write-up.
The data this project uses comes from "A Benchmark Dataset for Learning to Intervene in Online Hate Speech".
- To run the classifier, run `python3 main.py` in the directory containing `main.py`. To see the process we used to test different classification models and hyperparameters, uncomment the block of code involving the `performance_tester` variable in `main.py` before running the program.
- To run the Textgenrnn response generator, run `python3 text_generation.py` in the same directory.
- To run the Sequence-to-Sequence response generator, run `python3 seq2seq.py` in the same directory.
Our code works best with Keras version 2.2.4, installed through regular pip (i.e., not Anaconda).
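
If you are unsure which Keras version is installed, a quick check like the one below may help before running any of the scripts. This is only a minimal sketch and not part of the project code: the helper names `matches_required` and `check_keras` are hypothetical, and it assumes Keras exposes its version as `keras.__version__`.

```python
REQUIRED_KERAS_VERSION = "2.2.4"  # the version this README recommends


def matches_required(installed, required=REQUIRED_KERAS_VERSION):
    """Return True when the installed version string equals the required one."""
    return installed.strip() == required


def check_keras():
    """Return a short status message about the installed Keras version."""
    try:
        # Imported lazily so the script still runs when Keras is missing.
        import keras
    except ImportError:
        return ("Keras could not be imported; try: pip install keras=="
                + REQUIRED_KERAS_VERSION)
    # Assumption: the package exposes its version as keras.__version__.
    version = getattr(keras, "__version__", "unknown")
    if matches_required(version):
        return "Keras " + version + " found, as recommended."
    return ("Keras " + version + " found; "
            + REQUIRED_KERAS_VERSION + " is recommended.")


if __name__ == "__main__":
    print(check_keras())
```

Save it under any file name and run it with `python3` in the same environment you use for the project. If it reports a different version, `pip install keras==2.2.4` installs the recommended one.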