cliffton/EmailClassification

Email Classification using Parallel Computing

This project is an email classifier that uses the k-nearest neighbors (k-NN) algorithm to label emails as spam or ham. It uses the Parallel Java 2 library to run the classification in parallel, which increases throughput. The dataset is the Enron email dataset, which contains about 0.5 million email records.
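To illustrate the idea, here is a minimal k-NN sketch. It is not the project's implementation: it uses Java's built-in parallel streams in place of the Parallel Java 2 library, and the class name `KnnSketch`, the `Example` type, and the two-element feature vectors (imagined here as counts of two indicative words) are all hypothetical. Each query email is compared against every training example, and the majority label among the k nearest examples wins.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class KnnSketch {

    /** Hypothetical training example: a feature vector plus its label ("spam" or "ham"). */
    public static final class Example {
        final double[] features;
        final String label;

        public Example(double[] features, String label) {
            this.features = features;
            this.label = label;
        }
    }

    /** Euclidean distance between two vectors of equal length. */
    public static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Returns the majority label among the k training examples nearest to the query. */
    public static String classify(List<Example> training, double[] query, int k) {
        Map<String, Long> votes = training.stream()
            .sorted(Comparator.comparingDouble((Example e) -> distance(e.features, query)))
            .limit(k)
            .collect(Collectors.groupingBy(e -> e.label, Collectors.counting()));
        return votes.entrySet().stream()
            .max(Map.Entry.comparingByValue())
            .map(Map.Entry::getKey)
            .orElseThrow(() -> new IllegalArgumentException("empty training set"));
    }

    /** Classifies each query independently; queries are spread across threads. */
    public static List<String> classifyAll(List<Example> training, List<double[]> queries, int k) {
        return queries.parallelStream()
            .map(q -> classify(training, q, k))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Toy data, invented for this sketch only.
        List<Example> training = Arrays.asList(
            new Example(new double[] {5, 0}, "spam"),
            new Example(new double[] {4, 1}, "spam"),
            new Example(new double[] {6, 1}, "spam"),
            new Example(new double[] {0, 5}, "ham"),
            new Example(new double[] {1, 4}, "ham"),
            new Example(new double[] {0, 6}, "ham"));
        List<double[]> queries = Arrays.asList(new double[] {5, 1}, new double[] {1, 5});
        System.out.println(classifyAll(training, queries, 3));
    }
}
```

Because each query is classified independently of the others, the work splits cleanly across threads, which is why k-NN is a natural fit for parallel execution. Using an odd k with two labels avoids tied votes.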

Files

  1. src: Contains the source files.
  2. sampleFiles: Contains sample files that can be used to run the program.
  3. dataFiles: Contains the data files used to train and test the system.
  4. scripts: Contains the Python scripts used to clean the data files.
  5. Report.pdf: The full project report, including results, conclusions, and instructions for installing and running the system.

Requirements

Parallel Java 2 Library

Installation and Running

A detailed installation and usage guide can be found in Report.pdf.

Results

[Results figure not reproduced here; Report.pdf contains the results.]
