Fake-News-Detection-BERT

This project trains a BERT-based classifier to detect fake news using HuggingFace Transformers and PyTorch.

Files

Fake.csv – Dataset of fake news articles
True.csv – Dataset of real news articles
main.py – Main training and evaluation script
data_analysis.ipynb – Notebook for dataset exploration and visualization
bert-base-uncased/ – (Optional) Local BERT model directory (or use HuggingFace download)

Requirements

Make sure Python 3.7+ is installed, then install the following packages:

pip install transformers datasets scikit-learn pandas numpy torch nltk

You also need to download NLTK stopwords:

import nltk
nltk.download('stopwords')
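
This README does not say how main.py uses the stopword list. One common use is dropping frequent function words from the article text before tokenization, and the sketch below assumes that. The helper name `remove_stopwords` is hypothetical, and the small word set is only an illustrative stand-in for the NLTK list.

```python
import re

def remove_stopwords(text, stop_words):
    """Lowercase the text and drop every token that appears in stop_words."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return " ".join(t for t in tokens if t not in stop_words)

# Illustrative subset only. With NLTK installed and the corpus downloaded, use:
#   from nltk.corpus import stopwords
#   stop_words = set(stopwords.words("english"))
stop_words = {"the", "a", "is", "of", "in"}

cleaned = remove_stopwords("The senator is in a meeting of the committee", stop_words)
print(cleaned)
```

Passing the stopword set as an argument keeps the helper usable without a network download.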

Data Analysis

Run data_analysis.ipynb to explore and visualize the dataset. It performs the following:

Loads and merges Fake.csv and True.csv
Assigns labels (0 = Fake, 1 = Real)
Samples 3000 articles for faster experimentation
Visualizes class distribution:
- Fake vs Real label balance (relatively balanced)
- Distribution of subject categories by label (unbalanced, not used as feature)
Observes that the date field contains some noisy or invalid strings (e.g., URLs), so date is excluded as a feature

The analysis helps confirm that only the text field is suitable as a classification input.
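
The loading, labelling, and sampling steps above could be sketched roughly as follows. This is not the notebook's actual code: the function name `build_dataset`, the seed value, and the tiny in-memory tables standing in for `pd.read_csv("Fake.csv")` and `pd.read_csv("True.csv")` are assumptions made for illustration. The column names `text`, `subject`, and `date` follow the fields mentioned above.

```python
import pandas as pd

def build_dataset(fake_df, true_df, n_samples=3000, seed=42):
    """Label both tables, merge them, and draw a random sample of rows."""
    fake = fake_df.copy()
    true = true_df.copy()
    fake["label"] = 0  # 0 = Fake
    true["label"] = 1  # 1 = Real
    merged = pd.concat([fake, true], ignore_index=True)
    # Never ask for more rows than exist in the merged table.
    n = min(n_samples, len(merged))
    sampled = merged.sample(n=n, random_state=seed).reset_index(drop=True)
    # Keep only the text and label; subject and date are not used as features.
    return sampled[["text", "label"]]

# Tiny stand-ins for the real CSV files (hypothetical rows).
fake_df = pd.DataFrame({
    "text": ["fake article one", "fake article two"],
    "subject": ["News", "politics"],
    "date": ["December 31, 2017", "https://example.com/not-a-date"],
})
true_df = pd.DataFrame({
    "text": ["real article one", "real article two"],
    "subject": ["politicsNews", "worldnews"],
    "date": ["December 31, 2017", "December 30, 2017"],
})

data = build_dataset(fake_df, true_df, n_samples=3)
print(data["label"].value_counts())
```

Fixing `random_state` makes the sample reproducible between runs, which helps when comparing experiments on the reduced dataset.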

Fine-tuning BERT

Make sure Fake.csv and True.csv are in the same folder as main.py.
(Optional) To use a local model, place bert-base-uncased/ in the same directory as main.py and change this line in main.py:

bert_name = "./bert-base-uncased" # path to your local model

Otherwise, the model will be downloaded automatically from HuggingFace.
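
As an alternative to editing the line by hand, the choice between the local directory and the HuggingFace download could be made at runtime. The sketch below is an assumption about how that might look, not what main.py does; `resolve_bert_name` is a hypothetical helper, while `bert_name` is the variable shown above.

```python
import os

def resolve_bert_name(local_dir="./bert-base-uncased", hub_id="bert-base-uncased"):
    """Return the local model directory if it exists, otherwise the Hub model id."""
    return local_dir if os.path.isdir(local_dir) else hub_id

# Falls back to the Hub id when no local copy is present.
bert_name = resolve_bert_name()
print(bert_name)
```

The resulting string can be passed to the same `from_pretrained` call in either case, since Transformers accepts both a local path and a Hub model id there.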

Run training:

python main.py

After training, the best model is saved to the ./results directory and the evaluation metrics are printed.

Output

Training progress and validation metrics printed during training
Final test accuracy, precision, recall, and F1 score
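
For reference, the four reported metrics can be computed from true and predicted labels as in the sketch below. It is a plain-Python illustration, not the code in main.py, which likely relies on scikit-learn. The function name `binary_metrics`, the example labels, and the choice of label 1 (Real) as the positive class are assumptions.

```python
def binary_metrics(y_true, y_pred, pos_label=1):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == pos_label and p == pos_label)
    fp = sum(1 for t, p in pairs if t != pos_label and p == pos_label)
    fn = sum(1 for t, p in pairs if t == pos_label and p != pos_label)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical labels: 0 = Fake, 1 = Real.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Setting `pos_label=0` instead reports precision and recall for the Fake class, which may be the more relevant view for a detector.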
