This project investigates the relationship between social media usage and mental health symptoms. Using survey data (7 demographic/usage variables and 12 Likert-scale questions), the study explores:
- Correlation between social media usage patterns and mental health indicators.
- Predictive modeling to classify whether an individual is at risk of experiencing severe adverse mental health symptoms and should be recommended for a mental health check-up.
- Total records: 480 valid responses
- Features:
  - Demographics: Age, Sex, Occupation
  - Social Media Usage: Platforms Used, Daily Time Spent, Frequency of purposeless use
  - Mental Health Indicators: ADHD, Anxiety, Self-Esteem, Depression (measured via 12 Likert-scale questions)
- Derived Scores:
  - ADHD Score
  - Anxiety Score
  - Self-Esteem Score
  - Depression Score
  - Total Score (aggregate of above; max = 59)
- Outcome Variable:
  - 0 → Not severe (Total Score < 40)
  - 1 → Severe symptoms, check-up recommended (Total Score ≥ 40)
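
The outcome rule above is a single threshold on the Total Score. A minimal sketch of that rule follows; the function and constant names are illustrative and do not come from the project code.

```python
SEVERE_THRESHOLD = 40  # cut-off stated in the outcome definition


def label_outcome(total_score: int) -> int:
    """Return 1 (severe, check-up recommended) when total_score >= 40, else 0."""
    return 1 if total_score >= SEVERE_THRESHOLD else 0


# Scores just below and at the cut-off land in different classes.
labels = [label_outcome(score) for score in (12, 39, 40, 59)]
print(labels)  # [0, 0, 1, 1]
```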
- Cleaned and standardized demographic variables (e.g., grouped genders into Male, Female, Others).
- Converted Likert-scale responses into numerical values.
- Adjusted scoring for certain questions (e.g., self-esteem).
- Computed aggregated mental health scores.
- Encoded categorical features into numerical values.
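
The preprocessing steps above could be sketched roughly as follows. Everything specific here is an assumption made for illustration: the 5-point answer labels, the question ids `q1` to `q12`, which items feed which score, and which item is reverse-scored. The placeholder scale is not meant to reproduce the stated maximum of 59.

```python
# Hypothetical 5-point Likert mapping; the survey's actual wording is not given.
LIKERT = {"Never": 1, "Rarely": 2, "Sometimes": 3, "Often": 4, "Always": 5}

# Hypothetical assignment of the 12 questions to the four scores.
SCORE_ITEMS = {
    "adhd": ["q1", "q2", "q3"],
    "anxiety": ["q4", "q5", "q6"],
    "self_esteem": ["q7", "q8", "q9"],
    "depression": ["q10", "q11", "q12"],
}
REVERSED = {"q8"}  # assumed reverse-scored self-esteem item


def standardize_gender(raw: str) -> str:
    """Group free-text gender answers into Male, Female, or Others."""
    value = raw.strip().lower()
    if value in {"male", "m"}:
        return "Male"
    if value in {"female", "f"}:
        return "Female"
    return "Others"


def to_numeric(qid: str, answer: str) -> int:
    """Map a Likert answer to 1..5, flipping the scale for reversed items."""
    value = LIKERT[answer]
    return 6 - value if qid in REVERSED else value


def score_response(answers: dict) -> dict:
    """Compute the four sub-scores and their total for one respondent."""
    numeric = {qid: to_numeric(qid, ans) for qid, ans in answers.items()}
    scores = {name: sum(numeric[q] for q in items)
              for name, items in SCORE_ITEMS.items()}
    scores["total"] = sum(scores.values())
    return scores


def encode_categories(values: list) -> list:
    """Label-encode a categorical column using alphabetical category order."""
    codes = {v: i for i, v in enumerate(sorted(set(values)))}
    return [codes[v] for v in values]


answers = {f"q{i}": "Sometimes" for i in range(1, 13)}
answers["q8"] = "Always"  # reversed item: 5 becomes 1
print(score_response(answers))
```

Reverse-scoring with `6 - value` keeps a reversed item on the same 1 to 5 range, so a high sub-score always points in the same direction before the sub-scores are summed.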
The following machine learning models were trained and evaluated:
- Logistic Regression
- Gaussian Naive Bayes
- Random Forest Classifier
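
As a rough illustration of training and comparing these three models, the sketch below uses scikit-learn on synthetic data. It is not the project's code: the feature matrix is randomly generated, and the 70/30 stratified split, the hyperparameters, and the random seeds are all assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in for the survey: 480 rows of four sub-scores (NOT real data).
rng = np.random.default_rng(0)
X = rng.integers(3, 16, size=(480, 4)).astype(float)
y = (X.sum(axis=1) >= 40).astype(int)  # same style of threshold label

# Assumed 70/30 split; the source does not state the split that was used.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Gaussian Naive Bayes": GaussianNB(),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred, zero_division=0),
        "recall": recall_score(y_test, pred, zero_division=0),
        "f1": f1_score(y_test, pred, zero_division=0),
    }

for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

The numbers this prints describe the synthetic data only and should not be read as the project's results.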
- Logistic Regression achieved the best performance:
  - Accuracy: 99.3%
  - Precision: 1.0
  - Recall: 0.983
  - F1 Score: 0.991
- Gaussian Naive Bayes and Random Forest also showed strong predictive power, with slightly lower accuracy than Logistic Regression.
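
To show how the four reported metrics relate to one another, the sketch below computes them from confusion-matrix counts. The source does not give the test-set size or the confusion matrix, so the counts used here are hypothetical, chosen only because they round to the reported figures.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts (an assumption, not taken from the project):
# no false positives and a single missed severe case.
metrics = classification_metrics(tp=58, fp=0, fn=1, tn=85)
print({k: round(v, 3) for k, v in metrics.items()})
# {'accuracy': 0.993, 'precision': 1.0, 'recall': 0.983, 'f1': 0.991}
```

A precision of 1.0 means no false positives. Under these assumed counts, the gap between recall and 1.0 corresponds to one severe case being classified as not severe.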
- Higher daily time spent on social media strongly correlated with adverse mental health symptoms.
- Elevated ADHD, Anxiety, and Depression scores significantly increased the likelihood of severe outcomes.
- Logistic Regression provided reliable and interpretable predictions for identifying at-risk individuals.
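
The correlation finding above can be illustrated with a plain Pearson coefficient. This is a sketch only: the source does not say which correlation measure was used, the sketch assumes daily time spent is available as a numeric value in hours, and the sample values are made up.

```python
from math import sqrt


def pearson_r(xs: list, ys: list) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


# Made-up illustrative values, not survey data.
hours_per_day = [1, 2, 3, 4, 5]
total_scores = [20, 26, 31, 38, 45]
print(round(pearson_r(hours_per_day, total_scores), 3))
```

If daily time spent is recorded as ordered categories instead of hours, a rank-based measure such as Spearman's coefficient may be the more natural choice.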