Fake news detection in Philippine news corpus using LDA and sentiment analysis with machine learning

Authors

  • Renz Jamuel B. Jamen, National Institute of Physics, University of the Philippines Diliman
  • Reinabelle C. Reyes, National Institute of Physics, University of the Philippines Diliman

Abstract

The persistent proliferation of fake news on Philippine social media platforms poses serious threats to public discourse and safety. To address this growing concern, it is critical to continuously develop automated models that effectively classify news published online as either real or fake. This study presents an alternative approach to fake news classification that combines a VADER-extracted sentiment ratio with feature vectors reduced through Linear Discriminant Analysis (LDA), applied to a suite of supervised machine-learning models. We trained and evaluated these models on a publicly available corpus of real and fake news from the Philippines. Remarkably, our best-performing model achieved an accuracy of 94%, comparable to benchmark models in the literature, using only a single feature derived from LDA applied to a combination of TF-IDF features and the sentiment ratio. Moreover, adding the sentiment ratio consistently improved performance across models. Overall, this study provides valuable insights for improving fake news classifiers for Philippine news corpora.
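
The pipeline summarized in the abstract (TF-IDF features plus a sentiment ratio, reduced by LDA to a single feature) can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation. Several parts are assumptions: the toy headlines and labels are invented, the small word lists are a hypothetical stand-in for the VADER lexicon, the sentiment ratio is taken to be a smoothed positive-to-negative count because the abstract does not define it, and a small ridge term is added to the two-class Fisher LDA only so that the toy scatter matrix is invertible.

```python
import math
from collections import Counter

# Hypothetical mini-lexicon standing in for VADER (assumption, not the real lexicon).
POSITIVE = {"confirms", "safe", "approved"}
NEGATIVE = {"shocking", "secret", "hoax"}

def sentiment_ratio(tokens):
    """Assumed definition: smoothed ratio of positive to negative word counts."""
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos + 1) / (neg + 1)

def tfidf_rows(docs):
    """TF-IDF with smoothed IDF and L2-normalised rows; docs are token lists."""
    vocab = sorted({t for d in docs for t in d})
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    rows = []
    for d in docs:
        tf = Counter(d)
        row = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(v * v for v in row)) or 1.0
        rows.append([v / norm for v in row])
    return vocab, rows

def build_features(docs):
    """Append the sentiment ratio as one extra column after the TF-IDF columns."""
    vocab, rows = tfidf_rows(docs)
    return vocab, [r + [sentiment_ratio(d)] for r, d in zip(rows, docs)]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def score(w, x):
    """Project a feature vector onto the single LDA direction."""
    return sum(wi * xi for wi, xi in zip(w, x))

def fit_lda(rows, labels, ridge=1e-6):
    """Two-class Fisher LDA: w = (Sw + ridge * I)^-1 (mu1 - mu0)."""
    dim = len(rows[0])
    means = {}
    for k in (0, 1):
        grp = [r for r, y in zip(rows, labels) if y == k]
        means[k] = [sum(col) / len(grp) for col in zip(*grp)]
    # Within-class scatter matrix, with the ridge term on the diagonal.
    sw = [[ridge if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    for r, y in zip(rows, labels):
        d = [v - mu for v, mu in zip(r, means[y])]
        for i in range(dim):
            for j in range(dim):
                sw[i][j] += d[i] * d[j]
    w = solve(sw, [a - b for a, b in zip(means[1], means[0])])
    # Decision threshold: midpoint between the projected class means.
    threshold = (score(w, means[0]) + score(w, means[1])) / 2
    return w, threshold

# Invented toy headlines; label 1 = fake, 0 = real.
texts = [
    "shocking secret cure hidden by doctors",
    "shocking hoax exposes secret plot",
    "secret memo reveals shocking scandal",
    "agency confirms new vaccine approved",
    "senate approved budget for safe roads",
    "weather bureau confirms storm path",
]
labels = [1, 1, 1, 0, 0, 0]

tokens = [t.split() for t in texts]
vocab, X = build_features(tokens)
w, threshold = fit_lda(X, labels)
predicted = [int(score(w, x) > threshold) for x in X]
print(predicted)
```

In this sketch the single LDA score is thresholded directly for brevity; in a setup like the one the abstract describes, that one-dimensional feature would instead be passed to each supervised classifier being compared, and performance would be measured on held-out data rather than on the training examples.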

Article ID

SPP-2023-PB-40

Section

Poster Session B (Complex Systems, Simulations, and Theoretical Physics)

Published

2023-07-14

How to Cite

[1]
RJB Jamen and RC Reyes, Fake news detection in Philippine news corpus using LDA and sentiment analysis with machine learning, Proceedings of the Samahang Pisika ng Pilipinas 41, SPP-2023-PB-40 (2023). URL: https://proceedings.spp-online.org/article/view/SPP-2023-PB-40.