Fake news detection in Philippine news corpus using LDA and sentiment analysis with machine learning

Authors

Renz Jamuel B. Jamen ⋅ National Institute of Physics, University of the Philippines Diliman
Reinabelle C. Reyes ⋅ National Institute of Physics, University of the Philippines Diliman

Abstract

The persistent proliferation of fake news on Philippine social media platforms poses serious threats to public discourse and safety. Addressing this growing concern requires the continued development of automated models that effectively classify news published online as either real or fake. This study presents an alternative approach to fake news classification that combines a VADER-extracted sentiment ratio with feature vectors reduced through Linear Discriminant Analysis (LDA), and feeds the result to a suite of supervised machine-learning models. We trained and evaluated these models on a publicly available corpus of real and fake news from the Philippines. Remarkably, our best-performing model achieved an accuracy of 94%, comparable to benchmark models in the literature, using only a single feature derived from LDA applied to a combination of TF-IDF features and the sentiment ratio. Moreover, adding the sentiment ratio consistently improved performance across models. Overall, this study provides valuable insights for improving fake news classifiers for Philippine news corpora.
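The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The toy lexicon and the `sentiment_ratio` function are hypothetical stand-ins for VADER, the exact definition of the sentiment ratio is assumed, the example texts are invented, and logistic regression is only one possible member of the "suite of supervised machine-learning models". The sketch uses scikit-learn. It also shows why a single feature results: with two classes, LDA yields at most one discriminant component.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical toy lexicon standing in for VADER's sentiment lexicon.
POSITIVE = {"good", "great", "safe", "approves"}
NEGATIVE = {"bad", "fraud", "shocking", "danger"}


def sentiment_ratio(text):
    """Assumed definition: smoothed ratio of positive to negative token counts."""
    tokens = text.lower().split()
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos + 1) / (neg + 1)


# Invented example texts; label 0 = real, 1 = fake.
texts = [
    "the senate approves the national budget for the government",
    "the government reports good rice harvest in the province",
    "the senator files a bill on safe public transport",
    "shocking fraud the senator hides gold in the province",
    "danger the government puts bad chemicals in the rice",
    "shocking secret budget fraud in the senate exposed",
]
labels = np.array([0, 0, 0, 1, 1, 1])

# TF-IDF features, with the sentiment ratio appended as one extra column.
tfidf = TfidfVectorizer().fit_transform(texts).toarray()
ratios = np.array([sentiment_ratio(t) for t in texts]).reshape(-1, 1)
features = np.hstack([tfidf, ratios])

# With two classes, LDA projects onto at most one discriminant axis.
lda = LinearDiscriminantAnalysis(n_components=1)
reduced = lda.fit_transform(features, labels)

# Any supervised classifier can be trained on the single LDA feature.
clf = LogisticRegression().fit(reduced, labels)
predictions = clf.predict(reduced)
```

Here `reduced` has one column per document regardless of vocabulary size. A real evaluation would hold out a test set rather than predict on the training texts as this toy example does.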

Issue

2023: Proceedings of the 41st Samahang Pisika ng Pilipinas Physics Conference

Physics: Connecting islands of knowledge
19-21 July 2023, Del Carmen, Siargao Island

Please visit the SPP2023 activity webpage for more information on this year's Physics Congress.

Article ID

SPP-2023-PB-40

Section

Poster Session B (Complex Systems, Simulations, and Theoretical Physics)

Published

2023-07-14

How to Cite

[1]

RJB Jamen and RC Reyes, Fake news detection in Philippine news corpus using LDA and sentiment analysis with machine learning, Proceedings of the Samahang Pisika ng Pilipinas 41, SPP-2023-PB-40 (2023). URL: https://proceedings.spp-online.org/article/view/SPP-2023-PB-40.