Fake news detection in Philippine news corpus using LDA and sentiment analysis with machine learning
Abstract
The persistent proliferation of fake news on Philippine social media platforms poses serious threats to public discourse and safety. Addressing this concern requires the continued development of automated models that reliably classify news published online as either real or fake. This study presents an alternative approach to fake news classification that combines a VADER-extracted sentiment ratio with feature vectors reduced through Linear Discriminant Analysis (LDA) and uses them as input to a suite of supervised machine-learning models. We trained and evaluated these models on a publicly available corpus of real and fake news from the Philippines. Remarkably, our best-performing model achieved an accuracy of 94% using only a single feature, obtained by applying LDA to a combination of TF-IDF features and the sentiment ratio; this is comparable to benchmark models in the literature. Moreover, adding the sentiment ratio consistently improved performance across models. Overall, this study provides valuable insights for improving fake news classifiers for Philippine news corpora.
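The pipeline described in the abstract can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes scikit-learn and NumPy are available. The toy word lists, the `sentiment_ratio` definition (a smoothed positive-to-negative count ratio), the example texts, and the choice of logistic regression are all hypothetical stand-ins: the study uses VADER for sentiment, and the abstract does not say how the ratio is defined or which classifier performed best. The sketch does show why a single feature results: with two classes (real and fake), LDA yields at most one discriminant component.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy stand-in for a sentiment lexicon (assumption); the study uses VADER.
POSITIVE = {"good", "safe", "approved", "confirmed"}
NEGATIVE = {"shocking", "deadly", "secret", "scandal"}


def sentiment_ratio(text):
    """Hypothetical ratio (pos + 1) / (neg + 1); the +1 avoids division by zero."""
    tokens = text.lower().split()
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos + 1) / (neg + 1)


# Invented example texts (assumption); label 0 = real, 1 = fake.
texts = [
    "the government confirmed the new health program is safe",
    "officials say the report on the budget was approved today",
    "the agency confirmed good results in the health report",
    "the government says the new road program was approved",
    "shocking secret report says the government hid a deadly virus",
    "scandal as officials hide shocking budget secret from the public",
    "deadly secret program exposed in shocking new report today",
    "the agency scandal is a shocking secret the government hid",
]
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Step 1: TF-IDF features, with the sentiment ratio appended as one extra column.
tfidf = TfidfVectorizer().fit_transform(texts).toarray()
ratios = np.array([[sentiment_ratio(t)] for t in texts])
features = np.hstack([tfidf, ratios])

# Step 2: LDA reduces the combined features to a single discriminant component.
lda = LinearDiscriminantAnalysis(n_components=1)
reduced = lda.fit_transform(features, labels)

# Step 3: train a classifier on the single LDA feature.
clf = LogisticRegression().fit(reduced, labels)
train_accuracy = clf.score(reduced, labels)
```

The accuracy computed here is measured on the training data of a toy example and says nothing about real performance; a faithful evaluation would fit the vectorizer and LDA on a training split only and score on held-out articles.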
License
By submitting their manuscript to the Samahang Pisika ng Pilipinas (SPP) for consideration, the Authors warrant that their work is original, does not infringe on existing copyrights, and is not under active consideration for publication elsewhere.
Upon acceptance of their manuscript, the Authors further agree to grant SPP the non-exclusive, worldwide, and royalty-free rights to record, edit, copy, reproduce, publish, distribute, and use all or part of the manuscript for any purpose, in any media now existing or developed in the future, either individually or as part of a collection.
All other associated economic and moral rights as granted by the Intellectual Property Code of the Philippines are maintained by the Authors.








