Identification of actionable tweets during Philippine disasters using machine learning

Authors

Alyanna T. Somido ⋅ National Institute of Physics, University of the Philippines Diliman
Reinabelle C. Reyes ⋅ National Institute of Physics, University of the Philippines Diliman

Abstract

Successful disaster response requires time-critical information, and social media platforms such as Twitter are an abundant source of it. However, the sheer volume of tweets and their short, unstructured nature make finding tweets that contain actionable information challenging and time-consuming. In this study, we developed binary logistic regression models that classify tweets by their informativeness, i.e., whether they provide information about the on-the-ground situation during a disaster. We trained and tested our models using tweets from the CrisisLexT26 dataset concerning five disasters that occurred in the Philippines from 2012 to 2013. We compared models using different feature sets extracted with the Bag-of-Words (BoW) and TF-IDF vectorization methods, in conjunction with word embedding using word2vec, and evaluated their accuracy, precision, recall, AUC, and F1-scores. Our results indicate that relatively simple models using BoW and TF-IDF features achieve good performance, comparable to that of more sophisticated models made for similar applications. We attribute this to the difference in scope, which points to the potential of a country-specific training approach. This is supported by our identification of the most important keywords in predicting the informativeness of a tweet. We found that words and hashtags related to calls for donation, impacts of disasters, and specific events are most associated with informative tweets, while those associated with sympathy, such as 'pray' and '#prayforvisayas', were most associated with non-informative tweets. These models show promise in automating the identification of useful and relevant tweets to support disaster response efforts.
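
As a rough illustration of the kind of pipeline the abstract describes, the following is a minimal sketch of TF-IDF features feeding a binary logistic regression classifier, followed by a ranking of tokens by learned weight. It is not the authors' implementation: the toy tweets are hypothetical and not drawn from CrisisLexT26, the smoothed IDF formula, learning rate, and epoch count are assumptions, the helper names are invented for this example, and the word2vec features are omitted.

```python
import math
from collections import Counter

# Hypothetical toy tweets written for illustration (NOT from CrisisLexT26).
# Label 1 = informative, 0 = non-informative.
TWEETS = [
    ("flood waters rising in tacloban roads closed", 1),
    ("donate relief goods at the city hall drop-off", 1),
    ("bridge collapsed need rescue boats now", 1),
    ("power outage reported across cebu city", 1),
    ("pray for the philippines #prayforvisayas", 0),
    ("our thoughts and prayers are with you", 0),
    ("stay strong everyone we pray for you", 0),
    ("so sad to hear the news pray", 0),
]

def tokenize(text):
    """Lowercase and split on whitespace; hashtags stay as single tokens."""
    return text.lower().split()

def fit_tfidf(docs):
    """Build a vocabulary index and smoothed IDF weights (assumed formula)."""
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    index = {tok: i for i, tok in enumerate(vocab)}
    n = len(docs)
    # Document frequency: number of documents containing each token.
    df = Counter(tok for doc in docs for tok in set(tokenize(doc)))
    idf = [math.log((1 + n) / (1 + df[tok])) + 1 for tok in vocab]
    return index, idf

def transform(doc, index, idf):
    """Return an L2-normalized TF-IDF vector; unknown tokens are ignored."""
    vec = [0.0] * len(index)
    for tok, count in Counter(tokenize(doc)).items():
        if tok in index:
            vec[index[tok]] = count * idf[index[tok]]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(w, b, x):
    """Predicted probability that feature vector x is informative."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def train(X, y, lr=0.5, epochs=500):
    """Fit logistic regression by plain stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = score(w, b, xi) - yi  # gradient of log loss w.r.t. the logit
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

docs = [text for text, _ in TWEETS]
y = [label for _, label in TWEETS]
index, idf = fit_tfidf(docs)
X = [transform(doc, index, idf) for doc in docs]
w, b = train(X, y)

# Rank tokens by learned weight: the most negative weights point toward
# non-informative tweets, the most positive toward informative ones.
ranked = sorted(index, key=lambda tok: w[index[tok]])
print("most non-informative:", ranked[:3])
print("most informative:", ranked[-3:])
```

Inspecting the learned weights in this way is one simple route to a keyword-importance ranking of the kind the abstract mentions; on real data a held-out test set would be needed to compute accuracy, precision, recall, AUC, and F1-score.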

Issue

2023: Proceedings of the 41st Samahang Pisika ng Pilipinas Physics Conference

Physics: Connecting islands of knowledge
19-21 July 2023, Del Carmen, Siargao Island

Article ID

SPP-2023-PB-41

Section

Poster Session B (Complex Systems, Simulations, and Theoretical Physics)

Published

2023-07-14

How to Cite

AT Somido and RC Reyes, Identification of actionable tweets during Philippine disasters using machine learning, Proceedings of the Samahang Pisika ng Pilipinas 41, SPP-2023-PB-41 (2023). URL: https://proceedings.spp-online.org/article/view/SPP-2023-PB-41.