Filipino text classification by Universal Language Model Fine-tuning (ULMFiT)

Authors

  • Mary June A. Ricaña, National Institute of Physics, University of the Philippines Diliman
  • Francis N. C. Paraan, National Institute of Physics, University of the Philippines Diliman

Abstract

A major obstacle in natural language processing is the scarcity of labeled data for some languages. Transfer learning techniques such as Universal Language Model Fine-tuning (ULMFiT) have emerged as effective ways to address this issue. This paper explores the use of ULMFiT for text classification in the Filipino language. We follow the ULMFiT approach, which involves pretraining a language model, fine-tuning it, and training a text classifier. We independently reproduce previous results for a binary text classification task on a Filipino-language dataset. We also show that the ULMFiT model performs well on a multi-label classification task, achieving Hamming losses as low as ~0.10, comparable to previous benchmark results obtained with transformer models.
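
For readers unfamiliar with the metric quoted above, the Hamming loss of a multi-label classifier is the fraction of individual label slots that are predicted incorrectly, averaged over all samples and all labels. The following is a minimal sketch of that definition in plain Python. The function name `hamming_loss` and the toy label matrices are illustrative assumptions, not the authors' code or data; the toy values were chosen only so that the result lands at 0.10.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of label slots where the prediction differs from the truth.

    y_true, y_pred: lists of equal-length rows of 0/1 labels,
    one row per sample and one column per label.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same number of rows")

    total = 0  # number of label slots compared
    wrong = 0  # number of slots where truth and prediction disagree
    for true_row, pred_row in zip(y_true, y_pred):
        if len(true_row) != len(pred_row):
            raise ValueError("each row must have the same number of labels")
        for t, p in zip(true_row, pred_row):
            total += 1
            wrong += int(t != p)

    if total == 0:
        raise ValueError("no labels to compare")
    return wrong / total


# Hypothetical toy data: 5 samples, 4 labels each (20 label slots in total).
y_true = [
    [1, 0, 0, 1],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
y_pred = [
    [1, 0, 0, 0],  # last label missed
    [0, 1, 0, 0],
    [1, 0, 0, 0],  # second label missed
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]

loss = hamming_loss(y_true, y_pred)
print(f"Hamming loss: {loss:.2f}")
```

In this toy case 2 of the 20 label slots are wrong, so the loss is 0.10. A lower value is better, and 0 means every label of every sample was predicted correctly.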

Article ID

SPP-2023-PB-06

Section

Poster Session B (Complex Systems, Simulations, and Theoretical Physics)

Published

2023-07-09

How to Cite

[1] MJA Ricaña and FNC Paraan, Filipino text classification by Universal Language Model Fine-tuning (ULMFiT), Proceedings of the Samahang Pisika ng Pilipinas 41, SPP-2023-PB-06 (2023). URL: https://proceedings.spp-online.org/article/view/SPP-2023-PB-06.