Topology-informed image classification of imbalanced datasets

Authors

  • Chara Deanna F. Punzal, Data Science Program, University of the Philippines Diliman
  • Khristian G. Kikuchi, Data Science Program, University of the Philippines Diliman; College of Computer and Information Science, Mapúa Malayan Colleges Laguna
  • Patricia S. Cabrera, School of Archaeology, University of the Philippines Diliman
  • Gabrielle Anne B. Gascon, School of Archaeology, University of the Philippines Diliman
  • Ranzivelle Marianne Roxas-Villanueva, Institute of Physics, University of the Philippines Los Baños
  • Juan C. Rofes, School of Archaeology, University of the Philippines Diliman; Archéozoologie, Archéobotanique – Sociétés, Pratiques et Environnements, CNRS/MNHN, France; National Museum of the Philippines
  • Giovanni A. Tapang, National Institute of Physics, University of the Philippines Diliman

Abstract

This study explores the application of topological data analysis (TDA) to enhance image classification from an imbalanced dataset of micro-vertebrate bone fragments. We present a hybrid pipeline that integrates deep learning feature extraction, TDA-based topological representation, and gradient boosting classifiers. Our approach was evaluated on an archaeological dataset of bone fragment images from Callao Cave, Philippines. Results demonstrate that the TDA-enhanced pipeline consistently outperforms traditional machine learning methods, achieving 89-91% accuracy across LightGBM, XGBoost, and SVM classifiers. Notably, the TDA-based approach maintains robust performance (>82% accuracy) even when trained on just 10% of the available data, showing particular strength with imbalanced distributions. The findings highlight TDA as a valuable augmentation for image classification tasks in archaeological contexts, where limited and imbalanced datasets are common. This work contributes to both the methodological advancement of archaeological classification and the broader application of topological methods in machine learning.
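
The abstract does not say which topological representation the pipeline uses, so the following is only a minimal sketch of one common choice, not the authors' method. It assumes each image has already been reduced to a small point cloud (for example, a set of deep-feature vectors) and computes the death times of connected components (0-dimensional persistent homology) in a Vietoris-Rips filtration, which coincide with the edge weights of a minimum spanning tree. The function names `h0_persistence` and `summarize` and the three summary statistics are hypothetical choices made for illustration.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Return the finite death times of 0-dimensional classes.

    Every point is born at scale 0. Two components merge (one class dies)
    when the filtration scale reaches the length of the shortest edge
    joining them, so the deaths are the minimum-spanning-tree edge weights.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise Euclidean distances, processed from shortest to longest.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    deaths = []
    for dist, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j  # merge the two components
            deaths.append(dist)      # one class dies at this scale
    return deaths  # n - 1 values; the last component never dies


def summarize(deaths):
    """Fixed-length feature vector: total, maximum, and mean persistence."""
    if not deaths:
        return [0.0, 0.0, 0.0]
    return [sum(deaths), max(deaths), sum(deaths) / len(deaths)]


# Toy point cloud standing in for one image's feature vectors (assumed data).
cloud = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
features = summarize(h0_persistence(cloud))
print(features)  # [5.0, 4.0, 2.5]
```

A vector like `features` could then be concatenated with other descriptors and passed to a classifier such as LightGBM, XGBoost, or an SVM. Higher-dimensional homology (loops, voids) would need a dedicated TDA library rather than this union-find shortcut.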

Published

2025-06-18

How to Cite

[1]
“Topology-informed image classification of imbalanced datasets”, Proc. SPP, vol. 43, no. 1, pp. SPP–2025, Jun. 2025, Accessed: Mar. 31, 2026. [Online]. Available: https://proceedings.spp-online.org/article/view/SPP-2025-3C-04