Explainable machine learning for multi-survey classification of astronomical objects

Authors

  • Zyrynn Jazmyn N. Muncal ⋅ PH Institute of Physics, University of the Philippines Los Baños
  • Mark Ruel D. Chua ⋅ PH Institute of Physics, University of the Philippines Los Baños
  • Ranzivelle Marianne L. Roxas-Villanueva ⋅ PH Institute of Physics, University of the Philippines Los Baños

Abstract

Classification of astronomical objects remains a fundamental yet challenging problem in astronomy, particularly in the current era of large-scale sky surveys, where the sheer volume of data makes manual classification impractical. To address this challenge, machine learning has recently emerged as a pivotal tool across physics and interdisciplinary fields. This work presents a machine learning pipeline that classifies astronomical objects into three types, namely stars, galaxies, and quasi-stellar objects (QSOs), using two robust decision-tree-based classifiers, XGBoost (XGB) and Random Forest (RF). The models use cross-matched optical, infrared, and spectroscopic data from three astronomical surveys: the Sloan Digital Sky Survey (SDSS), the Wide-field Infrared Survey Explorer (WISE), and the Two Micron All Sky Survey (2MASS). SHapley Additive exPlanations (SHAP) was used to interpret feature importance and to assess whether it is physically valid. Overall, the models achieved a high mean accuracy of 98-99% and high metrics across the three classes. SHAP revealed that classification was primarily driven by redshift, morphological concentration, and photometric features, confirming that the results align with SDSS classification criteria.
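The attribution idea behind SHAP can be illustrated with a minimal sketch that computes exact Shapley values by enumerating feature subsets. This is not the authors' pipeline and does not use the shap library: the scoring function `toy_qso_score`, its coefficients, the three feature names, and the instance and baseline values are all hypothetical, chosen only to echo the feature types named in the abstract. Features outside a subset are set to a single baseline value, a simplification of the background-data averaging used in practice.

```python
from itertools import combinations
from math import factorial

# Hypothetical feature names, loosely echoing those named in the abstract.
FEATURES = ["redshift", "concentration", "color"]


def toy_qso_score(x):
    """Hypothetical score with made-up coefficients (not the paper's model)."""
    return (2.0 * x["redshift"]
            - 1.0 * x["concentration"]
            + 0.5 * x["redshift"] * x["color"])


def shapley_values(model, instance, baseline):
    """Exact Shapley values by enumerating every subset of the other features.

    Features absent from a subset take their baseline value.
    """
    n = len(FEATURES)
    values = {}
    for feat in FEATURES:
        others = [f for f in FEATURES if f != feat]
        total = 0.0
        for k in range(len(others) + 1):
            # Shapley weight for a coalition of size k.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                without = {f: (instance[f] if f in subset else baseline[f])
                           for f in FEATURES}
                with_feat = dict(without)
                with_feat[feat] = instance[feat]
                # Marginal contribution of `feat` to this coalition.
                total += weight * (model(with_feat) - model(without))
        values[feat] = total
    return values


# Illustrative values only; they are not taken from any survey.
instance = {"redshift": 1.5, "concentration": 0.2, "color": 0.8}
baseline = {"redshift": 0.1, "concentration": 0.5, "color": 0.4}

phi = shapley_values(toy_qso_score, instance, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the difference between the score of the instance and the score of the baseline, which is the additivity property that lets per-feature contributions be compared with physical expectations. Exact enumeration scales exponentially with the number of features, so it is practical only for toy cases like this one.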

Published

2026-06-01

How to Cite

[1]
ZJN Muncal, MRD Chua, and RML Roxas-Villanueva, Explainable machine learning for multi-survey classification of astronomical objects, in Proceedings of the 44th Samahang Pisika ng Pilipinas Physics Conference (Philippines, 2026), SPP-2026-2A-01. URL: https://proceedings.spp-online.org/article/view/SPP-2026-2A-01