Machine learning for complex disease prediction: A case study for asthma dataset

Authors

  • Joverlyn Delayun Gaudillo Institute of Mathematical Sciences and Physics, University of the Philippines Los Baños
  • Lei Rigi Pastor Baltazar Institute of Mathematical Sciences and Physics, University of the Philippines Los Baños
  • Allen Nazareno Institute of Mathematical Sciences and Physics, University of the Philippines Los Baños, Philippines
  • Julianne Vilela Philippine Genome Center Program for Agriculture, Office of the Vice Chancellor for Research and Extension, University of the Philippines Los Baños
  • Jae Joseph Russell Rodriguez Institute of Biological Sciences, University of the Philippines Los Baños
  • Mario Domingo Domingo AI Research Center, Los Baños
  • Jason R Albia Institute of Mathematical Sciences and Physics, University of the Philippines Los Baños

Abstract

Machine learning is a powerful alternative approach for analyzing high-dimensional biological data to understand the underlying complex phenomena. In this study, machine learning is used to analyze the single nucleotide polymorphism (SNP) profile of an individual in order to predict asthma occurrence at its onset stage. Machine learning algorithms such as support vector machine (SVM), k-nearest neighbors (kNN), random forest, and naïve Bayes were applied to an asthma case-control dataset. Results showed that SVM achieved the highest classification performance, with accuracy, precision, sensitivity, and receiver operating characteristic (ROC) values of 55.47%, 51.03%, 52.63%, and 0.52, respectively, which is comparable to the performance of the other machine learning models. This study demonstrates the potential of machine learning to extensively analyze biological data and understand disease etiology for complex disease prediction.
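The four measures reported in the abstract can be made concrete with a small worked example. The sketch below is illustrative only: the labels and scores are made-up toy values, not data from the study, the function names are hypothetical, and it assumes the reported "ROC" figure refers to the area under the ROC curve, which the abstract does not state explicitly.

```python
# Toy illustration of accuracy, precision, sensitivity, and ROC area for a
# binary case-control classifier. All data below are invented for the example.

def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, and sensitivity for labels 1 (case) / 0 (control)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cases correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # controls correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # controls wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cases missed
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
    }

def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a randomly chosen case receives a higher
    score than a randomly chosen control (ties count as one half).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: 4 cases, 4 controls, classifier scores in [0, 1].
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.7, 0.6, 0.2, 0.3, 0.8, 0.55]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed decision threshold of 0.5

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Under the area-under-curve reading, a value of 0.5 corresponds to ranking cases and controls no better than chance.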

Published

2020-10-19

How to Cite

[1]
JD Gaudillo, LRP Baltazar, A Nazareno, J Vilela, JJR Rodriguez, M Domingo, and JR Albia, Machine learning for complex disease prediction: A case study for asthma dataset, Proceedings of the Samahang Pisika ng Pilipinas 38, SPP-2020-2A-01 (2020). URL: https://proceedings.spp-online.org/article/view/SPP-2020-2A-01.

Section

Complex Systems and Data Analytics (Short Presentations)