Machine learning for complex disease prediction: A case study for asthma dataset

Authors

Joverlyn Delayun Gaudillo ⋅ Institute of Mathematical Sciences and Physics, University of the Philippines Los Baños, Philippines
Lei Rigi Pastor Baltazar ⋅ Institute of Mathematical Sciences and Physics, University of the Philippines Los Baños, Philippines
Allen Nazareno ⋅ Institute of Mathematical Sciences and Physics, University of the Philippines Los Baños, Philippines
Julianne Vilela ⋅ Philippine Genome Center Program for Agriculture, Office of the Vice Chancellor for Research and Extension, University of the Philippines Los Baños, Philippines
Jae Joseph Russell Rodriguez ⋅ Institute of Biological Sciences, University of the Philippines Los Baños, Philippines
Mario Domingo ⋅ Domingo AI Research Center, Los Baños, Philippines
Jason R Albia ⋅ Institute of Mathematical Sciences and Physics, University of the Philippines Los Baños, Philippines

Abstract

Machine learning is a powerful alternative approach for analyzing high-dimensional biological data to understand the complex phenomena underlying them. In this study, machine learning is used to analyze an individual's single nucleotide polymorphism (SNP) profile in order to predict asthma occurrence at its onset stage. Machine learning algorithms such as support vector machine (SVM), k-nearest neighbors (kNN), random forest, and naïve Bayes were applied to an asthma case-control dataset. SVM achieved the highest classification performance, with accuracy, precision, sensitivity, and receiver operating characteristic (ROC) values of 55.47%, 51.03%, 52.63%, and 0.52, respectively, which is comparable to the other machine learning models. This study demonstrates the potential of machine learning to extensively analyze biological data and understand disease etiology for complex disease prediction.
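For readers unfamiliar with the reported metrics, the following is a minimal sketch of how accuracy, precision, sensitivity, and an ROC summary could be computed for a case-control classifier. It is not the authors' code. The function names, the toy labels, and the toy scores are all hypothetical, and the sketch assumes that the reported ROC figure refers to the area under the ROC curve (AUC), which the abstract does not state explicitly.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, and sensitivity for binary labels (1 = case, 0 = control)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cases predicted as cases
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # controls predicted as controls
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # controls predicted as cases
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cases predicted as controls
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (case, control) pairs in which the case scores higher.

    Tied scores count as half a win. Assumption: "ROC" in the abstract means AUC.
    """
    cases = [s for t, s in zip(y_true, scores) if t == 1]
    controls = [s for t, s in zip(y_true, scores) if t == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


# Hypothetical toy data, invented for illustration only (not from the study).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
scores = [0.9, 0.4, 0.7, 0.2, 0.6, 0.1, 0.8, 0.3]

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Under this reading, an AUC of 0.5 corresponds to ranking cases and controls no better than chance, which gives a reference point for interpreting the reported value of 0.52.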

Issue

2020: Proceedings of the 38th Samahang Pisika ng Pilipinas Physics Conference

Shedding light on the pandemic through the lens of physics
Pagtanglaw sa pandemya sa lente ng pisika
19-23 October 2020

This is the first fully online SPP Physics Conference. Please visit the SPP2020 activity webpage for more information on this year's Physics Congress.

Article ID

SPP-2020-2A-01

Section

Complex Systems and Data Analytics (Short Presentations)

Published

2020-10-19

How to Cite

[1]

JD Gaudillo, LRP Baltazar, A Nazareno, J Vilela, JJR Rodriguez, M Domingo, and JR Albia, Machine learning for complex disease prediction: A case study for asthma dataset, Proceedings of the Samahang Pisika ng Pilipinas 38, SPP-2020-2A-01 (2020). URL: https://proceedings.spp-online.org/article/view/SPP-2020-2A-01.