Performance Evaluation of Regression Models in Predicting the Cost of Medical Insurance

  • Jonelle Angelo S. Cenita, Richwell Colleges, Incorporated, Philippines
  • Paul Richie F. Asuncion, Polytechnic College of Botolan, Philippines
  • Jayson M. Victoriano, Bulacan State University, Philippines

Abstract

Purpose – The study aimed to evaluate the performance of regression models in predicting the cost of medical insurance. Three (3) machine learning regression models were used: Linear Regression, Gradient Boosting, and Support Vector Machine. Performance was evaluated using RMSE (Root Mean Square Error), r2 (R-squared), and K-Fold Cross-Validation. The study also sought to identify the feature that is most important in predicting the cost of medical insurance.
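
The evaluation measures named above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the study's implementation: the function names (`rmse`, `r2_score`, `k_fold_indices`) and the toy values are hypothetical, and the fold splitter uses contiguous, unshuffled folds for simplicity.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error: square root of the mean squared residual."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds (no shuffling)."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size

# Hypothetical toy values, not taken from the study's dataset.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 9.0]
print("RMSE:", rmse(y_true, y_pred))
print("R-squared:", r2_score(y_true, y_pred))
```

In a K-fold setup, each model would be refit on every `train_idx` and scored on the matching `test_idx`, and the per-fold scores averaged. A lower RMSE and a higher R-squared both indicate a better fit.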

Method – The methodology of the study is anchored on the knowledge discovery in databases (KDD) process, which refers to the overall process of discovering useful knowledge from data.

Results – The performance evaluation results reveal that, among the three (3) regression models, Gradient Boosting achieved the highest r2 (R-squared), 0.892, and the lowest RMSE (Root Mean Square Error), 1336.594. Furthermore, the 10-Fold Cross-Validation weighted mean results are not significantly different from the r2 (R-squared) results of the three (3) regression models. In addition, Exploratory Data Analysis (EDA) using a box plot of descriptive statistics showed that, when charges are compared across the smoker feature, the median of one group lies outside the box of the other group, indicating a difference between the two groups.
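
The box plot reading described above (a median falling outside the other group's box) can be checked numerically. The sketch below is illustrative only: the helper names `box_summary` and `median_outside_box` are hypothetical, the quartile method (`inclusive`) is an assumption, and the two lists are made-up values rather than the study's data.

```python
import statistics

def box_summary(values):
    """Return (Q1, median, Q3), the three values that define the box of a box plot."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q1, median, q3

def median_outside_box(group_a, group_b):
    """True if either group's median lies outside the other group's Q1..Q3 box."""
    q1_a, med_a, q3_a = box_summary(group_a)
    q1_b, med_b, q3_b = box_summary(group_b)
    a_outside_b = med_a < q1_b or med_a > q3_b
    b_outside_a = med_b < q1_a or med_b > q3_a
    return a_outside_b or b_outside_a

# Made-up charges for illustration; not values from the study.
smoker_charges = [20, 25, 30, 35, 40]
non_smoker_charges = [4, 6, 8, 10, 12]
print(median_outside_box(smoker_charges, non_smoker_charges))
```

This is a visual rule of thumb for spotting a likely group difference, not a formal significance test.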

Conclusion – Gradient Boosting appears to perform best among the three (3) regression models. K-Fold Cross-Validation indicated that all three (3) regression models perform well. Moreover, Exploratory Data Analysis (EDA) using a box plot of descriptive statistics indicates that the highest charges are associated with the smoker feature.

Recommendations – The Gradient Boosting model is recommended for predicting the cost of medical insurance.

Research Implications – Utilizing an accurate regression model to predict medical costs can aid medical insurance organizations in prioritizing the allocation of limited care management resources as it plays a role in the development of insurance policies.

Author Biographies

Jonelle Angelo S. Cenita, Richwell Colleges, Incorporated, Philippines

Jonelle Angelo S. Cenita is a college instructor at Richwell Colleges, Incorporated. He is currently enrolled in the Doctor of Information Technology program at La Consolacion University Philippines and holds a Master of Science in Information Technology and a Bachelor of Science in Information Technology.

Paul Richie F. Asuncion, Polytechnic College of Botolan, Philippines

Paul Richie F. Asuncion is a faculty member of Polytechnic College of Botolan in Botolan, Zambales, and is currently designated as the dean of the Institute of Computing Studies. He is currently pursuing a Doctor in Information Technology degree at La Consolacion University Philippines in Malolos, Bulacan. He obtained his master's degree at the University of the Philippines Open University under the Master of Information Systems program. His research interests include information system development, data mining, artificial intelligence, and topics related to IT education administration.

Jayson M. Victoriano, Bulacan State University, Philippines

Dr. Jayson M. Victoriano is the Program Chair of BS Data Science at Bulacan State University-Sarmiento Campus. He is also a member of the National Research Council of the Philippines and the current Director for Research and Innovation for External Campuses at the same university.

Published
2023-04-20
How to Cite
CENITA, Jonelle Angelo S.; ASUNCION, Paul Richie F.; VICTORIANO, Jayson M. Performance Evaluation of Regression Models in Predicting the Cost of Medical Insurance. International Journal of Computing Sciences Research, [S.l.], v. 7, p. 2052-2065, apr. 2023. ISSN 2546-115X. Available at: <//stepacademic.net/ijcsr/article/view/417>. Date accessed: 19 apr. 2024.
Section
Special Issue: IRCCETE 2023