Here is the full reference list; because of space constraints, we were not able to include it in the report.
[1] Virani SS, Alonso A, Benjamin EJ, et al; on behalf of the American Heart Association Council on Epidemiology and Prevention Statistics Committee and Stroke Statistics Subcommittee. Heart disease and stroke statistics-2020 update: a report from the American Heart Association. Circulation. 2020;141:e1-e458. doi: 10.1161/CIR.0000000000000757.
[2] NHANES - About the National Health and Nutrition Examination Survey. Centers for Disease Control and Prevention. Available at: https://wwwn.cdc.gov/Nchs/Nhanes/search/datapage.aspx?Component=Questionnaire&CycleBeginYear=2015.
[3] U.S. Healthcare Data. Kaggle. Available at: https://www.kaggle.com/datasets/maheshdadhich/us-healthcare-data?select=Nhanes_2013_2014.csv.
[4] Likebupt. SMOTE - Azure Machine Learning, Azure Machine Learning | Microsoft Learn. Available at: https://learn.microsoft.com/en-us/azure/machine-learning/component-reference/smote
[5] Narkhede, S. (2021) Understanding AUC - ROC Curve, Medium. Towards Data Science. Available at: https://towardsdatascience.com/understanding-auc-roc-curve-68b2303cc9c5
[6] Lombardo, L., Cama, M., Conoscenti, C., Märker, M., & Rotigliano, E. (2015). Binary logistic regression versus stochastic gradient boosted decision trees in assessing landslide susceptibility for multiple-occurring landslide events: application to the 2009 storm event in Messina (Sicily, southern Italy). Natural Hazards, 79(3), 1621-1648.
[7] Zhou, Z. H., & Yu, Y. (2005). Adapt bagging to nearest neighbor classifiers. Journal of Computer Science and Technology, 20(1), 48-54.
[8] Tu, M. C., Shin, D., & Shin, D. (2009, December). A comparative study of medical data classification methods based on decision tree and bagging algorithms. In 2009 Eighth IEEE International Conference on Dependable, Autonomic and Secure Computing (pp. 183-187). IEEE.
[9] Acuna, E., & Rodriguez, C. (2004). The treatment of missing values and its effect on classifier accuracy. In Classification, clustering, and data mining applications (pp. 639-647). Springer, Berlin, Heidelberg.
[10] Windeatt, T. (2006). Accuracy/diversity and ensemble MLP classifier design. IEEE Transactions on Neural Networks, 17(5), 1194-1211.
[11] Zaytsev, Y. V., & Morrison, A. (2014). CyNEST: a maintainable Cython-based interface for the NEST simulator. Frontiers in Neuroinformatics, 8, 23.
[12] Palach, J. (2014). Parallel programming with Python. Packt Publishing Ltd.
[13] pola-rs. pola-rs/polars: Fast multi-threaded, hybrid-streaming DataFrame library in Rust | Python | Node.js. GitHub. Available at: https://github.com/pola-rs/polars
[14] Mehrtash et al. How to compare ROC AUC scores of different binary classifiers and assess statistical significance in Python? (P-value, confidence interval), Stack Overflow. Available at: https://stackoverflow.com/questions/52373318/how-to-compare-roc-auc-scores-of-different-binary-classifiers-and-assess-statist
[15] Centers for Disease Control and Prevention (2017) National Health and Nutrition Examination Survey, Kaggle. Available at: https://www.kaggle.com/datasets/cdc/national-health-and-nutrition-examination-survey