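Penerapan Metode ADASYN Dalam Mengatasi Imbalanced Data Untuk Klasifikasi Penyakit Stroke Menggunakan Support Vector Machine [Application of the ADASYN Method to Address Imbalanced Data in Stroke Classification Using Support Vector Machine]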
DOI: https://doi.org/10.47065/bulletincsr.v5i4.612
Keywords: Adaptive Synthetic Sampling Approach; Imbalanced Data; K-fold Cross Validation; Stroke; Support Vector Machine
Abstract
Stroke is one of the leading causes of death and disability worldwide, so classification models that support early and accurate diagnosis are essential. This study implements the Support Vector Machine (SVM) algorithm with three types of kernel, namely linear, polynomial, and Radial Basis Function (RBF), to classify stroke data. The Adaptive Synthetic Sampling (ADASYN) method is used to address class imbalance, and models are trained and evaluated with 5-fold cross-validation to obtain stable and reliable results. The findings indicate that ADASYN improves the model’s sensitivity to stroke cases (the minority class), as reflected by higher recall and F1-score, at the cost of a slight decrease in overall accuracy, a common trade-off when handling imbalanced data. After ADASYN, the linear kernel achieved the best performance, with an average AUC-ROC of 0.8333, recall of 0.7827, and F1-score of 0.2181 for the stroke class. Although the F1-score remains relatively low, it improved over the pre-ADASYN result, indicating better detection of stroke cases. The implementation was carried out in Google Colab, which also supported efficient data processing and visualization. Overall, the results demonstrate that combining SVM with ADASYN is effective in enhancing the model’s sensitivity to the minority class and is well suited to medical data classification tasks, particularly early stroke diagnosis using machine learning.
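The gap between the reported recall (0.7827) and F1-score (0.2181) can be read through the F1 definition, F1 = 2PR / (P + R). The sketch below solves that identity for precision P. It is an illustration only: the abstract does not report precision, the function name `implied_precision` is hypothetical, and because the reported figures are averages over five folds, the identity holds only approximately for them.

```python
def implied_precision(f1: float, recall: float) -> float:
    """Solve F1 = 2PR / (P + R) for P, giving P = F1 * R / (2R - F1)."""
    denom = 2 * recall - f1
    if denom <= 0:
        raise ValueError("F1 cannot be greater than or equal to 2 * recall")
    return f1 * recall / denom


# Values reported in the abstract for the linear kernel after ADASYN.
recall = 0.7827
f1 = 0.2181

# Assumption: treating the fold-averaged values as if they came from
# a single confusion matrix, which is only an approximation.
precision = implied_precision(f1, recall)
print(f"implied precision: {precision:.3f}")  # implied precision: 0.127
```

Under that assumption, a precision of roughly 0.13 would mean that most cases flagged as stroke are false positives, which is consistent with the abstract's description of higher sensitivity to the minority class alongside a relatively low F1-score.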
Copyright (c) 2025 Alwaliyanto, Siska Kurnia Gusti, Iis Afrianty, Fadhilah Syafria

This work is licensed under a Creative Commons Attribution 4.0 International License.