Performance of the IndoBERT Fine-Tuning Method for Multi-Class Emotion Classification on Informal Indonesian Text

Haikal Fawwaz Karim; Adityo Permana Wibowo

doi:10.47065/bulletincsr.v6i1.850

Authors

Haikal Fawwaz Karim Universitas Teknologi Yogyakarta, Sleman, Indonesia
Adityo Permana Wibowo Universitas Teknologi Yogyakarta, Sleman, Indonesia

DOI:

https://doi.org/10.47065/bulletincsr.v6i1.850

Keywords:

Emotion Classification; IndoBERT; Fine-Tuning; Informal Text; Twitter

Abstract

Automatic emotion analysis of informal Indonesian text is challenging because of high linguistic variation and the widespread use of slang and abbreviations. This research develops and evaluates an emotion classification model that can serve as a core component of various Natural Language Processing (NLP) applications. The proposed method fine-tunes the pre-trained language model IndoBERT to classify posts from the social media platform Twitter (X) into five emotion classes: anger, fear, happy, love, and sadness. A custom dataset of 4,940 Twitter posts was built through targeted scraping and statistically validated labeling to ensure data relevance and class balance. After a comprehensive text preprocessing stage, including normalization with a custom abbreviation dictionary and stemming, the fine-tuned model achieved an accuracy of 94% and a weighted-average F1-score of 0.94 on the test data. Learning curve analysis indicates that the model did not overfit and generalizes well. These results show that fine-tuning IndoBERT is an effective and reliable approach to emotion classification for informal Indonesian text.
