Implementasi Langchain dan Large Language Models Dalam Automatic Question Generation Untuk Computer Assisted Test
DOI:
https://doi.org/10.47065/bulletincsr.v5i4.558
Keywords:
Automatic Question Generation; Computer Assisted Test; Large Language Models; LangChain; GPT-4o
Abstract
Advances in Artificial Intelligence (AI), particularly Large Language Models (LLMs), open new opportunities for transforming educational assessment systems. This study implements the LangChain framework integrated with an LLM to build an Automatic Question Generation (AQG) system within a Computer Assisted Test (CAT) platform, using eleventh-grade Biology subject matter as a case study. The methodology comprises data collection from PDF-based instructional materials, text embedding indexed with Facebook AI Similarity Search (FAISS) as the knowledge base, and automatic question generation with the GPT-4o model. The system uses a microservices architecture comprising frontend and backend services built with the Next.js, FastAPI, and Express.js frameworks. The system was evaluated using a User Acceptance Test (UAT) and the DeepEval framework. The results show a teacher satisfaction rate of 92.7% and a positive student response rate of 67.5%. The DeepEval assessment reported average scores of 3.69% for hallucination, 97.44% for contextual precision, 83.30% for contextual relevancy, 70.63% for answer relevancy, and 92.47% for prompt alignment. These findings indicate that integrating LangChain with an LLM is effective in generating contextually accurate and relevant questions, although answer relevancy still needs improvement. The study is expected to provide an efficient solution for digital educational assessment and to contribute to future developments in educational AI.
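The retrieve-then-generate flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the bag-of-words `embed` function stands in for a real embedding model, the brute-force `retrieve` function stands in for a FAISS index, and `build_prompt` only assembles the text that would be sent to a model such as GPT-4o. All function names, the sample chunks, and the prompt wording are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query.

    A vector index such as FAISS would replace this brute-force scan.
    """
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(topic, context_chunks, n_questions=3):
    """Assemble a question-generation prompt grounded in retrieved context."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return (
        f"Using only the context below, write {n_questions} multiple-choice "
        f"questions about {topic} for eleventh-grade Biology.\n"
        f"Context:\n{context}"
    )


# Hypothetical text chunks, as if extracted from a PDF of teaching material.
chunks = [
    "The cell membrane regulates the transport of substances into and out of the cell.",
    "Mitochondria produce ATP through cellular respiration.",
    "The skeletal system supports the body and protects internal organs.",
]

top = retrieve("How do mitochondria produce ATP?", chunks, k=1)
prompt = build_prompt("cellular respiration", top)
print(prompt)
```

Restricting the prompt to retrieved context is the design choice that metrics such as hallucination and contextual relevancy are meant to check: the generated questions should be traceable to the supplied material rather than to the model's general knowledge.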
Copyright (c) 2025 Novri Rahman, Nazruddin Safaat Harahap, Muhammad Affandes, Pizaini

This work is licensed under a Creative Commons Attribution 4.0 International License.