Paper Details
Abstract
Depression is a major concern among university students, yet it often remains undiagnosed due to the limitations of traditional categorical assessment tools. Advances in Natural Language Processing (NLP) enable alternative approaches with improved diagnostic accuracy. In this study, we propose a hybrid transfer-learning model that combines a MentalBERT-based Sentence-BERT (SBERT) encoder for semantic encoding with a Bidirectional Long Short-Term Memory (Bi-LSTM) network for sequence regression, with the aim of predicting Patient Health Questionnaire-9 (PHQ-9) scores from the free-text responses of 250 students. The model is evaluated using both regression and classification metrics. It achieves strong regression performance (MAE = 1.453, RMSE = 1.827), and its classification accuracy, precision, recall, and F1-score each reach approximately 98.8%. Ablation studies identify the Bi-LSTM as the most significant contributor to the model's performance. Benchmarking against state-of-the-art transformers further confirms the robustness of the MentalBERT-based SBERT encoder, highlighting its practical utility for automated depression screening.
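To illustrate how a single set of predicted PHQ-9 scores can be assessed with both regression and classification metrics, the following is a minimal sketch rather than the evaluation code used in the study. It assumes that predicted scores are rounded and mapped to the standard PHQ-9 severity bands (0-4, 5-9, 10-14, 15-19, 20-27) before classification accuracy is computed; the abstract does not state whether these bands or a binary cutoff were used. The function names `severity` and `evaluate` and all score values are hypothetical.

```python
import math

# Standard PHQ-9 severity bands (upper bound inclusive); total scores range 0-27.
PHQ9_BANDS = [
    (4, "minimal"),
    (9, "mild"),
    (14, "moderate"),
    (19, "moderately severe"),
    (27, "severe"),
]


def severity(score):
    """Map a (possibly fractional) PHQ-9 score to a severity band."""
    # Round to the nearest integer, then clamp to the valid 0-27 range.
    s = min(27, max(0, round(score)))
    for upper, label in PHQ9_BANDS:
        if s <= upper:
            return label


def evaluate(y_true, y_pred):
    """Return MAE and RMSE on raw scores plus accuracy on severity bands."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    correct = sum(severity(t) == severity(p) for t, p in zip(y_true, y_pred))
    return {"mae": mae, "rmse": rmse, "accuracy": correct / n}


# Made-up illustrative values, not data from the study.
y_true = [3, 8, 12, 17, 22]
y_pred = [4.0, 7.0, 13.0, 14.0, 21.0]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

In this toy input, the fourth prediction falls in a different band from its reference score, so a small regression error near a band boundary still counts as a classification error. This is one reason the two kinds of metric are worth reporting side by side.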