Abstract
Vietnamese e-commerce platforms have witnessed a significant number of spam reviews, which undermine consumer trust and distort product evaluations. Spam review detection models for Vietnamese in particular remain limited. In this study, we propose a modern approach to tackling this issue using large language models fine-tuned via low-rank adaptation (LoRA). The base models chosen were Meta’s Llama 3.1 8B and 70B Instruct and Llama 3.2 1B Instruct, which were fine-tuned and compared against PhoBERT on the ViSpamReviewsV2 dataset. In terms of accuracy, the fine-tuned Llama models outperform PhoBERT in both binary and multi-class classification across all model sizes. While PhoBERT’s F1 score on binary classification is slightly higher, the fine-tuned Llama models show noticeable improvements in macro-F1 for multi-class classification. The results also demonstrate that fine-tuning substantially improved the performance of the pre-trained Llama models. These findings highlight the potential of modern LLMs in specialized tasks for low-resource languages, especially when given high-quality data, thorough pre-training, and effective fine-tuning.
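As a concrete illustration of the fine-tuning setup the abstract describes, the sketch below attaches LoRA adapters to the smallest of the listed models using Hugging Face transformers and peft. The rank, scaling factor, target modules, and training framing are illustrative assumptions, not the paper's reported configuration.

```python
# Minimal sketch of LoRA fine-tuning for a Llama model, assuming the
# Hugging Face transformers + peft stack. Hyperparameters below are
# illustrative, not the paper's exact setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Smallest base model listed in the abstract (gated repo; requires HF access).
base_model = "meta-llama/Llama-3.2-1B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.bfloat16)

# Low-rank adaptation: freeze the base weights and train only small
# rank-r update matrices injected into the attention projections.
lora_config = LoraConfig(
    r=16,                                  # rank of the update matrices (assumed)
    lora_alpha=32,                         # LoRA scaling factor (assumed)
    target_modules=["q_proj", "v_proj"],   # which projections get adapters (assumed)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights are trainable
```

Because only the low-rank update matrices are trained, the memory footprint stays small, which is what makes adapting models up to the 70B scale tractable on limited hardware.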