VCED: Fine-Tuning a Two-Stage RAG Architecture for High-Fidelity Cosmetics Consultation in Vietnamese

Paper Information

Paper ID: FETC25-082

Type: Full Paper

Date: Sep 10, 2025

Status: Accepted

Paper Details

Abstract

The cosmetics industry lacks personalized, science-backed advisory systems, particularly for low-resource languages like Vietnamese. This paper addresses this gap by introducing and evaluating a novel two-stage Retrieval-Augmented Generation (RAG) system. Our approach leverages a state-of-the-art architecture combining a high-recall Bi-Encoder Retriever with a high-precision Cross-Encoder Re-ranker, forming a robust pipeline for specialized information retrieval. To enable this research, we created the Vietnamese Cosmetics E-commerce Dataset (VCED), a new, publicly available corpus of 9,173 canonical products derived from 11,609 raw e-commerce listings via a rigorous "Funnel Strategy" for data cleaning and entity resolution. The system's core components are language-specific models fine-tuned for the Vietnamese context. Experimental results demonstrate the decisive advantage of this specialization, with the Retriever achieving 99.92% triplet accuracy and the Re-ranker reaching 99.74% Average Precision. Most critically, end-to-end evaluation confirms that the re-ranking stage is indispensable; its inclusion more than doubled the Mean Reciprocal Rank (MRR) to 0.585 and improved the Hits@1 score from zero to 0.473. Successfully deployed and validated as a Facebook Messenger chatbot, this work not only establishes a new performance benchmark for domain-specific conversational AI in Vietnamese but also provides a production-ready blueprint for applying advanced RAG architectures in non-English, low-resource environments.
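To make the end-to-end metrics in the abstract concrete, the following is a minimal sketch of how MRR and Hits@k are conventionally computed before and after a re-ranking stage. It is not the authors' evaluation code: the product IDs, the candidate lists, and the re-ranker scores in `pair_scores` are invented toy values, and the `rerank` helper is a hypothetical stand-in for a Cross-Encoder that scores each query-candidate pair.

```python
from typing import Callable, Dict, List, Sequence


def reciprocal_rank(ranked_ids: Sequence[str], relevant_id: str) -> float:
    """Return 1/rank of the relevant item, or 0.0 if it is not in the list."""
    for position, doc_id in enumerate(ranked_ids, start=1):
        if doc_id == relevant_id:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(rankings: List[Sequence[str]], gold: List[str]) -> float:
    """Average the reciprocal rank over all queries."""
    return sum(reciprocal_rank(r, g) for r, g in zip(rankings, gold)) / len(gold)


def hits_at_k(rankings: List[Sequence[str]], gold: List[str], k: int = 1) -> float:
    """Fraction of queries whose relevant item appears in the top k results."""
    return sum(1 for r, g in zip(rankings, gold) if g in r[:k]) / len(gold)


def rerank(candidates: Sequence[str], score: Callable[[str], float]) -> List[str]:
    """Stage 2 stand-in: reorder stage-1 candidates by a pairwise relevance score."""
    return sorted(candidates, key=score, reverse=True)


# Toy stage-1 output (assumed): the relevant product is retrieved but not ranked first.
retrieved: List[List[str]] = [["p3", "p1", "p2"], ["p5", "p4", "p6"]]
gold: List[str] = ["p1", "p4"]

# Hypothetical re-ranker scores for each query's candidates (illustrative only).
pair_scores: List[Dict[str, float]] = [
    {"p1": 0.9, "p2": 0.1, "p3": 0.4},
    {"p4": 0.3, "p5": 0.2, "p6": 0.8},
]
reranked = [rerank(c, s.__getitem__) for c, s in zip(retrieved, pair_scores)]

print("retriever only:", mean_reciprocal_rank(retrieved, gold), hits_at_k(retrieved, gold))
print("with re-ranker:", mean_reciprocal_rank(reranked, gold), hits_at_k(reranked, gold))
```

In this toy setup the retriever alone never places the relevant product first, so Hits@1 is zero even though recall is perfect; re-ranking moves one of the two relevant products to the top. The figures it prints are properties of the invented data and say nothing about the paper's reported results.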

Keywords

Retrieval-Augmented Generation (RAG); Two-Stage Information Retrieval; Semantic Search; Conversational AI; Low-Resource Language

Contact Information

Le Anh Tien (Corresponding Author)

Affiliation: Ministry of Science and Technology

Country: Vietnam

Email: daoquangthuyukb@gmail.com

Phone: 0389081824

Important dates

Submission Deadline: July 31, 2025 (Firm Deadline; extended from June 30, 2025)
Notification of Acceptance: August 15, 2025
Camera Ready Submission: September 10, 2025
Registration Deadline and Fee Payment: September 15, 2025
Conference Dates: October 25-26, 2025

Conference Fee

International Authors/Listeners

Registration Type	Fee (International)	Gala Dinner Included	Academic Tour Included
Author (Regular)	300 USD	Yes	Yes
Author (Student)	250 USD	Yes	Yes
Author (Industry/Poster)	300 USD	Yes	Yes
Listener	100 USD	Yes	Yes

Domestic Authors/Listeners

Registration Type	Fee (Vietnam)	Gala Dinner Included	Academic Tour Included
Author (Regular)	5,000,000 VND	Yes	Yes
Author (Student)	4,500,000 VND	Yes	Yes
Author (Industry/Poster)	5,000,000 VND	Yes	Yes
Listener	1,000,000 VND	Yes	Yes

Contact

Website: science.fpt.edu.vn/fetc
Phone: +84 2466549806
Email: FETC@fe.edu.vn

Keynote Speakers

Prof. Natalia Loukachevitch
Lomonosov Moscow State University (MSU), Russia
Prof. Long Tran-Thanh
University of Warwick, United Kingdom
Dr. Long Duong
Oracle, Australia

Conference Themes

AI Solution for Developing Countries

Data Availability and Quality
Energy Efficiency and Optimization
Edge Computing and Decentralization
NLP for Low-resource Languages
Image and Video Understanding
Machine Learning Applications
