Paper Details

Abstract

In this work, we present a Tiny Transformer architecture optimized for real-time gloss-to-text translation. Unlike conventional Transformer models, our approach explicitly addresses the structural and grammatical divergences between sign language glosses and natural language text, enabling efficient translation in low-resource settings. To bridge this gap, we introduce a pre-processing pipeline that includes gloss prefix removal, wh-question reordering, auxiliary verb insertion, and syntactic restructuring. These steps help the model learn more effective mappings from gloss sequences to fluent English sentences. Experimental evaluations show that our Tiny Transformer achieves competitive BLEU score improvements of up to 5 points compared to the original Transformer, while reducing inference latency by half (35 ms on average) relative to the 4-layer model. This efficiency makes it highly suitable for deployment on edge devices and in real-time applications.
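
To make the pre-processing steps named above concrete, the sketch below shows what such a pipeline might look like. It is a minimal illustration, not the paper's actual implementation: the prefix list, wh-word set, and function names (remove_gloss_prefixes, reorder_wh_question, insert_auxiliary) are all assumed for the example.

```python
# Illustrative sketch of gloss pre-processing; all rules and names are assumptions.

WH_WORDS = {"WHO", "WHAT", "WHERE", "WHEN", "WHY", "HOW"}   # assumed wh-gloss tags
GLOSS_PREFIXES = ("DESC-", "IX-", "CL-")                    # assumed annotation prefixes

def remove_gloss_prefixes(tokens):
    """Strip annotation prefixes such as 'IX-' from each gloss token."""
    out = []
    for tok in tokens:
        for pre in GLOSS_PREFIXES:
            if tok.startswith(pre):
                tok = tok[len(pre):]
                break
        out.append(tok)
    return out

def reorder_wh_question(tokens):
    """Move a sentence-final wh-gloss to the front, mirroring English word order."""
    if tokens and tokens[-1] in WH_WORDS:
        return [tokens[-1]] + tokens[:-1]
    return tokens

def insert_auxiliary(tokens):
    """Insert an auxiliary after a fronted wh-word (rough heuristic for illustration)."""
    if tokens and tokens[0] in WH_WORDS:
        return [tokens[0], "DO"] + tokens[1:]
    return tokens

def preprocess(gloss: str) -> str:
    tokens = gloss.split()
    tokens = remove_gloss_prefixes(tokens)
    tokens = reorder_wh_question(tokens)
    tokens = insert_auxiliary(tokens)
    return " ".join(tokens)

# Example: "IX-YOU LIVE WHERE" -> "WHERE DO YOU LIVE"
print(preprocess("IX-YOU LIVE WHERE"))
```

A rule-based stage like this normalizes gloss order toward English syntax before the sequences are fed to the Tiny Transformer, which is what lets a small model learn the remaining mapping efficiently.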

Keywords
sign language translation; tiny Transformer architecture; Gloss2Text; latency
Contact Information
NGUYEN XUAN SAM (Corresponding Author)
Swinburne Vietnam, FPT University, Vietnam
0969938284

All Authors (1)

NGUYEN XUAN SAM (Corresponding Author)

Affiliation: Swinburne Vietnam, FPT University

Country: Vietnam

Email: samnx2@fpt.edu.vn

Phone: 0969938284