Abstract
The COVID-19 pandemic posed enormous challenges for medical experts in identifying abnormalities in radiography images due to the lack of pre-labeled training data. Consequently, the demand for comprehensive labeled datasets and effective analytical methods for radiographic analysis has increased dramatically, particularly in scenarios involving previously unseen diseases. In this study, we propose DenseTransXR, a hybrid architecture that integrates a DenseNet-121 feature extractor with the global contextual modeling of Transformer encoders via Multi-Head Self-Attention. Our model achieves an AUC of 0.812 for multi-label abnormality detection on NIH ChestX-ray14, outperforming both CNN-based and hybrid baselines, as well as an AUC of 0.755 and a recall of 0.919 for zero-shot COVID-19 detection on the COVIDx CXR-4 dataset, demonstrating strong generalization to previously unseen diseases. These results highlight the potential of hybrid Transformer models to improve diagnostic performance and provide scalable solutions for future pandemics.
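To make the hybrid design described above concrete, the following is a minimal sketch of a DenseNet-121 + Transformer-encoder model in PyTorch. This is not the authors' released code: the class name, embedding size, head count, layer count, and pooling strategy are all illustrative assumptions; only torchvision's densenet121 backbone and the standard nn.TransformerEncoder are taken as given.

```python
import torch
import torch.nn as nn
from torchvision.models import densenet121

class DenseTransXRSketch(nn.Module):
    """Illustrative DenseNet-121 + Transformer-encoder hybrid.

    All dimensions and hyperparameters are assumptions for demonstration;
    the paper's actual DenseTransXR configuration may differ.
    """

    def __init__(self, num_classes: int = 14, d_model: int = 256,
                 num_heads: int = 8, num_layers: int = 2):
        super().__init__()
        # DenseNet-121 convolutional trunk as the local feature extractor
        # (outputs 1024 channels on a 7x7 grid for 224x224 inputs).
        self.backbone = densenet121(weights=None).features
        # 1x1 conv projects CNN channels to the Transformer embedding size.
        self.proj = nn.Conv2d(1024, d_model, kernel_size=1)
        # Transformer encoder adds global context via multi-head
        # self-attention over the 7x7 = 49 spatial tokens.
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer,
                                             num_layers=num_layers)
        # One logit per finding for multi-label classification
        # (apply sigmoid, not softmax, at inference time).
        self.head = nn.Linear(d_model, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.proj(self.backbone(x))        # (B, d_model, H, W)
        tokens = feats.flatten(2).transpose(1, 2)  # (B, H*W, d_model)
        tokens = self.encoder(tokens)              # global self-attention
        pooled = tokens.mean(dim=1)                # average over tokens
        return self.head(pooled)                   # multi-label logits

if __name__ == "__main__":
    model = DenseTransXRSketch()
    logits = model(torch.randn(2, 3, 224, 224))
    print(logits.shape)  # torch.Size([2, 14])
```

The key structural point the sketch captures is the division of labor: the CNN trunk supplies local texture features, while self-attention over the flattened spatial grid lets every image region attend to every other, which is the global contextual modeling the abstract attributes to the Transformer encoders.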