Document Type
Article
Publication Date
12-20-2024
Original Citation
Zhao Y, Agyemang D, Liu Y, Mahoney J, Li S. Quantifying interpretation reproducibility in Vision Transformer models with TAVAC. Sci Adv. 2024;10(51):eabg0264.
Keywords
JGM; JMG; Humans; Reproducibility of Results; Deep Learning; Algorithms; Breast Neoplasms; Image Processing, Computer-Assisted; Female; Neural Networks, Computer; Image Interpretation, Computer-Assisted
JAX Source
Sci Adv. 2024;10(51):eabg0264.
ISSN
2375-2548
PMID
39705362
DOI
https://doi.org/10.1126/sciadv.abg0264
Grant
S.L. is supported by the following grants from the National Institutes of Health: R35GM133562 (2019-2024), U01HG013175 (2023-2028), U01CA271830 (2021-2026), and R56AG071766 (2022-2024). S.L. is a recipient of the Career Development Award (1398-25) of the Leukemia & Lymphoma Society (2024-2029). M.M. is supported by R01GM141309. M.M. and S.L. are supported by U54AG079753 (2022-2026). D.A.’s Summer Student Fellowship was supported by the National Cancer Institute (award R25CA233420) and the Jackson Laboratory’s Summer Student Program fund.
Abstract
Deep learning algorithms can extract meaningful diagnostic features from biomedical images, promising improved patient care in digital pathology. Vision Transformer (ViT) models capture long-range spatial relationships and offer robust prediction power and better interpretability for image classification tasks than convolutional neural network models. However, limited annotated biomedical imaging datasets can cause ViT models to overfit, leading to false predictions due to random noise. To address this, we introduce Training Attention and Validation Attention Consistency (TAVAC), a metric for evaluating ViT model overfitting and quantifying interpretation reproducibility. TAVAC compares high-attention regions between training and testing. We tested TAVAC on four public image classification datasets and two independent breast cancer histological image datasets. Overfitted models showed significantly lower TAVAC scores. TAVAC also distinguishes off-target from on-target attention and measures interpretation generalization at a fine-grained cellular level. Beyond diagnostics, TAVAC enhances interpretative reproducibility in basic research, revealing critical spatial patterns and cellular structures in biomedical and general nonbiomedical images.
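The abstract does not give the TAVAC formula, so the following is only a minimal sketch of the general idea of comparing attention between training and validation. It assumes that each image has two per-patch attention maps, one from a model that saw the image during training and one from a model that held it out, and that consistency is scored with a Pearson correlation. The function names (`pearson`, `attention_consistency`, `mean_consistency`) and the choice of correlation are illustrative assumptions, not the authors' implementation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences of floats."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat attention map carries no spatial signal
    return cov / sqrt(var_x * var_y)


def attention_consistency(train_attn, val_attn):
    """Hypothetical per-image score: correlation between the attention map
    obtained when the image was in the training set and the map obtained
    when it was held out. Both maps are 2D lists of per-patch weights."""
    flat_train = [a for row in train_attn for a in row]
    flat_val = [a for row in val_attn for a in row]
    return pearson(flat_train, flat_val)


def mean_consistency(pairs):
    """Average the per-image score over (train_attn, val_attn) pairs."""
    scores = [attention_consistency(t, v) for t, v in pairs]
    return sum(scores) / len(scores)


# Toy 2x2 attention maps (made-up numbers, for illustration only).
train_map = [[0.1, 0.7], [0.1, 0.1]]
val_similar = [[0.2, 0.6], [0.1, 0.1]]    # attends to the same patch
val_different = [[0.6, 0.1], [0.2, 0.1]]  # attends to a different patch

print(attention_consistency(train_map, val_similar))
print(attention_consistency(train_map, val_different))
```

Under these assumptions, a model whose attention lands on the same patches whether or not it has seen the image scores close to 1, while a model whose attention shifts on unseen images scores lower, which is the kind of drop the abstract reports for overfitted models.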
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.