Assessing the Performance of Models from the 2022 RSNA Cervical Spine Fracture Detection Competition at a Level I Trauma Center.
Document Type
Article
Publication Date
11-1-2024
Original Citation
Hu Z, Patel M, Ball R, Lin H, Prevedello L, Naseri M, Mathur S, Moreland R, Wilson J, Witiw C, Yeom K, Ha Q, Hanley D, Seferbekov S, Chen H, Singer P, Henkel C, Pfeiffer P, Pan I, Sheoran H, Li W, Flanders A, Kitamura F, Richards T, Talbott J, Sejdić E, Colak E. Assessing the Performance of Models from the 2022 RSNA Cervical Spine Fracture Detection Competition at a Level I Trauma Center. Radiol Artif Intell. 2024;6(6):e230550
Keywords
JMG, Humans, Male, Cervical Vertebrae, Middle Aged, Spinal Fractures, Trauma Centers, Tomography, X-Ray Computed, Retrospective Studies, Female, Sensitivity and Specificity, Adult, Contrast Media
JAX Source
Radiol Artif Intell. 2024;6(6):e230550
ISSN
2638-6100
PMID
39298563
DOI
https://doi.org/10.1148/ryai.230550
Abstract
Purpose To evaluate the performance of the top models from the RSNA 2022 Cervical Spine Fracture Detection challenge on a clinical test dataset of both noncontrast and contrast-enhanced CT scans acquired at a level I trauma center.
Materials and Methods Seven top-performing models in the RSNA 2022 Cervical Spine Fracture Detection challenge were retrospectively evaluated on a clinical test set of 1828 CT scans (from 1829 series: 130 positive for fracture, 1699 negative for fracture; 1308 noncontrast, 521 contrast enhanced) from 1779 patients (mean age, 55.8 years ± 22.1 [SD]; 1154 [64.9%] male patients). Scans were acquired without exclusion criteria over 1 year (January-December 2022) from the emergency department of a neurosurgical and level I trauma center. Model performance was assessed using area under the receiver operating characteristic curve (AUC), sensitivity, and specificity. False-positive and false-negative cases were further analyzed by a neuroradiologist.
Results Although all seven models showed decreased performance on the clinical test set compared with the challenge dataset, the models maintained high performances. On noncontrast CT scans, the models achieved a mean AUC of 0.89 (range: 0.79-0.92), sensitivity of 67.0% (range: 30.9%-80.0%), and specificity of 92.9% (range: 82.1%-99.0%). On contrast-enhanced CT scans, the models had a mean AUC of 0.88 (range: 0.76-0.94), sensitivity of 81.9% (range: 42.7%-100.0%), and specificity of 72.1% (range: 16.4%-92.8%). The models identified 10 fractures missed by radiologists. False-positive cases were more common in contrast-enhanced scans and observed in patients with degenerative changes on noncontrast scans, while false-negative cases were often associated with degenerative changes and osteopenia.
Conclusion The winning models from the 2022 RSNA AI Challenge demonstrated a high performance for cervical spine fracture detection on a clinical test dataset, warranting further evaluation for their use as clinical support tools.
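The three metrics named in the abstract (AUC, sensitivity, specificity) can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the eight toy labels and scores are all assumptions made up for illustration, and the abstract does not state which threshold the study used.

```python
# Illustrative sketch only: toy data and the 0.5 threshold are assumptions,
# not values taken from the study.

def sensitivity_specificity(labels, scores, threshold=0.5):
    """Return (sensitivity, specificity) for binary labels (1 = fracture)."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as 0.5).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical scan-level labels and model scores.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]

sens, spec = sensitivity_specificity(toy_labels, toy_scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} "
      f"AUC={auc(toy_labels, toy_scores):.3f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason a model can keep a similar AUC on contrast-enhanced scans while its specificity at a fixed operating point drops.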