Synthetic DNA spike-ins (SDSIs) enable sample tracking and detection of inter-sample contamination in SARS-CoV-2 sequencing workflows.

Kim A Lagerborg
Erica Normandin
Matthew R Bauer
Gordon Adams
Katherine Figueroa
Christine Loreth
Adrianne Gladden-Young
Bennett M Shaw
Leah R Pearlman
Daniel Berenzy, The Jackson Laboratory
Hannah B Dewey, The Jackson Laboratory
Susan Kales, The Jackson Laboratory
Sabrina T Dobbins
Erica S Shenoy
David Hooper
Virginia M Pierce
Kimon C Zachary
Daniel J Park
Bronwyn L MacInnis
Ryan Tewhey, The Jackson Laboratory
Jacob E Lemieux
Pardis C Sabeti
Steven K Reilly
Katherine J Siddle


The global spread and continued evolution of SARS-CoV-2 have driven an unprecedented surge in viral genomic surveillance. Amplicon-based sequencing methods provide a sensitive, low-cost and rapid approach but suffer from a high potential for contamination, which can undermine laboratory processes and results. This challenge will increase with the expanding global production of sequences across a variety of laboratories for epidemiological and clinical interpretation, as well as for genomic surveillance of emerging diseases in future outbreaks. We present SDSI + AmpSeq, an approach that uses 96 synthetic DNA spike-ins (SDSIs) to track samples and detect inter-sample contamination throughout the sequencing workflow. We apply SDSIs to the ARTIC Consortium's amplicon design, demonstrate their utility and efficiency in a real-time investigation of a suspected hospital cluster of SARS-CoV-2 cases and validate them across 6,676 diagnostic samples at multiple laboratories. We establish that SDSI + AmpSeq provides increased confidence in genomic data by detecting and correcting for relatively common yet previously unobserved modes of error, including spillover and sample swaps, without impacting genome recovery.