PEDIA: prioritization of exome data by image analysis.

Tzung-Chien Hsieh
Martin A Mensah
Jean T Pantel
Dione Aguilar
Omri Bar
Allan Bayat
Luis Becerra-Solano
Heidi B Bentzen
Saskia Biskup
Oleg Borisov
Oivind Braaten
Claudia Ciaccio
Marie Coutelier
Kirsten Cremer
Magdalena Danyel
Svenja Daschkey
Hilda David Eden
Koenraad Devriendt
Sandra Wilson
Sofia Douzgou
Dejan Đukić
Nadja Ehmke
Christine Fauth
Björn Fischer-Zirnsak
Nicole Fleischer
Heinz Gabriel
Luitgard Graul-Neumann
Karen W Gripp
Yaron Gurovich
Asya Gusina
Nechama Haddad
Nurulhuda Hajjir
Yair Hanani
Jakob Hertzberg
Konstanze Hoertnagel
Janelle Howell
Ivan Ivanovski
Angela Kaindl
Tom Kamphans
Susanne Kamphausen
Catherine Karimov
Hadil Kathom
Anna Keryan
Alexej Knaus
Sebastian Köhler
Uwe Kornak
Alexander Lavrov
Maximilian Leitheiser
Gholson J Lyon
Elisabeth Mangold
Purificación Marín Reina
Antonio Martinez Carrascal
Diana Mitter
Laura Morlan Herrador
Guy Nadav
Markus Nöthen
Alfredo Orrico
Claus-Eric Ott
Kristen Park
Borut Peterlin
Laura Pölsler
Annick Raas-Rothschild
Linda Randolph
Nicole Revencu
Christina Ringmann Fagerberg
Peter N Robinson, The Jackson Laboratory
Stanislav Rosnev
Sabine Rudnik
Gorazd Rudolf
Ulrich Schatz
Anna Schossig
Max Schubach
Or Shanoon
Eamonn Sheridan
Pola Smirin-Yosef
Malte Spielmann
Eun-Kyung Suk
Yves Sznajer
Christian T Thiel
Gundula Thiel
Alain Verloes
Irena Vrecar
Dagmar Wahl
Ingrid Weber
Korina Winter
Marzena Wiśniewska
Bernd Wollnik
Ming W Yeung
Max Zhao
Na Zhu
Johannes Zschocke
Stefan Mundlos
Denise Horn
Peter M Krawitz

Abstract

PURPOSE: Phenotype information is crucial for the interpretation of genomic variants. So far, it has been accessible to bioinformatics workflows only after being encoded into clinical terms by expert dysmorphologists.

METHODS: Here, we introduce an approach driven by artificial intelligence that uses portrait photographs for the interpretation of clinical exome data. We measured the value added by computer-assisted image analysis to the diagnostic yield on a cohort consisting of 679 individuals with 105 different monogenic disorders. For each case in the cohort we compiled frontal photos, clinical features, and the disease-causing variants, and simulated multiple exomes of different ethnic backgrounds.

RESULTS: The additional use of similarity scores from computer-assisted analysis of frontal photos improved the top 1 accuracy rate for the disease-causing gene by more than 20%, to 89%, and the top 10 accuracy rate by more than 5%, to 99%.

CONCLUSION: Image analysis by deep-learning algorithms can be used to quantify the phenotypic similarity (PP4 criterion of the American College of Medical Genetics and Genomics guidelines) and to advance the performance of bioinformatics pipelines for exome analysis.
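
The top 1 and top 10 accuracy rates reported in the RESULTS are the fraction of cases in which the disease-causing gene appears among the first 1 or first 10 genes of a prioritized list. The following minimal sketch illustrates that metric only. It is not the authors' pipeline: the function names, gene symbols, and scores are hypothetical and chosen purely for illustration.

```python
from typing import Dict, List


def rank_genes(scores: Dict[str, float]) -> List[str]:
    """Order genes from highest to lowest score; ties are broken alphabetically."""
    return sorted(scores, key=lambda gene: (-scores[gene], gene))


def top_k_accuracy(
    ranked_genes_per_case: List[List[str]],
    causal_genes: List[str],
    k: int,
) -> float:
    """Fraction of cases whose disease-causing gene is among the first k ranked genes."""
    if len(ranked_genes_per_case) != len(causal_genes):
        raise ValueError("one causal gene is required per case")
    if k < 1:
        raise ValueError("k must be at least 1")
    if not causal_genes:
        return 0.0
    hits = sum(
        1
        for ranked, causal in zip(ranked_genes_per_case, causal_genes)
        if causal in ranked[:k]
    )
    return hits / len(causal_genes)


# Hypothetical per-case gene scores (assumption: a higher score means a more likely gene).
example_scores = [
    {"NIPBL": 0.9, "SMC1A": 0.4, "KMT2D": 0.1},
    {"KMT2D": 0.3, "KDM6A": 0.7, "ARID1B": 0.5},
    {"ARID1B": 0.8, "SMARCB1": 0.85},
]
# Hypothetical disease-causing gene for each case, in the same order.
example_causal = ["NIPBL", "KMT2D", "ARID1B"]

example_ranked = [rank_genes(scores) for scores in example_scores]

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(f"top-{k} accuracy:", top_k_accuracy(example_ranked, example_causal, k))
```

Comparing this value for rankings produced with and without the image-derived similarity scores gives the kind of improvement summarized in the RESULTS.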