MouseScholar: Evaluating an Image+Text Search System for Biocuration
Document Type
Article
Publication Date
2023
Original Citation
Trabucco JT.
MouseScholar: Evaluating an Image+Text Search System for Biocuration. Proceedings (IEEE Int Conf Bioinformatics Biomed). 2023:1473-80.
Keywords
JMG
JAX Source
Proceedings (IEEE Int Conf Bioinformatics Biomed). 2023:1473-80.
DOI
https://doi.org/10.1109/BIBM58861.2023.10385503
Grant
We thank the biocurators who participated in this study. This work was supported by awards from the U.S. National Institutes of Health (NLM R01LM012527, NCI R01CA258827) and the U.S. National Science Foundation (CNS-1828265, CDSE-1854815).
Abstract
Biocuration is the process of analyzing biological or biomedical articles to organize biological data into data repositories using taxonomies and ontologies. Due to the expanding number of articles and the relatively small number of biocurators, automation is desired to improve the workflow of assessing which articles are worth curating. As figures convey essential information, automatically integrating images may improve curation. In this work, we instantiate and evaluate a first-of-its-kind hybrid image+text document search system for biocuration. The system, MouseScholar, leverages an image modality taxonomy derived in collaboration with biocurators, together with figure segmentation and classifier components as a back end, and a streamlined front-end interface to search and present document results. We formally evaluated the system with ten biocurators on a mouse genome informatics biocuration dataset and collected feedback. The results demonstrate the benefits of blending text and image information when presenting scientific articles for biocuration.