Document Type

Article

Publication Date

7-1-2023

Keywords

JGM, Humans, Pattern Recognition, Automated, COVID-19, Biological Ontologies, Rare Diseases, Machine Learning

JAX Source

Bioinformatics. 2023;39(7)

ISSN

1367-4811

PMID

37389415

DOI

https://doi.org/10.1093/bioinformatics/btad418

Grant

This work was supported by the Monarch Initiative [National Institutes of Health/OD #5R24OD011883]; the Phenomics First Resource, a Center of Excellence in Genomic Science [National Institutes of Health/National Human Genome Research Institute #1RM1HG010860-01]; Illuminating the Druggable Genome by Knowledge Graphs [National Institutes of Health/National Cancer Institute #1U01CA239108-01]; and BioPortal [National Institutes of Health/National Institute of General Medical Sciences U24 GM143402]. J.H.C., J.T.R., N.L.H., M.P.J., S.C., and S.A.T.M. were supported in part by the Director, Office of Science, Office of Basic Energy Sciences, of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. P.N.R. was supported by National Institutes of Health/National Human Genome Research Institute 5U24HG011449-02. A.E.T. was supported by the GenoPhenoEnvo project National Science Foundation 1940330. D.R.U., S.A.T.M., R.M.B., N.L.H., M.C.M.-T., M.A.H., and C.J.M. were supported by National Institutes of Health/National Center for Advancing Translational Sciences through the Biomedical Data Translator Program [OT2TR003449].

Abstract

MOTIVATION: Knowledge graphs (KGs) are a powerful approach for integrating heterogeneous data and making inferences in biology and many other domains, but a coherent solution for constructing, exchanging, and facilitating the downstream use of KGs is lacking.

RESULTS: Here we present KG-Hub, a platform that enables standardized construction, exchange, and reuse of KGs. Features include a simple, modular extract-transform-load pattern for producing graphs compliant with Biolink Model (a high-level data model for standardizing biological data), easy integration of any OBO (Open Biological and Biomedical Ontologies) ontology, cached downloads of upstream data sources, versioned and automatically updated builds with stable URLs, web-browsable storage of KG artifacts on cloud infrastructure, and easy reuse of transformed subgraphs across projects. Current KG-Hub projects span use cases including COVID-19 research, drug repurposing, microbial-environmental interactions, and rare disease research. KG-Hub is equipped with tooling to easily analyze and manipulate KGs. KG-Hub is also tightly integrated with graph machine learning (ML) tools which allow automated graph ML, including node embeddings and training of models for link prediction and node classification.
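As an illustration of the extract-transform-load pattern described above, the following minimal sketch turns a few upstream records into node and edge tables. It is not KG-Hub's own code: the input records, the `EXAMPLE:` identifiers, and the function names `transform` and `to_tsv` are hypothetical, and the column names (`id`, `category`, `name`, `subject`, `predicate`, `object`) are assumed here to follow the KGX-style TSV convention commonly used with Biolink Model.

```python
import csv
import io

# Hypothetical upstream records; a real transform would parse a downloaded source file.
RAW_ROWS = [
    {"gene_id": "EXAMPLE:G1", "gene_name": "gene one",
     "disease_id": "EXAMPLE:D1", "disease_name": "disease one"},
    {"gene_id": "EXAMPLE:G2", "gene_name": "gene two",
     "disease_id": "EXAMPLE:D1", "disease_name": "disease one"},
]

# Assumed column layout, modeled on KGX-style node and edge TSV files.
NODE_COLUMNS = ["id", "category", "name"]
EDGE_COLUMNS = ["subject", "predicate", "object"]


def transform(rows):
    """Map source rows to deduplicated nodes and one edge per row."""
    nodes = {}
    edges = []
    for row in rows:
        # Keyed by identifier so a node shared by several rows is emitted once.
        nodes.setdefault(row["gene_id"], {
            "id": row["gene_id"],
            "category": "biolink:Gene",
            "name": row["gene_name"],
        })
        nodes.setdefault(row["disease_id"], {
            "id": row["disease_id"],
            "category": "biolink:Disease",
            "name": row["disease_name"],
        })
        edges.append({
            "subject": row["gene_id"],
            # Generic Biolink predicate; a real source would justify a more specific one.
            "predicate": "biolink:related_to",
            "object": row["disease_id"],
        })
    return list(nodes.values()), edges


def to_tsv(records, columns):
    """Serialize a list of dicts as tab-separated text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    nodes, edges = transform(RAW_ROWS)
    print(to_tsv(nodes, NODE_COLUMNS))
    print(to_tsv(edges, EDGE_COLUMNS))
```

Keeping the transform as a pure function from source rows to node and edge records is one way to make each source's output reusable as a subgraph in other projects, which is the kind of modularity the abstract describes.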

AVAILABILITY AND IMPLEMENTATION: https://kghub.org.

Comments

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
