Document Type
Article
Publication Date
7-2023
Original Citation
Cappelletti L, Fontana T, Casiraghi E, Ravanmehr V, Callahan TJ, Cano C, Joachimiak MP, Mungall CJ, Robinson P, Reese J, Valentini G. GRAPE for fast and scalable graph processing and random-walk-based embedding. Nat Comput Sci. 2023;3(6):552-68.
Keywords
JGM
JAX Source
Nat Comput Sci. 2023;3(6):552-68.
DOI
https://doi.org/10.1038/s43588-023-00465-8
Grant
This research was supported by the National Center for Gene Therapy and Drugs based on RNA Technology, PNRR-NextGenerationEU program (G43C22001320007), NIH/National Cancer Institute (U01-CA239108-02), Transition Grant Line 1A Project ‘NIMI PARTENARIATI H2020’ (PSR2015-1720GVALE_01), the Common Fund, Office of the Director, National Institutes of Health (U01-CA239108-02), the Monarch Initiative, National Institutes of Health (1R24OD011883-01), Project PID2021-128970OA-I00 by MCIN/AEI/10.13039/501100011033/FEDER, and the Director, Office of Science, Office of Basic Energy Sciences of the US Department of Energy under contract no. DE-AC02-05CH11231.
Abstract
Graph representation learning methods opened new avenues for addressing complex, real-world problems represented by graphs. However, many graphs used in these applications comprise millions of nodes and billions of edges and are beyond the capabilities of current methods and software implementations. We present GRAPE (Graph Representation Learning, Prediction and Evaluation), a software resource for graph processing and embedding that is able to scale with big graphs by using specialized and smart data structures, algorithms, and a fast parallel implementation of random-walk-based methods. Compared with state-of-the-art software resources, GRAPE shows an improvement of orders of magnitude in empirical space and time complexity, as well as competitive edge- and node-label prediction performance. GRAPE comprises approximately 1.7 million well-documented lines of Python and Rust code and provides 69 node-embedding methods, 25 inference models, a collection of efficient graph-processing utilities, and over 80,000 graphs from the literature and other sources. Standardized interfaces allow a seamless integration of third-party libraries, while ready-to-use and modular pipelines permit an easy-to-use evaluation of graph-representation-learning methods, therefore also positioning GRAPE as a software resource that performs a fair comparison between methods and libraries for graph processing and embedding.
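As an illustration of what a random-walk-based method samples, the following is a minimal plain-Python sketch of uniform first-order random walks over a compressed adjacency array. It is not GRAPE's implementation or API: the function names (`build_csr`, `random_walk`, `sample_walks`) and the toy graph are hypothetical, the data layout is a generic compressed-sparse-row array chosen for illustration, and the sketch is single-threaded.

```python
import random


def build_csr(num_nodes, edges):
    """Build a compressed-sparse-row adjacency for an undirected graph.

    Hypothetical helper: returns (offsets, neighbours), where the neighbours
    of node n are neighbours[offsets[n]:offsets[n + 1]].
    """
    degree = [0] * num_nodes
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    offsets = [0] * (num_nodes + 1)
    for node in range(num_nodes):
        offsets[node + 1] = offsets[node] + degree[node]
    neighbours = [0] * offsets[-1]
    cursor = offsets[:-1].copy()  # next free slot for each node
    for u, v in edges:
        neighbours[cursor[u]] = v
        cursor[u] += 1
        neighbours[cursor[v]] = u
        cursor[v] += 1
    return offsets, neighbours


def random_walk(offsets, neighbours, start, length, rng):
    """Sample one uniform random walk of at most `length` nodes."""
    walk = [start]
    current = start
    for _ in range(length - 1):
        lo, hi = offsets[current], offsets[current + 1]
        if lo == hi:  # node without neighbours: the walk stops early
            break
        current = neighbours[rng.randrange(lo, hi)]
        walk.append(current)
    return walk


def sample_walks(offsets, neighbours, walks_per_node, length, seed=0):
    """Sample `walks_per_node` walks starting from every node."""
    rng = random.Random(seed)
    num_nodes = len(offsets) - 1
    return [
        random_walk(offsets, neighbours, node, length, rng)
        for node in range(num_nodes)
        for _ in range(walks_per_node)
    ]


if __name__ == "__main__":
    # Toy graph (an assumption for illustration): a triangle plus an isolated node.
    offsets, neighbours = build_csr(4, [(0, 1), (1, 2), (2, 0)])
    for walk in sample_walks(offsets, neighbours, walks_per_node=1, length=5):
        print(walk)
```

In embedding methods of this family, walks such as these are typically passed to a skip-gram-style model as if they were sentences. Because each walk depends only on the read-only adjacency arrays, walks can be sampled independently of one another, which is what makes this step amenable to parallel execution.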
Comments
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.