Document Type
Article
Publication Date
5-1-2020
Keywords
JGM
JAX Source
Bioinformatics 2020 May 1; 36(10):3234-3235
Volume
36
Issue
10
First Page
3234
Last Page
3235
ISSN
1367-4811
PMID
32044918
DOI
https://doi.org/10.1093/bioinformatics/btaa061
Grant
Jackson Laboratory Director's Innovation Fund, HG009409, DK107967
Abstract
MOTIVATION: Modern genomic research is driven by next-generation sequencing experiments such as ChIP-seq and ChIA-PET that generate coverage files for transcription factor binding, as well as DHS and ATAC-seq that yield coverage files for chromatin accessibility. Such files are in a bedGraph text format or a bigWig binary format. Obtaining summary statistics in a given region is a fundamental task in analyzing protein binding intensity or chromatin accessibility. However, the existing Python package for operating on coverage files is not optimized for speed.
RESULTS: We developed pyBedGraph, a Python package to quickly obtain summary statistics for a given interval in a bedGraph or a bigWig file. When tested on 12 ChIP-seq, ATAC-seq, RNA-seq and ChIA-PET datasets, pyBedGraph is on average 260 times faster than the existing program pyBigWig. On average, pyBedGraph can look up the exact mean signal of 1 million regions in ∼0.26 s and can compute their approximate means in
AVAILABILITY AND IMPLEMENTATION: pyBedGraph is publicly available at https://github.com/TheJacksonLaboratory/pyBedGraph under the MIT license.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
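ILLUSTRATION: The core task named in the abstract, looking up the mean signal over a genomic interval, can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions and does not reproduce pyBedGraph's API or its implementation: the function names (load_bedgraph_lines, build_prefix_sums, interval_mean) and the toy data are hypothetical, intervals are assumed to be 0-based and half-open as in the bedGraph convention, and bases not covered by any bedGraph line are assumed to count as 0. The prefix-sum technique used here is one common way to make repeated exact-mean queries cheap; it is a swapped-in illustration, not a claim about how the package works.

```python
from itertools import accumulate


def load_bedgraph_lines(lines, chrom, chrom_size):
    """Expand bedGraph records for one chromosome into a per-base signal list.

    Hypothetical helper: bases with no record keep the value 0.0 (an assumption).
    """
    signal = [0.0] * chrom_size
    for line in lines:
        fields = line.split()
        if len(fields) != 4 or fields[0] != chrom:
            continue  # skip track headers, blank lines and other chromosomes
        start, end, value = int(fields[1]), int(fields[2]), float(fields[3])
        for pos in range(start, min(end, chrom_size)):
            signal[pos] = value
    return signal


def build_prefix_sums(signal):
    """Return prefix sums where prefix[i] is the sum of signal[0:i]."""
    return [0.0] + list(accumulate(signal))


def interval_mean(prefix, start, end):
    """Exact mean of the signal over the half-open interval [start, end)."""
    if not 0 <= start < end <= len(prefix) - 1:
        raise ValueError("interval must satisfy 0 <= start < end <= chrom_size")
    return (prefix[end] - prefix[start]) / (end - start)


# Toy bedGraph content (chrom, start, end, value); not real data.
toy_lines = [
    "chr1\t0\t4\t2.0",
    "chr1\t4\t6\t5.0",
    "chr1\t8\t10\t1.0",
]

signal = load_bedgraph_lines(toy_lines, chrom="chr1", chrom_size=10)
prefix = build_prefix_sums(signal)

# Each query costs two lookups and one division, regardless of interval length.
means = [interval_mean(prefix, s, e) for s, e in [(0, 4), (2, 6), (6, 10)]]
print(means)
```

Once the prefix sums are built, each interval query takes constant time, which is why this layout suits workloads that query many regions against one coverage track. The per-base list is only practical for small examples; a real chromosome would call for a compact numeric array.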
Recommended Citation
Zhang H, Kim M, Chuang J, Ruan Y. pyBedGraph: a python package for fast operations on 1D genomic signal tracks. Bioinformatics 2020 May 1; 36(10):3234-3235
Comments
This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License.