Enhancing Hi-C data resolution with deep
convolutional neural network HiCPlus
Yan Zhang, Lin An, Jie Xu, Bo Zhang, W. Jim Zheng, Ming Hu, Jijun Tang & Feng Yue
Although Hi-C technology is one of the most popular tools for studying 3D genome
organization, sequencing cost keeps the resolution of most Hi-C datasets too coarse
to link distal regulatory elements to their target genes. Here we develop
HiCPlus, a computational approach based on a deep convolutional neural network, to infer
high-resolution Hi-C interaction matrices from low-resolution Hi-C data. We demonstrate
that HiCPlus can impute interaction matrices highly similar to the original ones while using
only 1/16 of the original sequencing reads. We show that the models learned from one cell
type can be applied to make predictions in other cell or tissue types. Our work not
only provides a computational framework to enhance Hi-C data resolution but also reveals
features underlying the formation of 3D chromatin interactions.
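
To make the "1/16 of the original sequencing reads" setting concrete, the following is a minimal sketch of how a low-coverage contact matrix could be simulated from a high-coverage one. It assumes reads are kept independently with probability 1/16 (binomial thinning of the binned counts); the function name `downsample_contacts`, the toy matrix, and the seeds are illustrative assumptions and are not taken from the HiCPlus software.

```python
import numpy as np

def downsample_contacts(matrix, fraction=1 / 16, seed=0):
    """Simulate lower sequencing depth on a symmetric contact-count matrix.

    Assumption: each read is retained independently with probability
    `fraction`, so every bin count is thinned binomially.
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(matrix, dtype=np.int64)
    # Thin only the upper triangle (including the diagonal) so that the
    # mirrored result remains exactly symmetric.
    upper = np.triu(counts)
    thinned = rng.binomial(upper, fraction)
    return thinned + np.triu(thinned, k=1).T

# Toy symmetric "high-coverage" matrix with 8 bins (hypothetical data).
toy_rng = np.random.default_rng(1)
toy = toy_rng.poisson(200, size=(8, 8))
toy = np.triu(toy) + np.triu(toy, k=1).T

low = downsample_contacts(toy)
print("reads before:", int(np.triu(toy).sum()))
print("reads after: ", int(np.triu(low).sum()))
```

A matrix thinned this way would serve as the low-resolution input, with the original matrix as the target that an enhancement model tries to recover.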
Department of Computer Science and Engineering, University of South Carolina, Columbia, SC 29208, USA.
Bioinformatics and Genomics Program, Huck
Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA 16802, USA.
Department of Biochemistry and Molecular Biology,
College of Medicine, The Pennsylvania State University, Hershey, PA 17033, USA.
School of Biomedical Informatics, University of Texas Health Science
Center at Houston, Houston, TX 77030, USA.
Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic Foundation,
Cleveland, OH 44195, USA.
School of Computer Science and Technology, Tianjin University, 300072 Tianjin, China.
Tianjin University Institute of Computational Biology, Tianjin University, 300072 Tianjin, China.
Yan Zhang and Lin An contributed equally to this work.
Correspondence and requests for materials should be addressed to J.T. (email: JTang@cse.sc.edu) or to F.Y. (email: firstname.lastname@example.org).