TY - JOUR
T1 - A Bayesian data fusion based approach for learning genome-wide transcriptional regulatory networks
AU - Sauta, Elisabetta
AU - Demartini, Andrea
AU - Vitali, Francesca
AU - Riva, Alberto
AU - Bellazzi, Riccardo
N1 - Funding Information:
The work is part of the project “Rete Ematologica Lombarda (REL) biotechnology cluster for the implementation of genomic analysis and the development of innovative treatments in hematological malignancies”, jointly funded by the Fondazione Cariplo and the Regione Lombardia (project ID 2013–0387). The funding bodies did not play any roles in the design of the study and collection, analysis, and interpretation of data and in writing the manuscript.
Publisher Copyright:
© 2020 The Author(s).
PY - 2020/5/29
Y1 - 2020/5/29
N2 - Background: Reverse engineering of transcriptional regulatory networks (TRN) from genomics data has always represented a computational challenge in System Biology. The major issue is modeling the complex crosstalk among transcription factors (TFs) and their target genes, with a method able to handle both the high number of interacting variables and the noise in the available heterogeneous experimental sources of information. Results: In this work, we propose a data fusion approach that exploits the integration of complementary omics-data as prior knowledge within a Bayesian framework, in order to learn and model large-scale transcriptional networks. We develop a hybrid structure-learning algorithm able to jointly combine TFs ChIP-Sequencing data and gene expression compendia to reconstruct TRNs in a genome-wide perspective. Applying our method to high-throughput data, we verified its ability to deal with the complexity of a genomic TRN, providing a snapshot of the synergistic TFs regulatory activity. Given the noisy nature of data-driven prior knowledge, which potentially contains incorrect information, we also tested the method's robustness to false priors on a benchmark dataset, comparing the proposed approach to other regulatory network reconstruction algorithms. We demonstrated the effectiveness of our framework by evaluating structural commonalities of our learned genomic network with other existing networks inferred by different DNA binding information-based methods. Conclusions: This Bayesian omics-data fusion based methodology allows to gain a genome-wide picture of the transcriptional interplay, helping to unravel key hierarchical transcriptional interactions, which could be subsequently investigated, and it represents a promising learning approach suitable for multi-layered genomic data integration, given its robustness to noisy sources and its tailored framework for handling high dimensional data.
AB - Background: Reverse engineering of transcriptional regulatory networks (TRN) from genomics data has always represented a computational challenge in System Biology. The major issue is modeling the complex crosstalk among transcription factors (TFs) and their target genes, with a method able to handle both the high number of interacting variables and the noise in the available heterogeneous experimental sources of information. Results: In this work, we propose a data fusion approach that exploits the integration of complementary omics-data as prior knowledge within a Bayesian framework, in order to learn and model large-scale transcriptional networks. We develop a hybrid structure-learning algorithm able to jointly combine TFs ChIP-Sequencing data and gene expression compendia to reconstruct TRNs in a genome-wide perspective. Applying our method to high-throughput data, we verified its ability to deal with the complexity of a genomic TRN, providing a snapshot of the synergistic TFs regulatory activity. Given the noisy nature of data-driven prior knowledge, which potentially contains incorrect information, we also tested the method's robustness to false priors on a benchmark dataset, comparing the proposed approach to other regulatory network reconstruction algorithms. We demonstrated the effectiveness of our framework by evaluating structural commonalities of our learned genomic network with other existing networks inferred by different DNA binding information-based methods. Conclusions: This Bayesian omics-data fusion based methodology allows to gain a genome-wide picture of the transcriptional interplay, helping to unravel key hierarchical transcriptional interactions, which could be subsequently investigated, and it represents a promising learning approach suitable for multi-layered genomic data integration, given its robustness to noisy sources and its tailored framework for handling high dimensional data.
KW - Bayesian networks
KW - Genomic transcriptional networks
KW - Hybrid structure learning algorithm
KW - omics-data fusion
UR - http://www.scopus.com/inward/record.url?scp=85085639193&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85085639193&partnerID=8YFLogxK
U2 - 10.1186/s12859-020-3510-1
DO - 10.1186/s12859-020-3510-1
M3 - Article
C2 - 32471360
AN - SCOPUS:85085639193
VL - 21
JO - BMC Bioinformatics
JF - BMC Bioinformatics
SN - 1471-2105
IS - 1
M1 - 0203510
ER -