For a given scientific question, there often exist different techniques or experiments that measure the same numerical quantity. Historically, various methods have been developed to exploit the information within each type of data independently. However, statistical data fusion methods that can effectively integrate multisource data under a unified framework are lacking. In this paper we propose a novel data fusion method, called B-scaling, for integrating multisource data. Consider K measurements that are generated from different sources but measure the same latent variable through linear or nonlinear mappings. We seek a representation of the latent variable, named the B-mean, which captures the common information contained in the K measurements while taking into account the nonlinear mappings between them and the latent variable. We also establish the asymptotic property of the B-mean and apply the proposed method to integrate multiple histone modifications and DNA methylation levels to characterize the epigenomic landscape. Both numerical and empirical studies show that B-scaling is a powerful data fusion method with broad applications.
- Data fusion
- Generalized eigenvalue problem
- Multisource data
ASJC Scopus subject areas
- Statistics and Probability
- Modeling and Simulation
- Statistics, Probability and Uncertainty