Abstract
In the analysis of a gene expression data matrix, for example, a biclustering is defined as a set of biclusters, where each bicluster consists of a group of genes and a group of samples for which those genes are differentially expressed. Although many data mining approaches for biclustering exist in the literature, only a few can incorporate prior knowledge into the analysis, even though doing so can greatly improve accuracy and interpretability, and all are limited in their ability to handle discrete data types. We propose a generalized biclustering approach that can be used for integrative analysis of multi-omics data with different data types. Our method can utilize biological information that can be represented by a graph, such as functional genomics and functional proteomics, and can accommodate a combination of continuous and discrete data types. The proposed method builds on a generalized Bayesian factor analysis framework. A variational EM approach is used to obtain parameter estimates, in which the latent quantities in the log-likelihood are iteratively imputed by their conditional expectations. The biclusters are retrieved via the sparse estimates of the factor loadings and the conditional expectation of the latent factors. To obtain a sparse conditional expectation of the latent factors, a novel sparse variational EM algorithm is used. We demonstrate the superiority of our method over several existing biclustering methods in extensive simulation experiments and in an integrative analysis of multi-omics data.
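The retrieval step described in the abstract, reading biclusters off the sparse factor loadings and the sparse latent factors, can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `extract_biclusters`, the matrix orientation (loadings as genes × factors, factors as factors × samples), the tolerance `tol`, and the toy values are all assumptions made for illustration, and the sketch presumes the sparse estimates have already been computed by some fitting procedure.

```python
def extract_biclusters(loadings, factors, tol=1e-8):
    """Read biclusters off an already-fitted sparse factorization.

    loadings: list of rows, shape (n_genes, n_factors)   [assumed layout]
    factors:  list of rows, shape (n_factors, n_samples) [assumed layout]
    tol:      magnitude below which an entry counts as zero (assumption)

    Returns one (gene_indices, sample_indices) pair per factor that has
    at least one nonzero loading and one nonzero factor value.
    """
    biclusters = []
    for k in range(len(factors)):
        # Genes whose loading on factor k is nonzero.
        genes = [i for i, row in enumerate(loadings) if abs(row[k]) > tol]
        # Samples whose value on factor k is nonzero.
        samples = [j for j, z in enumerate(factors[k]) if abs(z) > tol]
        if genes and samples:
            biclusters.append((genes, samples))
    return biclusters


# Toy sparse estimates (hypothetical values): 4 genes, 2 factors, 3 samples.
W = [[1.2, 0.0],
     [0.8, 0.0],
     [0.0, -0.5],
     [0.0, 0.0]]
Z = [[0.9, 1.1, 0.0],
     [0.0, 0.0, 2.0]]

biclusters = extract_biclusters(W, Z)
for genes, samples in biclusters:
    print("genes", genes, "samples", samples)
```

In this toy case the first factor yields a bicluster of genes 0 and 1 over samples 0 and 1, and the second yields gene 2 over sample 2. Gene 3 has no nonzero loading and so belongs to no bicluster.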
| Original language | English |
|---|---|
| Title of host publication | Proceedings - 10th IEEE International Conference on Big Knowledge, ICBK 2019 |
| Editors | Yunjun Gao, Ralf Moller, Xindong Wu, Ramamohanarao Kotagiri |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 25-32 |
| Number of pages | 8 |
| ISBN (Electronic) | 9781728146065 |
| State | Published - Nov 2019 |
| Event | 10th IEEE International Conference on Big Knowledge, ICBK 2019, Co-located with the 19th IEEE International Conference on Data Mining, ICDM 2019 - Beijing, China. Duration: 10 Nov 2019 → 11 Nov 2019 |
Publication series
| Name | Proceedings - 10th IEEE International Conference on Big Knowledge, ICBK 2019 |
|---|---|
Conference
| Conference | 10th IEEE International Conference on Big Knowledge, ICBK 2019, Co-located with the 19th IEEE International Conference on Data Mining, ICDM 2019 |
|---|---|
| Country/Territory | China |
| City | Beijing |
| Period | 10/11/19 → 11/11/19 |
Bibliographical note
Publisher Copyright: © 2019 IEEE.
Keywords
- Bayesian latent factor model
- Biclustering
- Integrative multi-omics analysis
- Variational EM algorithm
Title
Knowledge-guided biclustering via sparse variational EM algorithm