Abstract
Differential analysis of bulk RNA-seq data often suffers from a lack of good controls. Here, we present a generative model, trained solely on healthy tissues, that replaces controls. The unsupervised model learns a low-dimensional representation and can identify the closest normal representation for a given disease sample. This enables control-free, single-sample differential expression analysis. In breast cancer, we demonstrate how our approach selects marker genes and outperforms a state-of-the-art method. Furthermore, significant genes identified by the model are enriched in driver genes across cancers. Our results show that the in silico closest normal provides a more favorable comparison than control samples.
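
The "closest normal" idea in the abstract can be illustrated with a minimal sketch. This is not the model from the paper: a linear PCA subspace fitted to synthetic "normal" data stands in for the deep generative model, and all function names, dimensions, and data below are hypothetical. The sketch only shows the general pattern of fitting a low-dimensional model on normal samples, finding the nearest point in that model for a single disease sample, and reading off per-gene deviations.

```python
import numpy as np

def fit_normal_model(normal_expr, n_components):
    """Fit a linear low-dimensional model (PCA) to normal samples (samples x genes).
    Assumption: PCA is a stand-in here, not the generative model of the paper."""
    mean = normal_expr.mean(axis=0)
    # SVD of the centred data; rows of vt are principal axes in gene space.
    _, _, vt = np.linalg.svd(normal_expr - mean, full_matrices=False)
    return mean, vt[:n_components]

def closest_normal(sample, mean, components):
    """Return the point of the normal subspace nearest to the sample (least squares)."""
    z = components @ (sample - mean)   # low-dimensional representation
    return mean + components.T @ z     # decoded closest normal profile

def per_gene_shift(sample, mean, components):
    """Per-gene difference between the sample and its closest normal."""
    return sample - closest_normal(sample, mean, components)

# Synthetic, illustrative data: 200 normal samples, 50 genes, 3 latent factors.
rng = np.random.default_rng(0)
loadings = rng.normal(size=(3, 50))
normal_expr = rng.normal(size=(200, 3)) @ loadings + 0.05 * rng.normal(size=(200, 50))
mean, components = fit_normal_model(normal_expr, n_components=3)

# A single "disease" sample: a normal-like profile with one perturbed gene.
disease = rng.normal(size=3) @ loadings
disease[7] += 5.0

shift = per_gene_shift(disease, mean, components)
top_gene = int(np.argmax(np.abs(shift)))  # index of the most deviating gene
print("gene with largest deviation from closest normal:", top_gene)
```

In this toy setting no control sample is needed for the disease profile; the comparison is made against its own projection onto the model of normal tissue. A real analysis would use count data and a statistical test on the deviations rather than a raw difference.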
Original language | English |
---|---|
Article number | 263 |
Journal | Genome Biology |
Volume | 24 |
Issue number | 1 |
Number of pages | 17 |
ISSN | 1474-7596 |
DOIs | |
Publication status | Published - 2023 |
Bibliographical note
Publisher Copyright: © 2023, The Author(s).
Keywords
- Deep generative models
- Deep learning
- DEG
- DESeq2
- Differential expression analysis
- Transcriptomics