TY - JOUR
T1 - scVAE
T2 - variational auto-encoders for single-cell gene expression data
AU - Grønbech, Christopher Heje
AU - Vording, Maximillian Fornitz
AU - Timshel, Pascal
AU - Sønderby, Casper Kaae
AU - Pers, Tune H
AU - Winther, Ole
N1 - © The Author(s) (2020). Published by Oxford University Press. All rights reserved. For Permissions, please email: [email protected].
PY - 2020
Y1 - 2020
N2 - MOTIVATION: Models for analysing and making relevant biological inferences from massive amounts of complex single-cell transcriptomic data typically require several individual data-processing steps, each with their own set of hyperparameter choices. With deep generative models one can work directly with count data, make likelihood-based model comparison, learn a latent representation of the cells and capture more of the variability in different cell populations. RESULTS: We propose a novel method based on variational auto-encoders (VAEs) for analysis of single-cell RNA sequencing (scRNA-seq) data. It avoids data preprocessing by using raw count data as input and can robustly estimate the expected gene expression levels and a latent representation for each cell. We tested several count likelihood functions and a variant of the VAE that has a priori clustering in the latent space. We show for several scRNA-seq data sets that our method outperforms recently proposed scRNA-seq methods in clustering cells and that the resulting clusters reflect cell types. AVAILABILITY AND IMPLEMENTATION: Our method, called scVAE, is implemented in Python using the TensorFlow machine-learning library, and it is freely available at https://github.com/scvae/scvae. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
AB - MOTIVATION: Models for analysing and making relevant biological inferences from massive amounts of complex single-cell transcriptomic data typically require several individual data-processing steps, each with their own set of hyperparameter choices. With deep generative models one can work directly with count data, make likelihood-based model comparison, learn a latent representation of the cells and capture more of the variability in different cell populations. RESULTS: We propose a novel method based on variational auto-encoders (VAEs) for analysis of single-cell RNA sequencing (scRNA-seq) data. It avoids data preprocessing by using raw count data as input and can robustly estimate the expected gene expression levels and a latent representation for each cell. We tested several count likelihood functions and a variant of the VAE that has a priori clustering in the latent space. We show for several scRNA-seq data sets that our method outperforms recently proposed scRNA-seq methods in clustering cells and that the resulting clusters reflect cell types. AVAILABILITY AND IMPLEMENTATION: Our method, called scVAE, is implemented in Python using the TensorFlow machine-learning library, and it is freely available at https://github.com/scvae/scvae. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
U2 - 10.1093/bioinformatics/btaa293
DO - 10.1093/bioinformatics/btaa293
M3 - Journal article
C2 - 32415966
VL - 36
SP - 4415
EP - 4422
JO - Bioinformatics
JF - Bioinformatics
SN - 1367-4803
IS - 16
ER -