scvi-tools

scvi-tools - Deep Generative Modeling Framework for Single-Cell Omics

Overview of Capabilities

scvi-tools is a Python framework built on PyTorch and PyTorch Lightning, designed specifically for single-cell genomics. It provides deep generative models and variational inference methods for analyzing multiple single-cell data modalities.

Use Cases

1. Batch Correction and Integration of Single-Cell Sequencing Data

When you need to integrate single-cell data from different batches, labs, or platforms, the probabilistic batch-correction methods provided by scvi-tools can effectively remove technical noise while preserving biological variation. This is suitable for combining and jointly analyzing datasets from multiple experiments.

2. Joint Analysis of Multimodal Single-Cell Data

When you have multimodal data such as CITE-seq or multiome, scvi-tools provides dedicated models: totalVI jointly models RNA and surface proteins from CITE-seq, and MultiVI integrates paired and unpaired multiome (RNA and chromatin accessibility) datasets.

3. Advanced Statistical Inference with Uncertainty

If you need differential expression analysis, cell-type annotation, or RNA velocity analysis with probabilistic measures of uncertainty, the Bayesian inference-based methods in scvi-tools report posterior uncertainty alongside effect estimates rather than point estimates alone.

Core Features

1. Deep Generative Models with a Unified API

scvi-tools offers over 20 models covering single-cell RNA-seq (scVI, scANVI), ATAC-seq (PeakVI, PoissonVI), multimodal integration (totalVI, MultiVI), and spatial transcriptomics (DestVI, Stereoscope). All models follow a consistent API pattern: register data → train model → extract results. They operate on AnnData objects, so they fit into scanpy workflows.
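
The register → train → extract pattern can be sketched as a small helper that works with any model class exposing that interface. This is an illustrative sketch, not part of scvi-tools: run_model and its arguments are hypothetical names, and it assumes a model whose constructor needs only the AnnData object (some models may require extra constructor arguments).

```python
from typing import Any, Optional


def run_model(model_cls: Any, adata: Any,
              setup_kwargs: Optional[dict] = None,
              train_kwargs: Optional[dict] = None) -> Any:
    """Apply the register -> train pattern to one model class (hypothetical helper)."""
    # 1. Register: tell the model class which fields of `adata` to use.
    model_cls.setup_anndata(adata, **(setup_kwargs or {}))
    # 2. Train: build the model from the registered data and fit it.
    model = model_cls(adata)
    model.train(**(train_kwargs or {}))
    # 3. Extract: the caller queries the trained model for results.
    return model


# Hypothetical usage, assuming scvi-tools is installed and `adata` holds raw
# counts with a "batch" column in adata.obs:
#   import scvi
#   model = run_model(scvi.model.SCVI, adata, {"batch_key": "batch"},
#                     {"max_epochs": 50})
#   latent = model.get_latent_representation()
```

Passing the model class as an argument is only a way to make the shared shape of the API visible; in practice most scripts call the three steps directly.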

2. Probabilistic Batch Correction and Data Integration

Using a variational autoencoder (VAE) architecture, scvi-tools learns a low-dimensional latent representation of each cell. Because the model is conditioned on known technical factors (e.g., batch or donor), the latent space is encouraged to capture biological rather than technical variation. Categorical and continuous covariates are registered during the setup_anndata stage, allowing flexible modeling of known technical factors.
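
Registering covariates might look like the sketch below. The function name register_covariates, the layer name "counts", and the column names "batch", "donor", and "percent_mito" are assumptions made for illustration; substitute whatever your AnnData object actually contains.

```python
def register_covariates(adata, batch_key="batch",
                        categorical=("donor",),
                        continuous=("percent_mito",),
                        layer="counts"):
    """Register batch plus extra technical covariates, then build an scVI model."""
    # Imported here so the heavy dependency is only needed when the function runs.
    import scvi

    scvi.model.SCVI.setup_anndata(
        adata,
        layer=layer,                                   # assumed: raw counts live in this layer
        batch_key=batch_key,                           # assumed adata.obs column
        categorical_covariate_keys=list(categorical),  # assumed adata.obs columns
        continuous_covariate_keys=list(continuous),    # assumed adata.obs columns
    )
    # The model picks up everything registered above; it is not trained yet.
    return scvi.model.SCVI(adata)
```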

3. Uncertainty-Aware Differential Expression Analysis

Unlike traditional frequentist DE methods, scvi-tools provides probabilistic differential expression analysis based on composite hypothesis testing. It estimates both effect sizes and their uncertainty, and supports a minimum effect-size threshold so that a gene is called differentially expressed only when the estimated change is large enough to matter.
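
The idea behind a thresholded composite hypothesis can be illustrated with plain Python. This is a conceptual sketch, not the scvi-tools implementation: prob_de is a hypothetical helper, the threshold delta=0.25 is an assumed value, and the "posterior samples" are made-up numbers drawn from a normal distribution.

```python
import math
import random


def prob_de(lfc_samples, delta=0.25, eps=1e-8):
    """Summarise posterior log-fold-change samples for one gene.

    Returns (probability that |LFC| >= delta, log Bayes factor, mean LFC).
    """
    n = len(lfc_samples)
    # Composite hypothesis: "the effect is at least delta in either direction".
    p = sum(abs(x) >= delta for x in lfc_samples) / n
    # Log odds of DE versus not DE; eps keeps the logarithm finite at p = 0 or 1.
    log_bf = math.log(p + eps) - math.log(1.0 - p + eps)
    return p, log_bf, sum(lfc_samples) / n


# Made-up posterior samples, for illustration only (not real data).
rng = random.Random(0)
strong = [rng.gauss(1.0, 0.2) for _ in range(2000)]  # clear shift in expression
weak = [rng.gauss(0.05, 0.1) for _ in range(2000)]   # mostly below the threshold

print(prob_de(strong))
print(prob_de(weak))
```

With a trained scvi-tools model, the corresponding call is typically model.differential_expression(groupby=..., mode="change", delta=...), which returns a table of per-gene probabilities and effect sizes; check the documentation of your installed version for the exact columns.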

Frequently Asked Questions

What is the difference between scvi-tools and scanpy? Which should I choose?

scanpy is a standard workflow tool for single-cell analysis, suitable for routine preprocessing, visualization, and basic analyses. scvi-tools focuses on advanced statistical modeling and deep learning methods, ideal for scenarios requiring batch correction, multimodal integration, or uncertainty-quantified analyses. Many users combine them: use scanpy for preprocessing, scvi-tools for advanced modeling, and scanpy again for downstream visualization.
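
The scanpy halves of such a combined workflow can be sketched as two helpers. The function names are hypothetical, "X_scVI" is just a conventional key assumed here for the latent representation, and the preprocessing choices (normalization target, number of genes) are placeholder values rather than recommendations.

```python
def preprocess_for_scvi(adata, n_top_genes=2000):
    """scanpy side, before modeling: keep raw counts, then select variable genes."""
    import scanpy as sc

    adata.layers["counts"] = adata.X.copy()  # keep raw counts for the generative model
    sc.pp.normalize_total(adata, target_sum=1e4)
    sc.pp.log1p(adata)                       # log-normalized X is for scanpy plots
    sc.pp.highly_variable_genes(adata, n_top_genes=n_top_genes, subset=True)
    return adata


def embed_from_latent(adata, rep_key="X_scVI"):
    """scanpy side, after modeling: neighbor graph and UMAP on the latent space."""
    import scanpy as sc

    sc.pp.neighbors(adata, use_rep=rep_key)  # build the graph on the latent space
    sc.tl.umap(adata)                        # stores coordinates in adata.obsm["X_umap"]
    return adata


# Between the two calls, a model trained on adata.layers["counts"] would write
# its latent representation to adata.obsm["X_scVI"].
```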

How do I perform batch correction with scVI?

First make sure the data are raw counts (not log-normalized). Then call scvi.model.SCVI.setup_anndata() to register the batch column, create the model, train it with train(), and call get_latent_representation() to obtain the batch-corrected latent representation. The whole process takes a few lines of code, and the resulting latent space can be used for clustering and visualization.
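
The steps above can be sketched as follows. The helper names, the "batch" column, and the "X_scVI" key are assumptions for illustration, and the raw-count check is only a heuristic (it tests for non-negative integer values), not something scvi-tools requires you to run.

```python
def looks_like_raw_counts(matrix):
    """Heuristic check: every stored value is a non-negative integer."""
    import numpy as np

    # Sparse matrices keep their stored values in `.data`; dense input is used as-is.
    values = matrix.data if hasattr(matrix, "tocsr") else np.asarray(matrix)
    return bool(np.all(values >= 0) and np.all(np.mod(values, 1) == 0))


def batch_correct(adata, batch_key="batch", max_epochs=None):
    """Train scVI and store the batch-corrected latent space in adata.obsm."""
    import scvi

    if not looks_like_raw_counts(adata.X):
        raise ValueError("adata.X does not look like raw counts")
    scvi.model.SCVI.setup_anndata(adata, batch_key=batch_key)  # register the batch column
    model = scvi.model.SCVI(adata)                             # create the model
    model.train(max_epochs=max_epochs)                         # None uses the library default
    adata.obsm["X_scVI"] = model.get_latent_representation()   # one row per cell
    return model
```

Storing the latent representation in adata.obsm keeps it next to the data, so downstream tools can build neighbor graphs and embeddings from it.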

Does scvi-tools support GPUs?

Yes. scvi-tools is built on PyTorch Lightning and can automatically detect and use available GPUs to accelerate training. For large datasets (tens to hundreds of thousands of cells), a GPU can significantly reduce training time. You can install GPU support with pip install "scvi-tools[cuda]" (the quotes keep shells such as zsh from interpreting the square brackets).
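
If you prefer to choose the device explicitly rather than rely on auto-detection, a minimal sketch looks like this. pick_accelerator is a hypothetical helper; the accelerator and devices arguments to train() are assumed to be available, which holds for recent scvi-tools releases but not for older ones that used a different argument.

```python
def pick_accelerator():
    """Return "gpu" when PyTorch can see a CUDA device, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # without PyTorch there is nothing to accelerate
    return "gpu" if torch.cuda.is_available() else "cpu"


print(pick_accelerator())

# Hypothetical usage with an already constructed scvi-tools model:
#   model.train(accelerator=pick_accelerator(), devices=1)
```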
