stereo.core.StPipeline.highly_variable_genes
- StPipeline.highly_variable_genes(groups=None, method='seurat', n_top_genes=2000, min_disp=0.5, max_disp=inf, min_mean=0.0125, max_mean=3, span=0.3, n_bins=20, res_key='highly_variable_genes')
Annotate highly variable genes, following the Scanpy implementation. The algorithm used depends on the method parameter: Seurat [Satija15], Cell Ranger [Zheng17], or Seurat v3 [Stuart19].
- Parameters:
groups (Optional[str]) – if specified, highly variable genes are selected within each batch separately and then merged. This avoids selecting batch-specific genes and acts as a lightweight batch correction method. For all methods, genes are first sorted by the number of batches in which they are highly variable. For the dispersion-based methods, ties are broken by normalized dispersion. If method is 'seurat_v3', ties are broken by the median (across batches) rank based on within-batch normalized variance.
method (Literal['seurat', 'cell_ranger', 'seurat_v3']) – the method used to identify highly variable genes. For the dispersion-based methods in their default workflows, Seurat passes the cutoffs whereas Cell Ranger passes n_top_genes.
n_top_genes (Optional[int]) – number of highly variable genes to keep. Mandatory if method='seurat_v3'.
min_disp (Optional[float]) – if n_top_genes is not None, this and all other cutoffs for the means and the normalized dispersions are ignored. Ignored if method='seurat_v3'.
max_disp (Optional[float]) – if n_top_genes is not None, this and all other cutoffs for the means and the normalized dispersions are ignored. Ignored if method='seurat_v3'.
min_mean (Optional[float]) – if n_top_genes is not None, this and all other cutoffs for the means and the normalized dispersions are ignored. Ignored if method='seurat_v3'.
max_mean (Optional[float]) – if n_top_genes is not None, this and all other cutoffs for the means and the normalized dispersions are ignored. Ignored if method='seurat_v3'.
span (Optional[float]) – the fraction of the data (cells) used when estimating the variance in the Loess model fit if method='seurat_v3'.
n_bins (int) – number of bins for binning the mean gene expression. Normalization is done with respect to each bin. If only a single gene falls into a bin, its normalized dispersion is artificially set to 1.
res_key – the key for getting the result from self.result.
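The interaction between n_bins, the four cutoffs, and n_top_genes can be illustrated with a minimal standard-library sketch. It is not Stereopy's or Scanpy's code: the helper names normalized_dispersions and select_hvg are hypothetical, the bins are assumed to be equal-width over the mean expression, and the cutoffs are assumed to be strict inequalities.

```python
import math
import statistics


def normalized_dispersions(means, dispersions, n_bins=20):
    """Z-score each gene's dispersion within its mean-expression bin (simplified)."""
    lo, hi = min(means), max(means)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all means are equal
    # Assumption: equal-width bins over the range of mean expression.
    bins = [min(int((m - lo) / width), n_bins - 1) for m in means]
    by_bin = {}
    for b, d in zip(bins, dispersions):
        by_bin.setdefault(b, []).append(d)
    normalized = []
    for b, d in zip(bins, dispersions):
        members = by_bin[b]
        if len(members) == 1:
            # A bin holding a single gene: normalized dispersion is set to 1.
            normalized.append(1.0)
            continue
        sd = statistics.stdev(members)
        normalized.append((d - statistics.mean(members)) / sd if sd > 0 else 0.0)
    return normalized


def select_hvg(means, norm_disp, n_top_genes=None, min_mean=0.0125,
               max_mean=3, min_disp=0.5, max_disp=math.inf):
    """Return one boolean flag per gene (hypothetical helper)."""
    if n_top_genes is not None:
        # Rank by normalized dispersion; every cutoff is ignored in this branch.
        order = sorted(range(len(norm_disp)), key=lambda i: norm_disp[i], reverse=True)
        chosen = set(order[:n_top_genes])
        return [i in chosen for i in range(len(norm_disp))]
    # Otherwise apply the mean and dispersion cutoffs.
    return [min_mean < m < max_mean and min_disp < d < max_disp
            for m, d in zip(means, norm_disp)]
```

Under these assumptions, passing n_top_genes=None is what makes min_mean, max_mean, min_disp and max_disp take effect; any integer value switches to pure ranking.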
- Return type:
An object of StereoExpData with the result of highly variable genes.
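The batch-wise merge described for the groups parameter can be sketched as follows for the dispersion-based methods. This is an illustrative reading of the description above, not the library's implementation: merge_batches and its input layout are hypothetical, and averaging the normalized dispersion across batches for the tie-break is an assumption.

```python
from statistics import mean


def merge_batches(per_batch, n_top_genes):
    """Rank genes across batches (hypothetical helper).

    per_batch: list of dicts mapping gene -> (is_hvg, normalized_dispersion).
    """
    genes = sorted(set().union(*per_batch))

    def rank_key(gene):
        entries = [batch[gene] for batch in per_batch if gene in batch]
        # Primary key: number of batches in which the gene is highly variable.
        n_batches = sum(is_hvg for is_hvg, _ in entries)
        # Tie-break (assumption): normalized dispersion averaged across batches.
        avg_disp = mean([disp for _, disp in entries])
        return (n_batches, avg_disp)

    ranked = sorted(genes, key=rank_key, reverse=True)
    return ranked[:n_top_genes]


batch_a = {"g1": (True, 2.0), "g2": (True, 1.0), "g3": (False, 0.1)}
batch_b = {"g1": (False, 0.2), "g2": (True, 1.5), "g3": (True, 3.0)}
top = merge_batches([batch_a, batch_b], n_top_genes=2)
```

Here g2 ranks first because it is highly variable in both batches, while g3 and g1 tie on batch count and are ordered by the averaged dispersion.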