stereo.core.StPipeline.pca

StPipeline.pca(use_highly_genes=False, n_pcs=None, svd_solver='auto', hvg_res_key='highly_variable_genes', res_key='pca')

Principal component analysis.

Parameters:
  • use_highly_genes (bool) – whether to use only the expression of highly variable genes.

  • n_pcs (Optional[int]) – the number of principal components to compute.

  • svd_solver (Literal['auto', 'full', 'arpack', 'randomized']) –

    Defaults to 'auto'.

    • If 'auto' :

      The solver is selected by a default policy based on X.shape and n_pcs: if the input data is larger than 500x500 and the number of components to extract is lower than 80% of the smallest dimension of the data, the more efficient 'randomized' method is used. Otherwise, the exact full SVD is computed and optionally truncated afterwards.

    • If 'full' :

      Run exact full SVD by calling the standard LAPACK solver via scipy.linalg.svd, then select the components in postprocessing.

    • If 'arpack' :

      Run SVD truncated to n_pcs by calling the ARPACK solver via scipy.sparse.linalg.svds. Requires strictly 0 < n_pcs < min(X.shape).

    • If 'randomized' :

      Run randomized SVD.

  • hvg_res_key (Optional[str]) – the key under which the highly variable genes result is stored. Only used when use_highly_genes=True.

  • res_key (str) – the key under which the PCA result is stored.

Returns:

The result of the principal component analysis is stored in self.result under the key given by res_key ('pca' by default).
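
The solver selection described for svd_solver can be illustrated with a small stand-alone sketch. It is not the library's implementation: the helper name resolve_svd_solver is hypothetical, "larger than 500x500" is read here as "the largest dimension exceeds 500", and n_pcs=None is assumed to mean that all components are requested. Treat both readings as assumptions.

```python
from typing import Optional, Tuple


def resolve_svd_solver(
    shape: Tuple[int, int],
    n_pcs: Optional[int],
    svd_solver: str = "auto",
) -> str:
    """Hypothetical helper: return the concrete solver implied by the documented policy."""
    smallest = min(shape)
    # Assumption: n_pcs=None means "compute every component".
    k = smallest if n_pcs is None else n_pcs

    if svd_solver == "arpack":
        # Documented constraint: strictly 0 < n_pcs < min(X.shape).
        if not 0 < k < smallest:
            raise ValueError(
                f"'arpack' requires 0 < n_pcs < min(X.shape); got {k} for shape {shape}"
            )
        return "arpack"

    if svd_solver in ("full", "randomized"):
        return svd_solver

    if svd_solver != "auto":
        raise ValueError(f"unknown svd_solver: {svd_solver!r}")

    # 'auto': large input and fewer components than 80% of the smallest
    # dimension selects the randomized method; anything else uses full SVD.
    # Assumption: "larger than 500x500" is interpreted as max(shape) > 500.
    if max(shape) > 500 and 1 <= k < 0.8 * smallest:
        return "randomized"
    return "full"


if __name__ == "__main__":
    print(resolve_svd_solver((1000, 2000), 50))   # large matrix, few components
    print(resolve_svd_solver((300, 400), 50))     # small matrix
    print(resolve_svd_solver((1000, 2000), 900))  # too many components for randomized
```

Under these assumptions, a 1000x2000 matrix with n_pcs=50 resolves to 'randomized', while a 300x400 matrix, or a request for 900 components, falls back to 'full'.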