stereo.algorithm.cell_cell_communication.CellCellCommunication.main

CellCellCommunication.main(analysis_type='statistical', cluster_res_key='cluster', micro_envs=None, species='HUMAN', database='cellphonedb', homogene_path=None, counts_identifiers='hgnc_symbol', subsampling=False, subsampling_log=False, subsampling_num_pc=100, subsampling_num_cells=None, pca_res_key=None, separator_cluster='|', separator_interaction='_', iterations=500, threshold=0.1, processes=1, pvalue=0.05, result_precision=3, output_path=None, means_filename='means', pvalues_filename='pvalues', significant_means_filename='significant_means', deconvoluted_filename='deconvoluted', output_format='csv', res_key='cell_cell_communication')

Main function for cell-cell communication analysis.

Parameters:

analysis_type (str) – type of analysis: “simple”, “statistical”.
cluster_res_key (str) – the key which specifies the clustering result in data.tl.result.
micro_envs (Union[DataFrame, str, None]) – a DataFrame or a string: if a DataFrame, it must have two columns named “cell_type” and “microenvironment”; if a string, it is the key which specifies the gen_ccc_micro_envs result in data.tl.result.
species (str) – ‘HUMAN’ or ‘MOUSE’
database (str) – if species is HUMAN, choose from ‘cellphonedb’ or ‘liana’; if MOUSE, use ‘cellphonedb’ or ‘liana’ or ‘celltalkdb’; you can also specify the path of a database.
homogene_path (Optional[str]) – path to the file storing mouse-human homologous gene relations. If species is MOUSE but database is ‘cellphonedb’ or ‘liana’, the input mouse genes are mapped to their human homologous genes.
counts_identifiers (str) – type of gene identifiers in the Counts data: “ensembl”, “gene_name” or “hgnc_symbol”.
subsampling (bool) – whether to perform subsampling.
subsampling_log (bool) – whether to apply a log1p transformation before subsampling.
subsampling_num_pc (int) – number of principal components used for subsampling; must be <= min(m,n).
subsampling_num_cells (Optional[int]) – size of the subsample.
pca_res_key (Optional[str]) – the key which specifies the PCA result in data.tl.result. If subsampling is True and this is None, the function runs PCA itself.
separator_cluster (str) – separator of cluster names used in the result and plots, e.g. ‘|’.
separator_interaction (str) – separator of interactions used in the result and plots, e.g. ‘_’.
iterations (int) – number of iterations for the ‘statistical’ analysis type.
threshold (float) – threshold on the percentage of gene expression, above which a gene is considered significant.
processes (int) – number of processes used for the statistical analysis; in a notebook, only one process is supported.
pvalue (float) – the p-value cutoff, below which a result is considered significant.
result_precision (int) – precision of the results, default=3.
output_path (Optional[str]) – path of the directory in which to save the result files; set it to write the results to files.
means_filename (str) – name of the means result file.
pvalues_filename (str) – name of the pvalues result file.
significant_means_filename (str) – name of the significant means result file.
deconvoluted_filename (str) – name of the deconvoluted result file.
output_format (str) – format of result, ‘txt’, ‘csv’, ‘tsv’, ‘tab’.
res_key (str) – set a key to store the result to data.tl.result.
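
The parameters above can be collected into a keyword dictionary before calling the function. The following is a minimal sketch, not part of the library: build_ccc_params and run_ccc are hypothetical helpers, the access path data.tl.cell_cell_communication is an assumption about how the method is exposed on a data object, and the homogene_path check only reflects the wording of the parameter description above.

```python
# Values copied from the documented signature; unlisted parameters keep their defaults.
CCC_PARAMS = {
    "analysis_type": "statistical",
    "cluster_res_key": "cluster",
    "species": "HUMAN",
    "database": "cellphonedb",
    "counts_identifiers": "hgnc_symbol",
    "iterations": 500,
    "threshold": 0.1,
    "pvalue": 0.05,
    "processes": 1,  # the only value supported in a notebook
    "res_key": "cell_cell_communication",
}

# Choices listed in the parameter descriptions above.
ALLOWED = {
    "analysis_type": {"simple", "statistical"},
    "species": {"HUMAN", "MOUSE"},
    "counts_identifiers": {"ensembl", "gene_name", "hgnc_symbol"},
    "output_format": {"txt", "csv", "tsv", "tab"},
}


def build_ccc_params(**overrides):
    """Hypothetical helper: merge overrides and check the documented choices."""
    params = {**CCC_PARAMS, **overrides}
    for name, allowed in ALLOWED.items():
        if name in params and params[name] not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed)}")
    # Assumption drawn from the homogene_path description: mouse data with a
    # human database needs the mouse-human homolog file.
    needs_homologs = (
        params["species"] == "MOUSE"
        and params["database"] in {"cellphonedb", "liana"}
    )
    if needs_homologs and not params.get("homogene_path"):
        raise ValueError("homogene_path is needed for MOUSE with this database")
    return params


def run_ccc(data, **overrides):
    """Assumes the method is reachable as data.tl.cell_cell_communication."""
    params = build_ccc_params(**overrides)
    data.tl.cell_cell_communication(**params)
    # The result is stored in data.tl.result under res_key.
    return data.tl.result[params["res_key"]]
```

Under these assumptions, run_ccc(data, cluster_res_key='leiden', iterations=100) would run the statistical analysis on a clustering stored under a hypothetical 'leiden' key and return the entry saved under res_key.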

Returns: