stereo.algorithm.cell_cell_communication.CellCellCommunication.main

CellCellCommunication.main(analysis_type='statistical', cluster_res_key='cluster', micro_envs=None, species='HUMAN', database='cellphonedb', homogene_path=None, counts_identifiers='hgnc_symbol', subsampling=False, subsampling_log=False, subsampling_num_pc=100, subsampling_num_cells=None, pca_res_key=None, separator_cluster='|', separator_interaction='_', iterations=500, threshold=0.1, processes=1, pvalue=0.05, result_precision=3, output_path=None, means_filename='means', pvalues_filename='pvalues', significant_means_filename='significant_means', deconvoluted_filename='deconvoluted', output_format='csv', res_key='cell_cell_communication')

Main function for cell-cell communication analysis.

Parameters:
  • analysis_type (str) – type of analysis: “simple”, “statistical”.

  • cluster_res_key (str) – the key which specifies the clustering result in data.tl.result.

  • micro_envs (Union[DataFrame, str, None]) – a DataFrame or a string: if a DataFrame, it must have two columns named “cell_type” and “microenvironment”; if a string, it is the key which specifies the gen_ccc_micro_envs result in data.tl.result.

  • species (str) – ‘HUMAN’ or ‘MOUSE’

  • database (str) – if species is HUMAN, choose from ‘cellphonedb’ or ‘liana’; if MOUSE, use ‘cellphonedb’ or ‘liana’ or ‘celltalkdb’; you can also specify the path of a database.

  • homogene_path (Optional[str]) – path to the file storing mouse-human homologous gene relations. If species is MOUSE but database is ‘cellphonedb’ or ‘liana’, the input mouse genes must be mapped to their human homologs.

  • counts_identifiers (str) – type of gene identifiers in the Counts data: “ensembl”, “gene_name” or “hgnc_symbol”.

  • subsampling (bool) – whether to subsample the cells.

  • subsampling_log (bool) – whether to apply a log1p transformation before subsampling.

  • subsampling_num_pc (int) – number of principal components (PCs) used for subsampling, <= min(m, n).

  • subsampling_num_cells (Optional[int]) – size of the subsample.

  • pca_res_key (Optional[str]) – the key which specifies the PCA result in data.tl.result; if subsampling is True and this is None, the function runs PCA itself.

  • separator_cluster (str) – separator of cluster names used in the result and plots, e.g. ‘|’.

  • separator_interaction (str) – separator of interactions used in the result and plots, e.g. ‘_’.

  • iterations (int) – number of iterations for the ‘statistical’ analysis type.

  • threshold (float) – threshold on the percentage of gene expression; values above it are considered significant.

  • processes (int) – number of processes used for the statistical analysis; in a notebook, only one process is supported.

  • pvalue (float) – the p-value cutoff; values below it are considered significant.

  • result_precision (int) – precision of the results, default=3.

  • output_path (Optional[str]) – path of the directory in which to save the result files; set it to write the results to files.

  • means_filename (str) – name of the means result file.

  • pvalues_filename (str) – name of the pvalues result file.

  • significant_means_filename (str) – name of the significant means result file.

  • deconvoluted_filename (str) – name of the deconvoluted result file.

  • output_format (str) – format of the result files: ‘txt’, ‘csv’, ‘tsv’ or ‘tab’.

  • res_key (str) – set a key to store the result to data.tl.result.
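
Several of the parameters above accept only a fixed set of values, so it can help to check them before starting a long run. The following is a minimal sketch and not part of Stereopy: `check_ccc_args` is a hypothetical helper that encodes only the allowed values listed on this page. The rule that mouse data with a human database requires `homogene_path` is an assumption inferred from the `homogene_path` description. The `database` value itself is not validated, because it may also be a path to a custom database.

```python
from typing import List, Optional

# Allowed values, copied from the parameter descriptions on this page.
ANALYSIS_TYPES = {"simple", "statistical"}
SPECIES = {"HUMAN", "MOUSE"}
HUMAN_DATABASES = {"cellphonedb", "liana"}
COUNTS_IDENTIFIERS = {"ensembl", "gene_name", "hgnc_symbol"}
OUTPUT_FORMATS = {"txt", "csv", "tsv", "tab"}


def check_ccc_args(
    analysis_type: str = "statistical",
    species: str = "HUMAN",
    database: str = "cellphonedb",
    homogene_path: Optional[str] = None,
    counts_identifiers: str = "hgnc_symbol",
    output_format: str = "csv",
) -> List[str]:
    """Hypothetical pre-flight check: return a list of problems, empty if none."""
    problems = []
    if analysis_type not in ANALYSIS_TYPES:
        problems.append(f"unknown analysis_type: {analysis_type!r}")
    if species not in SPECIES:
        problems.append(f"unknown species: {species!r}")
    if counts_identifiers not in COUNTS_IDENTIFIERS:
        problems.append(f"unknown counts_identifiers: {counts_identifiers!r}")
    if output_format not in OUTPUT_FORMATS:
        problems.append(f"unknown output_format: {output_format!r}")
    # Assumption: mouse data combined with a human database needs the
    # mouse-human homolog file, as the homogene_path description suggests.
    if species == "MOUSE" and database in HUMAN_DATABASES and homogene_path is None:
        problems.append("homogene_path is required for MOUSE with a human database")
    return problems


# The defaults from the signature pass; MOUSE with the default database does not.
print(check_ccc_args())
print(check_ccc_args(species="MOUSE"))
```

If the returned list is empty, the same values can be passed as keyword arguments to `CellCellCommunication.main`.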

Returns: