stereo.algorithm.gen_ccc_micro_envs.GenCccMicroEnvs.main

GenCccMicroEnvs.main(cluster_res_key='cluster', n_boot=20, boot_prop=0.8, dimension=3, fill_rare=True, min_num=30, binsize=2, eps=1e-20, show_dividing_by_thresholds=True, method='split', threshold=None, output_path=None, res_key='ccc_micro_envs')

Generate the micro-environments used for the CCC analysis.

This function should be run twice because it has two parts: 1) Calculating how the different clusters are divided into different micro-environments under different thresholds.

You can choose an appropriate threshold based on this division. To run this part, set the parameter threshold to None. The output is a dataframe formatted as follows:

    threshold              subgroup_result
    0.44298617727504136    [{'1'}, {'2'}, {'3'}]
    0.625776310617184      [{'1', '2'}, {'3'}]

The column subgroup_result holds a list of sets; each set contains the clusters that make up one micro-environment.
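To make the relationship between a threshold and its subgroup_result more concrete, here is a minimal sketch of how thresholding pairwise divergences can produce such groupings. It is not the library's implementation: the function `split_by_threshold`, the symmetric `divergence` values, and the rule "keep an edge when its divergence is at or below the threshold" are all assumptions made for illustration, and the sketch uses plain undirected connected components rather than strongly connected components.

```python
from itertools import combinations


def split_by_threshold(clusters, divergence, threshold):
    """Group clusters linked by a divergence <= threshold (hypothetical helper)."""
    parent = {c: c for c in clusters}

    def find(c):
        # Follow parent links to the representative, compressing the path.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    # Keep only the edges at or below the threshold and merge their endpoints.
    for a, b in combinations(clusters, 2):
        if divergence[frozenset((a, b))] <= threshold:
            parent[find(a)] = find(b)

    # Collect the members of each connected component.
    groups = {}
    for c in clusters:
        groups.setdefault(find(c), set()).add(c)
    return sorted(groups.values(), key=lambda g: sorted(g))


# Made-up symmetric divergences between three clusters (illustrative only).
divergence = {
    frozenset(("1", "2")): 0.5,
    frozenset(("1", "3")): 0.9,
    frozenset(("2", "3")): 0.8,
}
clusters = ["1", "2", "3"]

low = split_by_threshold(clusters, divergence, 0.44)   # no edge survives
high = split_by_threshold(clusters, divergence, 0.63)  # only the 1-2 edge survives
print(low)
print(high)
```

With these made-up values, the lower threshold leaves every cluster in its own group, while the higher one merges clusters '1' and '2', which mirrors the shape of the table above.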

2) Generating the micro-environments by setting an appropriate method and threshold based on the result of the first part. In this part, all parameters before method are ignored. The output is a dataframe formatted as follows:

    cell_type    microenviroment
    NKcells_1    microenv_0
    NKcells_0    microenv_0
    Tcells       microenv_1
    Myeloid      microenv_2
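The two-pass workflow described above might be sketched as follows. The helpers `first_pass` and `second_pass` are hypothetical wrappers, and `gen_main` stands for the bound `GenCccMicroEnvs.main` method, however it is obtained in your pipeline; only keyword arguments from the signature above are used. The `fake_main` stub exists solely so that the sketch runs without the library, and the threshold 0.63 is an arbitrary example value.

```python
def first_pass(gen_main, cluster_res_key="cluster", res_key="ccc_micro_envs"):
    """Part 1: threshold=None lists the groupings found under each threshold."""
    return gen_main(
        cluster_res_key=cluster_res_key,
        threshold=None,
        show_dividing_by_thresholds=True,
        res_key=res_key,
    )


def second_pass(gen_main, threshold, method="split", res_key="ccc_micro_envs"):
    """Part 2: a concrete threshold assigns each cell type to a micro-environment."""
    # Parameters before `method` are ignored in this part, so they are omitted.
    return gen_main(method=method, threshold=threshold, res_key=res_key)


# Stand-in for the real method: it only records the arguments it receives.
calls = []


def fake_main(**kwargs):
    calls.append(kwargs)


first_pass(fake_main)
# ... inspect the threshold table, then pick a value ...
second_pass(fake_main, threshold=0.63)
print(calls)
```

Both passes use the same res_key, as the parameter description requires.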

Parameters:
  • cluster_res_key (str) – the key which specifies the clustering result in data.tl.result.

  • n_boot (int) – number of bootstrap samples, default = 20.

  • boot_prop (float) – proportion of each bootstrap sample, default = 0.8.

  • dimension (int) – number of spatial dimensions, 2 or 3.

  • fill_rare (bool) – whether to simulate cells for rare cell types when calculating the KDE.

  • min_num (int) – a cell type with fewer than min_num cells is considered rare.

  • binsize (float) – grid size used for the KDE.

  • eps (float) – small value that replaces zeros in the KDE to avoid an infinite KL divergence.

  • show_dividing_by_thresholds (bool) – whether to display the result while running the first part of this function.

  • method (str) – how the micro-environments are defined. Two methods are available: 1) minimum spanning tree, or 2) pruning the fully connected graph based on a given KL threshold, then splitting the graph into multiple strongly connected components.

  • threshold (Optional[float]) – the threshold used to divide micro-environments: 1) set it to None to run the first part of this function; 2) set it to an appropriate value to run the second part.

  • output_path (Optional[str]) – the directory in which to save the result; if set to None, the result is only kept in memory.

  • res_key (str) – the key under which the result is stored in data.tl.result; in the second part, it must be the same as in the first part.
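The role of eps can be illustrated with a small, self-contained sketch of a KL divergence between two discretised densities. This is an illustration under stated assumptions, not the library's code: the function `kl_divergence`, the choice to replace only exact zeros with eps, and the renormalisation step are all assumptions.

```python
import math


def kl_divergence(p, q, eps=1e-20):
    """KL(p || q) for two discrete densities, with zeros replaced by eps."""
    # Assumption: only exact zeros are replaced, then each density is renormalised.
    p = [x if x > 0 else eps for x in p]
    q = [x if x > 0 else eps for x in q]
    p_sum, q_sum = sum(p), sum(q)
    p = [x / p_sum for x in p]
    q = [x / q_sum for x in q]
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


# Two made-up densities on a three-bin grid; q is zero where p has mass.
p = [0.5, 0.5, 0.0]
q = [0.5, 0.0, 0.5]

kl = kl_divergence(p, q)
print(math.isfinite(kl), kl > 0)
```

Without the eps fill, the second bin would contribute 0.5 * log(0.5 / 0), which is infinite; with it, the divergence stays finite but large, so clusters with non-overlapping densities remain clearly separated.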