stereo.algorithm.gen_ccc_micro_envs.GenCccMicroEnvs.main
- GenCccMicroEnvs.main(cluster_res_key='cluster', n_boot=20, boot_prop=0.8, dimension=3, fill_rare=True, min_num=30, binsize=2, eps=1e-20, show_dividing_by_thresholds=True, method='split', threshold=None, output_path=None, res_key='ccc_micro_envs')
Generate the micro-environment used for the CCC analysis.
This function should be run twice because it has two parts:

1) Calculate how the different clusters are divided into micro-environments under different thresholds. You can choose an appropriate threshold based on this result. To run this part, set the parameter threshold to None. The output is a dataframe formatted like this:

threshold            subgroup_result
0.44298617727504136  [{'1'}, {'2'}, {'3'}]
0.625776310617184    [{'1', '2'}, {'3'}]

The column subgroup_result is a list of sets, in which each set contains one or more clusters and represents a micro-environment.

2) Generate the micro-environments by setting an appropriate method and threshold based on the result of the first part. In this part, the parameters before method are all ignored. The output is a dataframe formatted like this:

cell_type   microenviroment
NKcells_1   microenv_0
NKcells_0   microenv_0
Tcells      microenv_1
Myeloid     microenv_2
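The two-part workflow above can be illustrated with a small, self-contained sketch. It is not Stereopy's implementation: the divergence values are made up, the helper name `split_by_threshold` is hypothetical, the graph is treated as undirected (plain connected components rather than the strongly connected components mentioned in the parameter list), and the rule "keep an edge when its divergence is strictly below the threshold" is an assumption chosen so that the toy output has the same shape as the tables above.

```python
# Toy pairwise divergences between three clusters (made-up values).
divergence = {("1", "2"): 0.44, ("2", "3"): 0.63, ("1", "3"): 0.90}


def split_by_threshold(divergence, threshold):
    """Group clusters linked by edges whose divergence is below `threshold`."""
    clusters = sorted({c for pair in divergence for c in pair})
    parent = {c: c for c in clusters}

    def find(c):
        # Union-find root lookup with path halving.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for (a, b), d in divergence.items():
        if d < threshold:  # assumed pruning rule: drop edges >= threshold
            parent[find(a)] = find(b)

    groups = {}
    for c in clusters:
        groups.setdefault(find(c), set()).add(c)
    # Sort groups by their members so the ordering is deterministic.
    return sorted(groups.values(), key=lambda g: sorted(g))


# Part 1 (threshold=None in the real function): list the grouping
# obtained at each candidate threshold.
table = [(t, split_by_threshold(divergence, t))
         for t in sorted(set(divergence.values()))]
for t, groups in table:
    print(t, groups)

# Part 2: pick one threshold from the table and label each cluster.
chosen_threshold = 0.63
mapping = {
    cluster: f"microenv_{i}"
    for i, group in enumerate(split_by_threshold(divergence, chosen_threshold))
    for cluster in group
}
print(mapping)
```

In the real function, both passes are expected to use the same res_key, as noted in the parameter list.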
- Parameters:
  - cluster_res_key (str) – the key that specifies the clustering result in data.tl.result.
  - n_boot (int) – number of bootstrap samples, default = 20.
  - boot_prop (float) – proportion of cells drawn in each bootstrap sample, default = 0.8.
  - dimension (int) – 2 or 3.
  - fill_rare (bool) – whether to simulate cells for rare cell types when calculating the KDE.
  - min_num (int) – a cell type with fewer than min_num cells is considered rare.
  - binsize (float) – grid size used for the KDE.
  - eps (float) – value used to fill zeros in the KDE to avoid infinite KL divergence.
  - show_dividing_by_thresholds (bool) – whether to display the result while running the first part of this function.
  - method (str) – how to define micro-environments, using one of two methods: 1) minimum spanning tree, or 2) pruning the fully connected tree based on a given KL threshold, then splitting the graph into multiple strongly connected components.
  - threshold (Optional[float]) – the threshold used to divide micro-environments. 1) Set it to None to run the first part of this function. 2) Set it to an appropriate value to run the second part.
  - output_path (Optional[str]) – the directory in which to save the result; if set to None, the result is only stored in memory.
  - res_key (str) – the key under which the result is stored in data.tl.result; in the second part, it must be the same as in the first part.
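To show why eps matters, here is a minimal sketch of a KL divergence between two discrete densities on the same grid. It is an illustration only: the function name `kl_divergence` is hypothetical, the toy densities are made up, and raising small values to eps before renormalizing is an assumption about how the fill is applied, not a description of Stereopy's code.

```python
import math


def kl_divergence(p, q, eps=1e-20):
    """KL(p || q) for two discrete densities defined on the same grid."""
    # Raise every value below eps (including exact zeros) to eps, so that
    # no term divides by zero or takes log(0).
    p = [max(x, eps) for x in p]
    q = [max(x, eps) for x in q]
    # Renormalize so each density sums to 1 again after the fill.
    p_sum, q_sum = sum(p), sum(q)
    return sum(
        (a / p_sum) * math.log((a / p_sum) / (b / q_sum))
        for a, b in zip(p, q)
    )


# Two toy densities: q is zero in a bin where p has mass, so without
# the eps fill the divergence would be infinite.
p = [0.5, 0.5, 0.0]
q = [0.5, 0.0, 0.5]
kl = kl_divergence(p, q)
print(math.isfinite(kl))
```

A smaller eps makes the penalty for non-overlapping bins larger, which in turn shifts the thresholds at which clusters separate.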