stereo.algorithm.gen_ccc_micro_envs.GenCccMicroEnvs.main
- GenCccMicroEnvs.main(cluster_res_key='cluster', n_boot=20, boot_prop=0.8, dimension=3, fill_rare=True, min_num=30, binsize=2, eps=1e-20, show_dividing_by_thresholds=True, method='split', threshold=None, output_path=None, res_key='ccc_micro_envs')
Generate the micro-environment used for the CCC analysis.
This function should be run twice because it has two parts:
First, calculate how the different clusters are divided into different micro-environments under different thresholds, so that you can choose an appropriate threshold from the result. To run this part, set the parameter threshold to None. The output is a dataframe like the one below:
threshold              subgroup_result
0.44298617727504136    [{'1'}, {'2'}, {'3'}]
0.625776310617184      [{'1', '2'}, {'3'}]
The column subgroup_result is a list of sets; each set contains some groups and represents one micro-environment.
Second, generate the micro-environments by setting an appropriate method and threshold based on the result of the first part. In this part, all the parameters before method are ignored. The output is a dataframe like the one below:
cell_type    microenviroment
NKcells_1    microenv_0
NKcells_0    microenv_0
Tcells       microenv_1
Myeloid      microenv_2
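The relationship between the two output tables can be illustrated with a small standalone sketch. It is not Stereopy's implementation: the helper names split_by_threshold and label_micro_envs, the divergence values, and the convention that an edge is kept only when its divergence is strictly below the threshold are all assumptions made for illustration.

```python
from itertools import combinations


def split_by_threshold(labels, divergence, threshold):
    """Group labels into connected components, keeping only edges whose
    divergence is strictly below `threshold` (an assumed convention)."""
    parent = {label: label for label in labels}

    def find(x):
        # Walk up to the component root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(labels, 2):
        if divergence[frozenset((a, b))] < threshold:
            parent[find(a)] = find(b)

    groups = {}
    for label in labels:
        groups.setdefault(find(label), set()).add(label)
    # Sort the groups so the output order is stable and readable.
    return sorted(groups.values(), key=lambda g: sorted(g))


def label_micro_envs(groups):
    """Map each cluster label to a 'microenv_<i>' name, one name per group."""
    return {
        label: f"microenv_{i}"
        for i, group in enumerate(groups)
        for label in sorted(group)
    }


# Hypothetical pairwise divergences between three clusters.
labels = ["1", "2", "3"]
divergence = {
    frozenset(("1", "2")): 0.44,
    frozenset(("1", "3")): 0.63,
    frozenset(("2", "3")): 0.90,
}

# First part: one candidate grouping per distinct divergence value.
table = {
    t: split_by_threshold(labels, divergence, t)
    for t in sorted(set(divergence.values()))
}

# Second part: commit to one threshold and name the resulting groups.
assignment = label_micro_envs(table[0.63])
```

With these made-up values, the lowest threshold leaves every cluster in its own group, and the next one merges clusters '1' and '2', which mirrors the shape of the example tables above.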
- Parameters:
cluster_res_key (str) – the key that specifies the clustering result in data.tl.result.
n_boot (int) – number of bootstrap samples, default = 20.
boot_prop (float) – proportion of each bootstrap sample, default = 0.8.
dimension (int) – number of spatial dimensions, 2 or 3.
fill_rare (bool) – whether to simulate cells for rare cell types when calculating the KDE.
min_num (int) – a cell type with fewer than min_num cells is considered rare.
binsize (float) – grid size used to grid the space for the KDE. For example, a sample from a square chip is gridded into a mesh with 100 intersections (determined by the given binsize). For each cell type, the KDE is fitted to the coordinates of all cells of that type and evaluated at the 100 intersections. The KL divergence between each pair of cell types is then calculated from these KDE values and used to construct the micro-environments.
eps (float) – value substituted for zero KDE values to avoid infinite KL divergence.
show_dividing_by_thresholds (bool) – whether to display the result while running the first part of this function.
method (str) – how to define the micro-environments: 1) minimum spanning tree, or 2) pruning the fully connected graph based on a given KL threshold, then splitting the graph into multiple strongly connected components.
threshold (Optional[float]) – the threshold used to divide the micro-environments: 1) set it to None to run the first part of this function; 2) set it to an appropriate value to run the second part.
output_path (Optional[str]) – the directory in which to save the result; if set to None, the result is only stored in memory.
res_key (str) – the key under which the result is stored in data.tl.result; in the second part it must be the same as in the first part.
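The roles of binsize and eps described above can be sketched in plain Python. This is a simplified 2D illustration under stated assumptions, not the library's code: the function names, the Gaussian kernel, the bandwidth value, and the cell coordinates are hypothetical, and the sketch floors every KDE value at eps (which also covers exact zeros) before normalising.

```python
import math


def grid_points(width, height, binsize):
    """Intersections of a mesh laid over a width x height area."""
    nx = int(width // binsize) + 1
    ny = int(height // binsize) + 1
    return [(i * binsize, j * binsize) for i in range(nx) for j in range(ny)]


def kde_on_grid(cells, grid, bandwidth=0.5):
    """Gaussian KDE of the (x, y) `cells`, evaluated at each grid point."""
    norm = 1.0 / (len(cells) * 2.0 * math.pi * bandwidth ** 2)
    values = []
    for gx, gy in grid:
        total = 0.0
        for cx, cy in cells:
            d2 = (gx - cx) ** 2 + (gy - cy) ** 2
            # math.exp underflows silently to 0.0 for far-away points.
            total += math.exp(-d2 / (2.0 * bandwidth ** 2))
        values.append(norm * total)
    return values


def kl_divergence(p, q, eps=1e-20):
    """KL(p || q) after flooring values at eps and normalising to sum 1."""
    p = [max(v, eps) for v in p]
    q = [max(v, eps) for v in q]
    sp, sq = sum(p), sum(q)
    return sum(
        (a / sp) * (math.log(a / sp) - math.log(b / sq))
        for a, b in zip(p, q)
    )


# An 18 x 18 area with binsize 2 gives a 10 x 10 mesh: 100 intersections.
grid = grid_points(18, 18, 2)

# Hypothetical coordinates for two cell types in opposite corners.
cells_a = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0)]
cells_b = [(17.0, 17.0), (16.0, 17.0)]

kde_a = kde_on_grid(cells_a, grid)
kde_b = kde_on_grid(cells_b, grid)

# Without the eps floor, zero KDE values would make this undefined.
kl_ab = kl_divergence(kde_a, kde_b)
```

Because the two hypothetical cell types sit far apart, some grid values underflow to exactly zero, which is the situation the eps parameter is described as guarding against.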