Cell Community Detection#

Advances in sequencing technology applied to spatial transcriptomics have opened a new window into the tissue microenvironment. By capturing the gene expression profiles of individual cells together with their spatial coordinates, researchers can accurately determine cell types and their respective functions. While some cell types are known to coexist within particular regions of a tissue, the co-occurrence of many others remains poorly characterized.

The proposed Cell Community Detection (CCD) algorithm addresses this challenge by offering a novel computational approach for identifying tissue domains with a significant mixture of particular cell types. The CCD algorithm divides the tissue using sliding windows, quantifies the proportion of each cell type within each window, and groups together windows with similar cell type mixtures. By employing majority voting, the algorithm assigns a community label to each cell based on the labels of windows covering it. Notably, CCD accommodates multiple window sizes and enables the simultaneous analysis of multiple samples from the same tissue [Fang23]. Its Python implementation with a flexible user interface enables the processing of datasets with tens of thousands of cells in sub-minute execution time.

Multi-size sliding window approach#

CCD divides the tissue using sliding windows of one or more sizes and can analyze multiple samples from the same tissue simultaneously. It consists of three main steps:

  • Sliding windows (\(w\)) of one or several sizes are moved across the tissue surface with defined horizontal and vertical steps, and the percentage of each cell type (\([p_1, p_2,...,p_n]\)) inside each window is calculated. A feature vector (\(fv\)) of length equal to the number of cell types (\(n\)) is created for each processed window across all available tissue samples:

\[\begin{equation} \forall w_i\rightarrow (fv_i = [p_1, p_2,...,p_n]) \end{equation}\]
  • Feature vectors from all windows are fed to a clustering algorithm (\(C\)), such as Leiden, Spectral or Hierarchical, to obtain community labels (\(l\)). The desired number of communities (\(cn\)) can be set explicitly as a parameter (Spectral or Hierarchical clustering) or controlled through the clustering resolution (Leiden):

\[\begin{equation} C(\forall fv_i) \rightarrow l_i, l_i \in \{l_1, l_2, ..., l_{cn}\} \end{equation}\]
  • A community label is assigned to each cell-spot (\(cs\)) by majority voting (\(MV\)) over the community labels of all windows covering it:

\[\begin{equation} MV(\forall l_i)\text{ where } spatial(cs_j) \in w_i \rightarrow l_j, l_j \in \{l_1, l_2, ..., l_{cn}\} \end{equation}\]
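
The three steps above can be sketched in plain Python. This is a minimal illustration, not Stereopy's implementation: the function names (`covering_windows`, `feature_vectors`, `dominant_type_label`, `majority_vote`) and the toy data are assumptions, and step 2 uses a trivial stand-in (labeling each window by its most abundant cell type) in place of Leiden, Spectral or Hierarchical clustering.

```python
from collections import Counter, defaultdict


def covering_windows(x, y, x0, y0, win_size, step):
    """Grid indices (i, j) of every window that contains the point (x, y).

    Window (i, j) spans [x0 + i*step, x0 + i*step + win_size) horizontally
    and the analogous interval vertically.
    """
    i_hi = int((x - x0) // step)
    i_lo = max(0, int((x - x0 - win_size) // step) + 1)
    j_hi = int((y - y0) // step)
    j_lo = max(0, int((y - y0 - win_size) // step) + 1)
    return [(i, j) for i in range(i_lo, i_hi + 1) for j in range(j_lo, j_hi + 1)]


def feature_vectors(cells, cell_types, win_size, step):
    """Step 1: per-window cell type proportions [p_1, ..., p_n]."""
    x0 = min(x for x, _, _ in cells)
    y0 = min(y for _, y, _ in cells)
    counts = defaultdict(Counter)
    for x, y, cell_type in cells:
        for w in covering_windows(x, y, x0, y0, win_size, step):
            counts[w][cell_type] += 1
    fvs = {}
    for w, c in counts.items():
        total = sum(c.values())
        fvs[w] = [c[t] / total for t in cell_types]
    return fvs, (x0, y0)


def dominant_type_label(fv):
    """Step 2 stand-in (NOT a real clustering): index of the largest proportion."""
    return max(range(len(fv)), key=fv.__getitem__)


def majority_vote(cells, window_labels, origin, win_size, step):
    """Step 3: each cell takes the most common label among its covering windows."""
    x0, y0 = origin
    labels = []
    for x, y, _ in cells:
        windows = covering_windows(x, y, x0, y0, win_size, step)
        votes = Counter(window_labels[w] for w in windows)
        labels.append(votes.most_common(1)[0][0])
    return labels


# Toy tissue (hypothetical data): four "A" cells on the left, four "B" cells on the right.
cells = [(x, 0, "A") for x in range(4)] + [(x, 0, "B") for x in range(4, 8)]
cell_types = ["A", "B"]

fvs, origin = feature_vectors(cells, cell_types, win_size=2, step=1)
window_labels = {w: dominant_type_label(fv) for w, fv in fvs.items()}
cell_labels = majority_vote(cells, window_labels, origin, win_size=2, step=1)
```

In this toy run the leftmost cell ends up in community 0 and the rightmost in community 1, while window (3, 0), which straddles the boundary, has the mixed feature vector [0.5, 0.5].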

The window size and sliding step are optional CCD parameters. If they are not provided, the window size is chosen by an iterative process that aims for an average of 30 to 50 cell-spots per window, and the sliding step is set to half of the window size.
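
The exact search used by Stereopy is not described here, so the following is only a sketch of one possible iterative search for a window size whose mean cell count falls in [30, 50]. The function names, the starting size, the fixed increment and the choice to average over non-empty windows are all assumptions made for illustration.

```python
from collections import Counter


def mean_cells_per_window(coords, win_size, step):
    """Mean number of cells over all non-empty sliding windows."""
    x0 = min(x for x, _ in coords)
    y0 = min(y for _, y in coords)
    counts = Counter()
    for x, y in coords:
        # Indices of the windows that contain this cell.
        i_hi = int((x - x0) // step)
        i_lo = max(0, int((x - x0 - win_size) // step) + 1)
        j_hi = int((y - y0) // step)
        j_lo = max(0, int((y - y0 - win_size) // step) + 1)
        for i in range(i_lo, i_hi + 1):
            for j in range(j_lo, j_hi + 1):
                counts[(i, j)] += 1
    return sum(counts.values()) / len(counts)


def search_window_size(coords, start=2, increment=2, low=30, high=50, max_iter=100):
    """Grow or shrink the window until the mean cell count lies in [low, high].

    The sliding step is always half of the window size. If the search cannot
    shrink further or runs out of iterations, the last candidate is returned.
    """
    win = start
    for _ in range(max_iter):
        mean = mean_cells_per_window(coords, win, win / 2)
        if mean < low:
            win += increment
        elif mean > high and win > increment:
            win -= increment
        else:
            return win, mean
    return win, mean_cells_per_window(coords, win, win / 2)


# Hypothetical tissue: one cell per unit square on a 40 x 40 grid.
coords = [(x, y) for x in range(40) for y in range(40)]
win, mean = search_window_size(coords)
```

On this regular grid the search stops at a window size of 6 (sliding step 3). Windows at the tissue edge contain fewer cells than interior ones, which is why the mean is lower than the 36 cells of a full interior window.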

Data preparation#

CCD is part of the algorithm module of Stereopy. It takes a single sample or a list of samples as input, depending on whether one or several samples are processed. CCD can process samples from any spatial transcriptomics technology.

To be processed by CCD, each data object requires:

  • cell spatial coordinates,
  • cell type annotation,
  • cell type color palette.

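When working with an AnnData H5ad file, spatial coordinates should be placed in .obsm['spatial'], cell annotation labels in a column of .obs (the column passed as the annotation parameter in the calls below), and the color palette for cell types in an .uns entry whose key ends in _colors.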
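
A quick pre-flight check along these lines can catch missing fields before CCD is run. This is a sketch only: `missing_ccd_inputs` is a hypothetical helper, not part of Stereopy, and the palette key is assumed to follow the common AnnData/scanpy naming of the annotation column name followed by `_colors`. The stand-in object below mimics the `.obs`, `.obsm` and `.uns` attributes with plain dictionaries; membership tests of this form also work on a real AnnData object.

```python
from types import SimpleNamespace


def missing_ccd_inputs(adata, annotation):
    """Return a list of missing inputs; an empty list means the object looks ready."""
    problems = []
    if "spatial" not in adata.obsm:
        problems.append("obsm['spatial'] (cell spatial coordinates)")
    if annotation not in adata.obs:
        problems.append(f"obs['{annotation}'] (cell type annotation)")
    # Assumption: the palette key is '<annotation column>_colors'.
    if f"{annotation}_colors" not in adata.uns:
        problems.append(f"uns['{annotation}_colors'] (cell type color palette)")
    return problems


# Hypothetical stand-ins for AnnData objects.
ready = SimpleNamespace(
    obsm={"spatial": [[0.0, 0.0], [1.0, 2.0]]},
    obs={"cell_type": ["A", "B"]},
    uns={"cell_type_colors": ["#1f77b4", "#ff7f0e"]},
)
incomplete = SimpleNamespace(obsm={}, obs={"cell_type": ["A", "B"]}, uns={})

ready_problems = missing_ccd_inputs(ready, "cell_type")
incomplete_problems = missing_ccd_inputs(incomplete, "cell_type")
```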

Download the demo data sample (Stereo-seq mouse embryo whole brain). More datasets are listed under example data.

[1]:
import stereo as st
from stereo.core.stereo_exp_data import AnnBasedStereoExpData
from stereo.algorithm.community_detection import CommunityDetection
from stereo.core.ms_data import MSData

data = st.io.read_h5ad('../data/E16.5_E1S3_cell_bin_whole_brain.h5ad')
[2]:
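# collect several samples in an MSData container for multi-sample analysis
ms_data = MSData()

# add two mouse embryo samples (E9.5 and E10.5)
ms_data += st.io.read_h5ad('../data/Embyro_anndata075/Embyro_E9.5.h5ad')
ms_data += st.io.read_h5ad('../data/Embyro_anndata075/Embyro_E10.5.h5ad')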

Run CCD#

In this example CCD is run on a single Stereo-seq sample of mouse embryo whole brain. The window size is set to 150 and the sliding step to 50, which gives about 22 cells per window on average (the log output reports a mean of 21.75 and a median of 24.0). The scatteredness threshold and downsampling rate are used to remove cell types that are spread throughout the tissue and therefore provide no localization or community information. Hierarchical agglomerative clustering with Ward linkage is used, with the number of clusters predefined as 16.

You can get further information about parameters from the API documents.

[3]:
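# run CCD on the single sample; the returned 'ccd' object gives access to the analysis results
ccd = data.tl.community_detection(
    annotation='sim anno',
    out_path='results/whole_brain',
    win_sizes='150',               # window size
    sliding_steps='50',            # sliding step
    scatter_thres=0.12,            # scatteredness threshold
    downsample_rate=80,            # downsampling rate
    cluster_algo='agglomerative',  # hierarchical agglomerative clustering
    n_clusters=16,                 # predefined number of communities
    resolution=0.25,
    plotting=5,
    hide_plots=True
)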
[2023-09-11 11:39:31][Stereo][26988][MainThread][139824720418624][st_pipeline][71][INFO]: register algorithm community_detection to <stereo.core.st_pipeline.AnnBasedStPipeline object at 0x7f2b2a3dac40>
[2023-09-11 11:39:31][Stereo][26988][MainThread][139824720418624][community_detection][328][INFO]: Window size info for slice: Slice_0
                     window size: 150
                     sliding step: 50
                     cells mean: 21.75
                     cells median: 24.0
                     num horizontal windows: 94
                     num vertical windows: 66


[2023-09-11 11:39:32][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.3214s
[2023-09-11 11:39:32][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_annotation took 0.9834s
[2023-09-11 11:39:38][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function calc_feature_matrix took 6.0911s
[2023-09-11 11:39:39][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_celltype_images took 0.0555s
[2023-09-11 11:39:51][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function cluster took 12.1677s
Trying to set attribute `.obs` of view, copying.
[2023-09-11 11:39:54][Stereo][26988][MainThread][139824720418624][sliding_window][200][INFO]: Sliding window cell mixture calculation done. Added results to adata.obs["tissue_sliding_window"]
[2023-09-11 11:39:54][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function community_calling took 3.2643s
[2023-09-11 11:39:55][Stereo][26988][MainThread][139824720418624][community_clustering_algorithm][702][INFO]: Saved community labels after clustering as a part of original anndata file to Slice_0.csv
... storing 'agglomerative' as categorical
[2023-09-11 11:39:55][Stereo][26988][MainThread][139824720418624][community_clustering_algorithm][682][INFO]: Saved clustering result tissue_Slice_0.h5ad.
[2023-09-11 11:39:55][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.1625s
[2023-09-11 11:39:55][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_clustering took 0.6408s
[2023-09-11 11:39:55][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function calculate_cell_mixture_stats took 0.0303s
[2023-09-11 11:40:11][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_stats took 15.2718s
[2023-09-11 11:40:17][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_celltype_table took 6.5410s
[2023-09-11 11:40:17][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.2230s
[2023-09-11 11:40:18][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.1572s
[2023-09-11 11:40:18][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.2200s
[2023-09-11 11:40:19][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.1642s
[2023-09-11 11:40:20][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.2230s
[2023-09-11 11:40:20][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.1626s
[2023-09-11 11:40:21][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.2293s
[2023-09-11 11:40:21][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.1604s
[2023-09-11 11:40:22][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.2235s
[2023-09-11 11:40:22][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.1567s
[2023-09-11 11:40:23][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.2554s
[2023-09-11 11:40:23][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.1611s
[2023-09-11 11:40:24][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.2319s
[2023-09-11 11:40:24][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.1606s
[2023-09-11 11:40:25][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.2221s
[2023-09-11 11:40:25][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.1572s
[2023-09-11 11:40:26][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.2243s
[2023-09-11 11:40:26][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.1581s
[2023-09-11 11:40:27][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.2195s
[2023-09-11 11:40:28][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.1574s
[2023-09-11 11:40:28][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.2213s
[2023-09-11 11:40:29][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.1565s
[2023-09-11 11:40:30][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.2241s
[2023-09-11 11:40:30][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.1547s
[2023-09-11 11:40:31][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.2193s
[2023-09-11 11:40:31][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.1887s
[2023-09-11 11:40:32][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.2218s
[2023-09-11 11:40:32][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.1593s
[2023-09-11 11:40:33][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.2231s
[2023-09-11 11:40:33][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.1568s
[2023-09-11 11:40:34][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.2254s
[2023-09-11 11:40:34][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.1595s
[2023-09-11 11:40:35][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_cluster_mixtures took 17.4919s
[2023-09-11 11:40:41][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function boxplot_stats took 6.6147s
[2023-09-11 11:45:04][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function colorplot_stats took 262.6519s
[2023-09-11 11:56:04][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function colorplot_stats_per_cell_types took 659.9349s
[2023-09-11 11:56:04][Stereo][26988][MainThread][139824720418624][community_clustering_algorithm][682][INFO]: Saved clustering result tissue_Slice_0_stats.h5ad.
[2023-09-11 11:56:20][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_celltype_mixtures_total took 16.4239s
[2023-09-11 11:56:21][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_cell_abundance_total took 0.4104s
[2023-09-11 11:56:21][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_cluster_abundance_total took 0.2607s
[2023-09-11 11:56:21][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_cell_abundance_per_slice took 0.3789s
[2023-09-11 11:56:22][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_cluster_abundance_per_slice took 0.2349s
[2023-09-11 11:56:22][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_cell_perc_in_community_per_slice took 0.1941s
[2023-09-11 11:56:22][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function generate_report took 0.0677s
[2023-09-11 11:56:22][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function _main took 1010.8949s

Visualizations#

CCD supports a wide range of visualizations designed to give the analyst deep insight into the data. They cover basic annotation and CCD cluster plots, as well as tables of cell mixture data and cell type abundances per cluster and type. Per-community plots are also supported for analysis and comparison in case/control scenarios.

By Clustering#

Cell type annotation:

[4]:
# show the distribution of cell types
ccd.plot('all_annotations')
[2023-09-11 11:56:22][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.2290s
../_images/Tutorials_Cell_Community_Detection_16_1.png
[2023-09-11 11:56:23][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_all_annotation took 1.5304s

Plot community clusters, also called functional modules:

[5]:
# plot communities (functional modules)
ccd.plot('all_clustering')
[2023-09-11 11:56:24][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.1573s
../_images/Tutorials_Cell_Community_Detection_18_1.png
[2023-09-11 11:56:25][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_all_clustering took 1.0631s

Bar plot of cell community abundance in the tissue sample:

[6]:
# show percentages of presence of communities detected in the sample
ccd.plot('cluster_abundance_total')
../_images/Tutorials_Cell_Community_Detection_20_0.png
[2023-09-11 11:56:25][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_cluster_abundance_total took 0.4393s
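
The quantity behind such a bar plot can be sketched as the share of cells assigned to each community. The helper name `community_abundance` and the label list are hypothetical, and measuring abundance as a percentage of cells (rather than of windows) is an assumption.

```python
from collections import Counter


def community_abundance(labels):
    """Percentage of cells assigned to each community label."""
    counts = Counter(labels)
    total = len(labels)
    return {community: 100.0 * n / total for community, n in sorted(counts.items())}


# Hypothetical per-cell community labels after majority voting.
labels = [0, 0, 0, 1, 1, 2, 2, 2, 2, 2]
abundance = community_abundance(labels)
```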
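
CCD can also be run on several samples at once through the MSData object created earlier. Window sizes and sliding steps are not passed here, so CCD calculates them automatically:

[7]: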
ms_ccd = ms_data.tl.ms_community_detection(
    annotation='annotation',
    scatter_thres=0.2,
    cluster_algo='agglomerative',
    n_clusters=25,
    out_path="./results",
    plotting=5,
    hide_plots=True
    )
[2023-09-11 11:56:25][Stereo][26988][MainThread][139824720418624][community_detection][75][INFO]: Window sizes and/or sliding steps not provided by user - proceeding to calculate optimal values
[2023-09-11 11:56:25][Stereo][26988][MainThread][139824720418624][community_detection][328][INFO]: Window size info for slice: Slice_0
                     window size: 12
                     sliding step: 6
                     cells mean: 55.78
                     cells median: 64.0
                     num horizontal windows: 9
                     num vertical windows: 12


[2023-09-11 11:56:25][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function calc_optimal_win_size_and_slide_step took 0.0211s
[2023-09-11 11:56:25][Stereo][26988][MainThread][139824720418624][community_detection][82][INFO]: Downsample rate is not provided by user - proceeding to calculate one based on minimal window size.
[2023-09-11 11:56:25][Stereo][26988][MainThread][139824720418624][community_detection][85][INFO]: donwsample_rate = 6
[2023-09-11 11:56:25][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0521s
[2023-09-11 11:56:25][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_annotation took 0.2862s
[2023-09-11 11:56:25][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function calc_feature_matrix took 0.1645s
[2023-09-11 11:56:26][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_celltype_images took 0.0074s
[2023-09-11 11:56:26][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0801s
[2023-09-11 11:56:26][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_annotation took 0.3472s
[2023-09-11 11:56:26][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function calc_feature_matrix took 0.4360s
[2023-09-11 11:56:27][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_celltype_images took 0.0075s
[2023-09-11 11:56:27][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0499s
[2023-09-11 11:56:27][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0793s
[2023-09-11 11:56:27][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_all_annotation took 0.6214s
[2023-09-11 11:56:27][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function cluster took 0.0137s
Trying to set attribute `.obs` of view, copying.
[2023-09-11 11:56:27][Stereo][26988][MainThread][139824720418624][sliding_window][200][INFO]: Sliding window cell mixture calculation done. Added results to adata.obs["tissue_sliding_window"]
[2023-09-11 11:56:27][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function community_calling took 0.0444s
[2023-09-11 11:56:27][Stereo][26988][MainThread][139824720418624][community_clustering_algorithm][702][INFO]: Saved community labels after clustering as a part of original anndata file to Slice_0.csv
... storing 'agglomerative' as categorical
[2023-09-11 11:56:27][Stereo][26988][MainThread][139824720418624][community_clustering_algorithm][682][INFO]: Saved clustering result tissue_Slice_0.h5ad.
[2023-09-11 11:56:27][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0543s
[2023-09-11 11:56:28][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_clustering took 0.3472s
[2023-09-11 11:56:28][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function calculate_cell_mixture_stats took 0.0110s
[2023-09-11 11:56:33][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_stats took 5.1719s
[2023-09-11 11:56:35][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_celltype_table took 2.4165s
[2023-09-11 11:56:35][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0542s
[2023-09-11 11:56:35][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0567s
[2023-09-11 11:56:36][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0496s
[2023-09-11 11:56:36][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0562s
[2023-09-11 11:56:36][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0488s
[2023-09-11 11:56:36][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0557s
[2023-09-11 11:56:37][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0490s
[2023-09-11 11:56:37][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0572s
[2023-09-11 11:56:37][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0491s
[2023-09-11 11:56:37][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0573s
[2023-09-11 11:56:37][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0503s
[2023-09-11 11:56:38][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0566s
[2023-09-11 11:56:38][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0519s
[2023-09-11 11:56:38][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0565s
[2023-09-11 11:56:38][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0494s
[2023-09-11 11:56:38][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0564s
[2023-09-11 11:56:39][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0483s
[2023-09-11 11:56:39][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0545s
[2023-09-11 11:56:39][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0484s
[2023-09-11 11:56:39][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0556s
[2023-09-11 11:56:40][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0483s
[2023-09-11 11:56:40][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0560s
[2023-09-11 11:56:40][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_cluster_mixtures took 4.7355s
[2023-09-11 11:56:44][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function boxplot_stats took 3.5188s
[2023-09-11 11:56:45][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function colorplot_stats took 0.9907s
[2023-09-11 11:56:46][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function colorplot_stats_per_cell_types took 1.4285s
[2023-09-11 11:56:46][Stereo][26988][MainThread][139824720418624][community_clustering_algorithm][682][INFO]: Saved clustering result tissue_Slice_0_stats.h5ad.
Trying to set attribute `.obs` of view, copying.
[2023-09-11 11:56:46][Stereo][26988][MainThread][139824720418624][sliding_window][200][INFO]: Sliding window cell mixture calculation done. Added results to adata.obs["tissue_sliding_window"]
[2023-09-11 11:56:46][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function community_calling took 0.0740s
[2023-09-11 11:56:46][Stereo][26988][MainThread][139824720418624][community_clustering_algorithm][702][INFO]: Saved community labels after clustering as a part of original anndata file to Slice_1.csv
... storing 'agglomerative' as categorical
[2023-09-11 11:56:46][Stereo][26988][MainThread][139824720418624][community_clustering_algorithm][682][INFO]: Saved clustering result tissue_Slice_1.h5ad.
[2023-09-11 11:56:46][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0812s
[2023-09-11 11:56:46][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_clustering took 0.4374s
[2023-09-11 11:56:47][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function calculate_cell_mixture_stats took 0.0144s
[2023-09-11 11:56:50][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_stats took 3.5847s
[2023-09-11 11:56:55][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_celltype_table took 4.9560s
[2023-09-11 11:56:55][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0813s
[2023-09-11 11:56:55][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0780s
[2023-09-11 11:56:56][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0793s
[2023-09-11 11:56:56][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0807s
[2023-09-11 11:56:56][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0784s
[2023-09-11 11:56:56][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0786s
[2023-09-11 11:56:57][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0781s
[2023-09-11 11:56:57][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0796s
[2023-09-11 11:56:57][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0795s
[2023-09-11 11:56:58][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0790s
[2023-09-11 11:56:58][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0797s
[2023-09-11 11:56:58][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0787s
[2023-09-11 11:56:59][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0796s
[2023-09-11 11:56:59][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0796s
[2023-09-11 11:56:59][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0812s
[2023-09-11 11:56:59][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0790s
[2023-09-11 11:57:00][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0807s
[2023-09-11 11:57:00][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.6117s
[2023-09-11 11:57:01][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0797s
[2023-09-11 11:57:01][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0809s
[2023-09-11 11:57:01][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0796s
[2023-09-11 11:57:01][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0816s
[2023-09-11 11:57:02][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0789s
[2023-09-11 11:57:02][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0791s
[2023-09-11 11:57:02][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0820s
[2023-09-11 11:57:03][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0791s
[2023-09-11 11:57:03][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0806s
[2023-09-11 11:57:03][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0794s
[2023-09-11 11:57:04][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0805s
[2023-09-11 11:57:04][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0803s
[2023-09-11 11:57:04][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0790s
[2023-09-11 11:57:04][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0795s
[2023-09-11 11:57:05][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0815s
[2023-09-11 11:57:05][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0807s
[2023-09-11 11:57:05][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0786s
[2023-09-11 11:57:05][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0798s
[2023-09-11 11:57:06][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_cluster_mixtures took 10.7486s
[2023-09-11 11:57:09][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function boxplot_stats took 3.2859s
[2023-09-11 11:57:11][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function colorplot_stats took 2.2807s
[2023-09-11 11:57:13][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function colorplot_stats_per_cell_types took 1.1329s
[2023-09-11 11:57:13][Stereo][26988][MainThread][139824720418624][community_clustering_algorithm][682][INFO]: Saved clustering result tissue_Slice_1_stats.h5ad.
[2023-09-11 11:57:13][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0571s
[2023-09-11 11:57:13][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0791s
[2023-09-11 11:57:13][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_all_clustering took 0.7157s
[2023-09-11 11:57:20][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_celltype_mixtures_total took 6.3615s
[2023-09-11 11:57:20][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_cell_abundance_total took 0.3149s
[2023-09-11 11:57:20][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_cluster_abundance_total took 0.3294s
[2023-09-11 11:57:21][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_cell_abundance_per_slice took 0.4119s
[2023-09-11 11:57:21][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_cluster_abundance_per_slice took 0.4547s
[2023-09-11 11:57:21][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_cell_perc_in_community_per_slice took 0.3292s
[2023-09-11 11:57:22][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function generate_report took 0.0702s
[2023-09-11 11:57:22][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function _main took 56.5488s

Table of cell type abundances per cluster and type (colors are matched with cluster and cell type annotation colors).

This table lets the analyst detect the influence of cell types on different communities, especially for cell types with a small number of cells in the sample.

[8]:
ms_ccd.plot('cell_types_table')
../_images/Tutorials_Cell_Community_Detection_23_0.png
[2023-09-11 11:57:24][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_celltype_table took 2.8694s
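
A table of this kind can be approximated by cross-tabulating cell types against communities. The sketch below is not the routine used by CCD: `celltype_by_community` is a hypothetical helper, and normalizing each row by the total number of cells of that type is an assumption, chosen because it shows where a rare cell type concentrates even when its absolute count is small.

```python
from collections import Counter


def celltype_by_community(cell_types, communities):
    """Rows: cell types; columns: communities; values: % of that cell type's cells."""
    pair_counts = Counter(zip(cell_types, communities))
    type_totals = Counter(cell_types)
    columns = sorted(set(communities))
    return {
        t: {c: 100.0 * pair_counts[(t, c)] / type_totals[t] for c in columns}
        for t in sorted(type_totals)
    }


# Hypothetical per-cell annotations and community labels; "C" is a rare cell type.
cell_types = ["A", "A", "A", "A", "B", "B", "C"]
communities = [0, 0, 0, 1, 1, 1, 0]
table = celltype_by_community(cell_types, communities)
```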

Per community#

Displays the cell type mixture of a specified community, providing insight into spatial influence and cell type abundance.

[36]:
# Plot cell types in community 8 and 3
ms_ccd.plot('cluster_mixtures', slice_id=0, community_id=8)
ms_ccd.plot('cluster_mixtures', slice_id=0, community_id=3)
[2023-09-11 12:27:08][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0532s
[2023-09-11 12:27:08][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0589s
../_images/Tutorials_Cell_Community_Detection_26_1.png
[2023-09-11 12:27:09][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_cluster_mixtures took 0.7751s
[2023-09-11 12:27:09][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0526s
[2023-09-11 12:27:09][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_spatial took 0.0589s
../_images/Tutorials_Cell_Community_Detection_26_4.png
[2023-09-11 12:27:10][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function plot_cluster_mixtures took 0.7915s
<Figure size 640x480 with 0 Axes>

Boxplot of cell type percentages in a community, computed over every window that belongs to the community.

This plot provides information on community uniformity and smoothness. Significant variance in cell type percentages suggests that several communities may have been merged and that the clustering resolution should be increased (as shown for community 3).

[42]:
# Plot the percentage of each cell type in the windows belonging to communities 8 and 3
ms_ccd.plot('boxplot', slice_id=0, community_id=8)
ms_ccd.plot('boxplot', slice_id=0, community_id=3)
../_images/Tutorials_Cell_Community_Detection_28_0.png
[2023-09-11 12:32:16][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function boxplot_stats took 0.3099s
<Figure size 640x480 with 0 Axes>
../_images/Tutorials_Cell_Community_Detection_28_3.png
[2023-09-11 12:32:16][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function boxplot_stats took 0.6032s
<Figure size 640x480 with 0 Axes>
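
The uniformity check that the boxplot supports visually can also be expressed numerically. The sketch below is an illustration under stated assumptions, not part of CCD: the window feature vectors are made up, and the interquartile range is used here as one possible measure of spread.

```python
import numpy as np

cell_types = ["A", "B", "C"]

# Hypothetical feature vectors (cell type percentages) of the windows
# assigned to two communities; rows are windows, columns are cell types.
windows = {
    "uniform": np.array([[70, 20, 10], [68, 22, 10], [72, 19, 9], [71, 20, 9]], dtype=float),
    "mixed": np.array([[80, 10, 10], [20, 70, 10], [75, 15, 10], [15, 75, 10]], dtype=float),
}


def iqr_per_cell_type(fv):
    """Interquartile range of each cell type's percentage across windows."""
    q1, q3 = np.percentile(fv, [25, 75], axis=0)
    return q3 - q1


spread = {name: iqr_per_cell_type(fv) for name, fv in windows.items()}
for name, iqr in spread.items():
    print(name, dict(zip(cell_types, np.round(iqr, 2))))
```

In this toy data the "mixed" community shows a wide spread for cell types A and B, the pattern that would suggest two merged communities and a need for higher clustering resolution.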

Colorplot of cell type percentages in a community, computed over every window that belongs to the community. For each window, the percentages of the top three cell types in the community are used as R, G and B values and plotted over the tissue image.

This plot provides visual spatial information on community uniformity and smoothness. Significant differences in cell type percentages produce different colors, indicating that several communities may have been merged and that the clustering resolution should be increased (as shown for community 3).

[43]:
# Plot the windows of communities 8 and 3, colored by the percentages of the top 3 cell types mapped to the R, G and B color channels
ms_ccd.plot('colorplot', slice_id=0, community_id=8)
ms_ccd.plot('colorplot', slice_id=0, community_id=3)
../_images/Tutorials_Cell_Community_Detection_30_0.png
[2023-09-11 12:33:26][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function colorplot_stats took 0.1925s
<Figure size 640x480 with 0 Axes>
../_images/Tutorials_Cell_Community_Detection_30_3.png
[2023-09-11 12:33:26][Stereo][26988][MainThread][139824720418624][utils][27][INFO]: Function colorplot_stats took 0.1781s
<Figure size 640x480 with 0 Axes>
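
The color mapping described above can be illustrated with a few lines of NumPy. This is a minimal sketch rather than the CCD implementation: the feature vectors are hypothetical, and ranking cell types by their mean percentage across the community's windows is an assumption about how the "top three" are chosen.

```python
import numpy as np

# Hypothetical feature vectors (percentages of 4 cell types) for the
# windows of one community; rows are windows, columns are cell types.
fv = np.array([
    [60.0, 25.0, 10.0, 5.0],
    [55.0, 30.0, 10.0, 5.0],
    [10.0, 20.0, 65.0, 5.0],
])

# Assumption: the "top three" cell types are those with the highest
# mean percentage over all windows of the community.
top3 = np.argsort(fv.mean(axis=0))[::-1][:3]

# Scale the percentages to [0, 1] and use them as R, G and B channels,
# giving one color per window.
rgb = fv[:, top3] / 100.0

print(top3)
print(rgb)
```

The first two windows get similar reddish colors, while the third, dominated by a different cell type, gets a clearly different color. That contrast is what the plot makes visible over the tissue.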

Additional options#

CCD also includes several filtering steps controlled by parameters, such as removal of cell types present in all parts of the tissue and removal of windows with too few cell-spots:

  • The spatial distribution of each cell type can be evaluated using 2D entropy and scatteredness metrics. CCD supports setting threshold values for these metrics in order to exclude from processing cell types that are randomly or evenly spread throughout the tissue. Removing cell types with high entropy and scatteredness improves clustering and provides more robust cell communities.

  • The robustness and quality of CCD strongly depend on clustering. For clustering to be stable, feature vectors need to contain a significant amount of information, that is, enough cell-spots in each evaluated window. CCD gathers data on the total number of cells per window and supports setting a threshold for the minimum number of cell-spots a window must contain to be included in the clustering process. Cell-spots are marked with the ‘unknown’ label if no window with a cell community label overlaps them.
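
Both filters can be sketched in a few lines. The code below is an illustration only and makes several assumptions: the entropy is computed as the normalized Shannon entropy of a 2D histogram of cell positions, which may differ from the metric CCD actually uses, and the names `spatial_entropy`, `entropy_threshold` and `min_cells`, along with all values, are hypothetical.

```python
import numpy as np


def spatial_entropy(x, y, bins=4):
    """Normalized Shannon entropy (0..1) of a 2D histogram of cell positions."""
    hist, _, _ = np.histogram2d(x, y, bins=bins, range=[[0, 1], [0, 1]])
    p = hist.ravel() / hist.sum()
    p = p[p > 0]  # empty bins contribute nothing to the entropy
    return float(-(p * np.log2(p)).sum() / np.log2(bins * bins))


# Cell type spread evenly over the tissue: one cell in every histogram bin.
centers = (np.arange(4) + 0.5) / 4
gx, gy = np.meshgrid(centers, centers)
entropy_even = spatial_entropy(gx.ravel(), gy.ravel())

# Cell type confined to one small region of the tissue.
entropy_local = spatial_entropy(np.full(16, 0.1), np.full(16, 0.1))

# Hypothetical threshold: cell types above it are excluded from clustering.
entropy_threshold = 0.9
excluded = {
    name: entropy > entropy_threshold
    for name, entropy in {"even": entropy_even, "local": entropy_local}.items()
}

# Window filter: keep only windows with enough cell-spots (hypothetical values).
window_cell_counts = np.array([3, 40, 55, 8, 120])
min_cells = 10
keep_window = window_cell_counts >= min_cells

print(excluded)
print(keep_window)
```

An evenly spread cell type reaches the maximum entropy and carries little information about location, so it is dropped, while the localized one is kept. Windows with very few cell-spots are dropped because their percentage vectors are dominated by noise.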

Final results and visualisations produced by CCD are aggregated into a structured HTML report, enabling researchers to gain full insight into the obtained communities and their statistics.

Running CCD in a shell#

After installing stereopy, you can also run CCD from a shell with the ccd command, as shown below:

ccd --input data/sample1.h5ad data/sample2.h5ad --annotation=annotation

`--input` accepts a single h5ad file, a space-separated list of h5ad files, or the path of a directory containing h5ad files; `--annotation` specifies the key in `obs` that holds the cell type.

Run ccd --help for further information about the other arguments.
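
When the command needs to be launched from a Python script, for example to process several sample sets in a loop, it can be assembled as an argument list. The helper below is a hypothetical convenience function, not part of stereopy, and it uses only the two arguments shown above.

```python
import shlex


def build_ccd_command(inputs, annotation):
    """Assemble a ccd invocation using only the --input and --annotation arguments."""
    if not inputs:
        raise ValueError("at least one input path is required")
    return ["ccd", "--input", *[str(p) for p in inputs], f"--annotation={annotation}"]


cmd = build_ccd_command(["data/sample1.h5ad", "data/sample2.h5ad"], "annotation")

# Print the command as it would be typed in a shell.
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` in an environment where stereopy is installed.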