Cell Co-occurrence#

Cell co-occurrence refers to the occurrence, spatial relationship, of two or more cell types in close proximity in the same tissue region. The algorithm aims to identify and quantify the interactions or associations between different cell types in order to understand their spatial organization and potential functional relationships [Fang23].

Clustering on cells#

Download example data, and complete basic analysis processing.

[1]:
import stereo as st
from stereo.core.ms_data import MSData
from stereo.core.ms_pipeline import slice_generator
[2]:
data = st.io.read_h5ad('../data/Puck_191204_22.h5ad')
data.position = data._ann_data.obsm['X_spatial']

Show the distribution of clusters.

[3]:
data.plt.cluster_scatter(res_key='author_cell_type')
[3]:
../_images/Tutorials_Cell_Co_Occurrence_7_2.png

Co-occurrence analysis#

Here provides two method for calculate co-occurence, 'squidpy' for method in Squidpy, 'stereopy' for method in Stereopy by default.

[4]:
data.tl.co_occurrence(
                method='stereopy',
                cluster_res_key='author_cell_type',
                res_key='co-occurrence',
                dist_thres=180, # max threshold to measure co-occurence
                steps=6,  # step numbers to divide threshold interval evenly
                genelist=None,
                gene_thresh=0, # min threshold for gene expression in a cell
                n_jobs=-1
                )
[2023-11-13 16:33:59][Stereo][177753][MainThread][139984851941184][st_pipeline][77][INFO]: register algorithm co_occurrence to <stereo.core.st_pipeline.AnnBasedStPipeline object at 0x7f507957bdc0>
[4]:
AnnData object with n_obs × n_vars = 31600 × 21319
    obs: 'assay_ontology_term_id', 'self_reported_ethnicity_ontology_term_id', 'is_primary_data', 'organism_ontology_term_id', 'sample', 'tissue_ontology_term_id', 'disease_state', 'sex_ontology_term_id', 'genotype', 'development_stage_ontology_term_id', 'author_cell_type', 'cell_type_ontology_term_id', 'disease_ontology_term_id', 'donor_id', 'suspension_type', 'cell_type', 'assay', 'disease', 'organism', 'sex', 'tissue', 'self_reported_ethnicity', 'development_stage'
    var: 'feature_is_filtered', 'feature_name', 'feature_reference', 'feature_biotype'
    uns: 'schema_version', 'title', 'sn', 'co-occurrence'
    obsm: 'X_spatial', 'spatial'

Each plot represents, along with the increase of distance, the probability of co-occurrence between the cluster in title and other clusters.

Note

Each mark on x-axis represents a distance interval but not a single value. For example, 50 means the interval from 0 to 50 while 100 is from 50 to 100 and so on.

[5]:
data.plt.co_occurrence_plot(res_key='co-occurrence')
[2023-11-13 16:34:38][Stereo][177753][MainThread][139984851941184][plot_collection][82][INFO]: register plot_func co_occurrence_plot to <stereo.plots.plot_collection.PlotCollection object at 0x7f507a578cd0>
[5]:
../_images/Tutorials_Cell_Co_Occurrence_13_3.png

Each heatmap represents, under different distance interval in title, the probability of co-occurrence between each clusters.

Note

Similar to the plots above, the values in title represent a distance interval, for example, 30 is the interval from 0 to 30 while 60 is from 30 to 60 and so on.

[6]:
data.plt.co_occurrence_heatmap(res_key='co-occurrence', cluster_res_key='author_cell_type')
[2023-11-13 16:34:43][Stereo][177753][MainThread][139984851941184][plot_collection][82][INFO]: register plot_func co_occurrence_heatmap to <stereo.plots.plot_collection.PlotCollection object at 0x7f507a578cd0>
[6]:
../_images/Tutorials_Cell_Co_Occurrence_16_3.png

Co-occurrence with a gene list#

Stereopy’s co-occurrence also implement a method to calculate co-occurrence between genes and celltype by regrading cells of which genes express higher than gene_thresh (gene threshold) as a cluster. Here we use the marker genes of each cell types for instance.

[7]:
gene_list = ['ENSMUSG00000030945', 'ENSMUSG00000035783', 'ENSMUSG00000023013',
             'ENSMUSG00000043144', 'ENSMUSG00000026394', 'ENSMUSG00000022748',
             'ENSMUSG00000028017', 'ENSMUSG00000062515', 'ENSMUSG00000021765',
             'ENSMUSG00000018339', 'ENSMUSG00000073421', 'ENSMUSG00000060586',
             'ENSMUSG00000032758', 'ENSMUSG00000035202', 'ENSMUSG00000069516',
             'ENSMUSG00000092341', 'ENSMUSG00000025608', 'ENSMUSG00000070645',
             'ENSMUSG00000026678', 'ENSMUSG00000040808', 'ENSMUSG00000027202',
             'ENSMUSG00000031766', 'ENSMUSG00000049775', 'ENSMUSG00000030963']

data.tl.co_occurrence(
                method='stereopy',
                cluster_res_key='author_cell_type',
                res_key='co-occurrence_gene',
                dist_thres=180, # max threshold to measure co-occurence
                steps=6,  # step numbers to divide threshold interval evenly
                genelist=gene_list,
                gene_thresh=2, # min threshold for gene expression in a cell
                n_jobs=-1
                )
[7]:
AnnData object with n_obs × n_vars = 31600 × 21319
    obs: 'assay_ontology_term_id', 'self_reported_ethnicity_ontology_term_id', 'is_primary_data', 'organism_ontology_term_id', 'sample', 'tissue_ontology_term_id', 'disease_state', 'sex_ontology_term_id', 'genotype', 'development_stage_ontology_term_id', 'author_cell_type', 'cell_type_ontology_term_id', 'disease_ontology_term_id', 'donor_id', 'suspension_type', 'cell_type', 'assay', 'disease', 'organism', 'sex', 'tissue', 'self_reported_ethnicity', 'development_stage'
    var: 'feature_is_filtered', 'feature_name', 'feature_reference', 'feature_biotype'
    uns: 'schema_version', 'title', 'sn', 'co-occurrence', 'co-occurrence_gene'
    obsm: 'X_spatial', 'spatial'
[8]:
data.plt.co_occurrence_plot(res_key='co-occurrence_gene')
[8]:
../_images/Tutorials_Cell_Co_Occurrence_19_2.png

On multi-sample#

Here also provide a metrics to analysis multi-sample co-occurrence. The metrics include integrate and compare, which is controled by a parameter called scope. Integrate analysis will be conducted when scope is seperated by ‘,’ like scope = 'WT,BTBR' and otherwise by ‘|’ like scope = 'WT|BTBR'. In this demo, we will show both two.

Load the other one from BTBR#

[9]:
data_btbr = st.io.read_h5ad('../data/Puck_191204_15.h5ad')
data_btbr.position = data_btbr._ann_data.obsm['X_spatial']
[10]:
data_btbr.tl.co_occurrence(
                method='stereopy',
                cluster_res_key='author_cell_type',
                res_key='co-occurrence',
                dist_thres=180, # max threshold to measure co-occurence
                steps=6,  # step numbers to divide threshold interval evenly
                genelist=None,
                gene_thresh=0, # min threshold for gene expression in a cell
                n_jobs=-1
                )
[2023-11-13 16:56:24][Stereo][177753][MainThread][139984851941184][st_pipeline][77][INFO]: register algorithm co_occurrence to <stereo.core.st_pipeline.AnnBasedStPipeline object at 0x7f4f15281520>
[10]:
AnnData object with n_obs × n_vars = 35132 × 21239
    obs: 'assay_ontology_term_id', 'self_reported_ethnicity_ontology_term_id', 'is_primary_data', 'organism_ontology_term_id', 'sample', 'tissue_ontology_term_id', 'disease_state', 'sex_ontology_term_id', 'genotype', 'development_stage_ontology_term_id', 'author_cell_type', 'cell_type_ontology_term_id', 'disease_ontology_term_id', 'donor_id', 'suspension_type', 'cell_type', 'assay', 'disease', 'organism', 'sex', 'tissue', 'self_reported_ethnicity', 'development_stage'
    var: 'feature_is_filtered', 'feature_name', 'feature_reference', 'feature_biotype'
    uns: 'schema_version', 'title', 'sn', 'co-occurrence'
    obsm: 'X_spatial', 'spatial'

MSData construction#

[11]:
ms_data = MSData()
ms_data['WT'] = data
ms_data['BTBR'] = data_btbr
ms_data
[11]:
ms_data: {'WT': (31600, 21319), 'BTBR': (35132, 21239)}
num_slice: 2
names: ['WT', 'BTBR']
obs: []
var: []
relationship: other
var_type: intersect to 0
mss: []
[12]:
ms_data.integrate()
ms_data.to_integrate(res_key = 'author_cell_type', scope = slice_generator[:], _from=slice_generator[:], type='obs', item=['author_cell_type']*ms_data.num_slice)

Integrate metrics#

[13]:
from stereo.algorithm.co_occurrence import CoOccurrence
[15]:
CoOccurrence.ms_co_occur_integrate(
    ms_data=ms_data,
    scope='WT,BTBR',
    use_col='author_cell_type',
    res_key='co-occurrence'
)
[16]:
ms_data.plt.co_occurrence_heatmap(res_key='co-occurrence', cluster_res_key='author_cell_type')
[2023-11-13 17:03:17][Stereo][177753][MainThread][139984851941184][ms_pipeline][128][INFO]: register plot_func co_occurrence_heatmap to <class 'stereo.core.stereo_exp_data.StereoExpData'>-139977635259632
[16]:
../_images/Tutorials_Cell_Co_Occurrence_30_3.png

Compare metrics#

[17]:
CoOccurrence.ms_co_occur_integrate(
    ms_data=ms_data,
    scope='WT|BTBR',
    use_col='author_cell_type',
    res_key='co-occurrence'
)
[19]:
ms_data.plt.co_occurrence_heatmap(res_key='co-occurrence', cluster_res_key='author_cell_type')
[2023-11-13 17:03:32][Stereo][177753][MainThread][139984851941184][ms_pipeline][128][INFO]: register plot_func co_occurrence_heatmap to <class 'stereo.core.stereo_exp_data.StereoExpData'>-139977635259632
[19]:
../_images/Tutorials_Cell_Co_Occurrence_33_3.png