{ "cells": [ { "cell_type": "markdown", "id": "6c418c28", "metadata": {}, "source": [ "# Quick Start (Multi-sample)" ] }, { "cell_type": "markdown", "id": "84620c0f", "metadata": {}, "source": [ "## Multi samples" ] }, { "cell_type": "markdown", "id": "ba4f0d13", "metadata": {}, "source": [ "A multi-sample dataset consists of continuous or time-series samples. This quick start shows how to load and handle them." ] }, { "cell_type": "markdown", "id": "e3551672", "metadata": {}, "source": [ "## MSData construction" ] }, { "cell_type": "markdown", "id": "e1667ddb", "metadata": {}, "source": [ "Please download the [example data](http://upload.dcs.cloud:8090/share/bb6fab82-2c16-46b2-a95e-6931338f31bf) first. Here we use Drosophila data for the demo. The main input file formats are GEM/GEF (from Stereo-seq) and H5ad (from Scanpy). Add your dataset as shown below:" ] }, { "cell_type": "code", "execution_count": 2, "id": "be6d013a", "metadata": {}, "outputs": [], "source": [ "import sys\n", "import os\n", "from natsort import natsorted\n", "import stereo as st\n", "from stereo.core.ms_data import MSData\n", "from stereo.core.ms_pipeline import slice_generator\n", "\n", "import warnings\n", "warnings.filterwarnings('ignore')\n", "\n", "\n", "# prepare the input directory\n", "data_dir = './Demo_3D/3D_AnnData_0.8.0'\n", "\n", "# collect the path of every file in the input directory\n", "data_list = []\n", "for fn in os.listdir(data_dir):\n", "    data_list.append(os.path.join(data_dir, fn))\n", "\n", "# natural sort, so that regularly named files are loaded in slice order\n", "data_list = natsorted(data_list)\n", "\n", "# construct the MSData object\n", "ms_data = MSData(_relationship='other', _var_type='intersect')\n", "\n", "# alternatively, when the data objects are already loaded:\n", "# ms_data = MSData(_data_list=[data1, data2], _names=['s1', 's2'], _relationship='other', _var_type='intersect')\n", "\n", "# add all samples to the MSData object\n", "for sample in data_list:\n", "    ms_data += st.io.read_h5ad(file_path=sample, bin_type='bins', bin_size=1)" ] }, { "cell_type": "markdown", "id": 
"e3d1557b", "metadata": {}, "source": [ "After loading the sorted data into the MSData object, evaluate it to display basic information." ] }, { "cell_type": "code", "execution_count": 3, "id": "52aac080", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "ms_data: {'0': (482, 13668), '1': (549, 13668), '2': (598, 13668), '3': (713, 13668), '4': (744, 13668), '5': (815, 13668), '6': (925, 13668), '7': (1272, 13668), '8': (1263, 13668), '9': (1248, 13668), '10': (1039, 13668), '11': (1260, 13668), '12': (959, 13668), '13': (1078, 13668), '14': (1240, 13668), '15': (1110, 13668)}\n", "num_slice: 16\n", "names: ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15']\n", "obs: []\n", "var: []\n", "relationship: other\n", "var_type: intersect to 0\n", "mss: []" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ms_data" ] }, { "cell_type": "markdown", "id": "d87a32d0", "metadata": {}, "source": [ "Index into the MSData object to get a single sample, as you would with a Python list." ] }, { "cell_type": "code", "execution_count": 4, "id": "6b13c5f7", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "AnnData object with n_obs × n_vars = 482 × 13668\n", " obs: 'slice_ID', 'raw_x', 'raw_y', 'new_x', 'new_y', 'new_z', 'annotation'\n", " uns: 'bin_type', 'bin_size', 'sn'\n", " obsm: 'X_umap', 'spatial', 'spatial_elas', 'spatial_rigid'\n", " layers: 'raw_counts'" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ms_data[0]" ] }, { "cell_type": "markdown", "id": "837895fd", "metadata": {}, "source": [ "
| | id | batch | total_counts | n_genes_by_counts | pct_counts_mt |
|---|---|---|---|---|---|
| E14-16h_a_S01_20500x62780-0-0 | CNS | 0 | 723.957825 | 550 | 0.0 |
| E14-16h_a_S01_20500x62800-0-0 | CNS | 0 | 722.087219 | 573 | 0.0 |
| E14-16h_a_S01_20500x62820-0-0 | epidermis | 0 | 768.790100 | 666 | 0.0 |
| E14-16h_a_S01_20500x62840-0-0 | CNS | 0 | 806.373718 | 794 | 0.0 |
| E14-16h_a_S01_20500x62860-0-0 | CNS | 0 | 818.378723 | 815 | 0.0 |
| ... | ... | ... | ... | ... | ... |
| E14-16h_a_S16_61760x79320-15-15 | NaN | 15 | 658.930420 | 423 | 0.0 |
| E14-16h_a_S16_61760x79340-15-15 | NaN | 15 | 643.020813 | 410 | 0.0 |
| E14-16h_a_S16_61760x79360-15-15 | NaN | 15 | 643.981323 | 411 | 0.0 |
| E14-16h_a_S16_61760x79380-15-15 | NaN | 15 | 613.082153 | 379 | 0.0 |
| E14-16h_a_S16_61760x79400-15-15 | NaN | 15 | 626.735046 | 391 | 0.0 |

15295 rows × 5 columns
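
The row count of the table above can be cross-checked against the per-slice shapes printed in the `ms_data` summary. The sketch below is plain Python and makes no Stereopy call. The dictionary is copied by hand from that summary, and the variable names (`slice_shapes`, `total_cells`, `gene_counts`) are illustrative only. It sums the cells per slice and collects the distinct gene counts.

```python
# Per-slice (n_cells, n_genes) shapes, copied from the ms_data summary shown earlier.
slice_shapes = {
    '0': (482, 13668), '1': (549, 13668), '2': (598, 13668), '3': (713, 13668),
    '4': (744, 13668), '5': (815, 13668), '6': (925, 13668), '7': (1272, 13668),
    '8': (1263, 13668), '9': (1248, 13668), '10': (1039, 13668), '11': (1260, 13668),
    '12': (959, 13668), '13': (1078, 13668), '14': (1240, 13668), '15': (1110, 13668),
}

# Total number of cells across all slices.
total_cells = sum(n_cells for n_cells, _ in slice_shapes.values())

# Distinct gene counts across slices; a single value means all slices share one gene dimension.
gene_counts = {n_genes for _, n_genes in slice_shapes.values()}

print(total_cells)   # 15295
print(gene_counts)   # {13668}
```

The total matches the 15295 rows reported under the table, and the single shared gene count of 13668 is consistent with constructing the object with `_var_type='intersect'`.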