{ "cells": [ { "cell_type": "markdown", "id": "5ef5e661", "metadata": {}, "source": [ "# RNA Velocity\n", "\n", "RNA velocity is a powerful way to assess cell state: it analyzes expression dynamics from scRNA-seq data, using the spliced and unspliced count matrices (also called RNA abundance) stored in a loom file.\n" ] }, { "cell_type": "markdown", "id": "21a90ad5", "metadata": {}, "source": [ "Here, we use the exon information in a GEM/GEF file to generate the spliced and unspliced matrices in the loom file. The GEM/GEF file must therefore be obtained from `spatial_RNA_visualization_v5` in SAP or SAW (version >= 5.1.3).\n", "\n", "SAW annotates each transcript by calculating how much of it overlaps with exon regions. If the overlap is greater than 50%, the transcript is considered to belong to an exon." ] }, { "cell_type": "markdown", "id": "68108fdd", "metadata": {}, "source": [ "## Loom file" ] }, { "attachments": {}, "cell_type": "markdown", "id": "12985efe", "metadata": {}, "source": [ "Import the module used to generate the loom file. Please download our [example data](http://upload.dcs.cloud:8090/share/bb6fab82-2c16-46b2-a95e-6931338f31bf) beforehand.\n", "\n", "If the input GEF file was produced by the SAW-ST-V8 process and running the code directly raises an error, apply the monkey patch in the next cell first and then re-run." 
] }, { "cell_type": "code", "execution_count": null, "id": "6eab77c9", "metadata": {}, "outputs": [], "source": [ "# Monkey patch for GEF files generated by the SAW-ST-V8 process\n", "import pandas as pd\n", "\n", "# Keep a reference to the original function so the patch can delegate to it\n", "original_read_csv = pd.read_csv\n", "\n", "def patched_read_csv(*args, **kwargs):\n", "    # Remove deprecated parameters that newer pandas versions no longer accept\n", "    kwargs.pop('error_bad_lines', None)\n", "    kwargs.pop('warn_bad_lines', None)\n", "\n", "    # Warn on malformed lines unless the caller specified another behavior\n", "    if 'on_bad_lines' not in kwargs:\n", "        kwargs['on_bad_lines'] = 'warn'\n", "    return original_read_csv(*args, **kwargs)\n", "\n", "pd.read_csv = patched_read_csv" ] }, { "cell_type": "code", "execution_count": 5, "id": "a78bd43c", "metadata": { "ExecuteTime": { "end_time": "2023-04-03T03:13:56.421457Z", "start_time": "2023-04-03T02:56:33.244356Z" }, "scrolled": true }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "[2023-04-03 10:56:33][Stereo][20511][MainThread][139671934703424][rna_velocity][66][INFO]: Getting layers\n", "[2023-04-03 10:58:22][Stereo][20511][MainThread][139671934703424][rna_velocity][77][INFO]: Getting row attrs from gtf\n", "INFO:root:Extracted GTF attributes: ['gene_id', 'gene_version', 'gene_name', 'gene_source', 'gene_biotype', 'transcript_id', 'transcript_version', 'transcript_name', 'transcript_source', 'transcript_biotype', 'transcript_support_level', 'exon_number', 'exon_id', 'exon_version', 'tag', 'ccds_id', 'protein_id', 'protein_version']\n", "[2023-04-03 11:09:37][Stereo][20511][MainThread][139671934703424][rna_velocity][80][INFO]: Generating loom\n" ] } ], "source": [ "from stereo.tools import generate_loom\n", "\n", "# Input GEF file, gene annotation (GTF) file, and output directory\n", "bgef_file = './SS200000135TL_D1.tissue.gef'\n", "gtf_file = './genes.gtf'\n", "out_dir = './SS200000135TL_D1_bgef'\n", "\n", "# Generate the loom file\n", "loom_data = generate_loom(\n", "    gef_path=bgef_file,\n", "    gtf_path=gtf_file,\n", "    bin_type='bins',\n", "    bin_size=100,\n", "    out_dir=out_dir\n", ")" ] }, { "cell_type": "markdown", "id": "f7e1336a", "metadata": {}, "source": [ "