{ "cells": [ { "cell_type": "markdown", "id": "ebf84a5d", "metadata": {}, "source": [ "# Clustering by GPU" ] }, { "cell_type": "markdown", "id": "d1c73b52", "metadata": {}, "source": [ "After trying dozens of technologies, we find that GPU acceleration is the most efficient way to speed up clustering currently. " ] }, { "cell_type": "markdown", "id": "1fa4869d", "metadata": {}, "source": [ "## Requirements" ] }, { "cell_type": "markdown", "id": "81d80acc", "metadata": {}, "source": [ "### CUDA installation" ] }, { "cell_type": "markdown", "id": "553f4135", "metadata": {}, "source": [ " Linux" ] }, { "cell_type": "markdown", "id": "0de02191", "metadata": {}, "source": [ "Linux users follow the guide, [NVIDIA CUDA Installation Guide for Linux](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/) to install CUDA." ] }, { "cell_type": "markdown", "id": "f2a19a6d", "metadata": {}, "source": [ " Windows" ] }, { "cell_type": "markdown", "id": "57fc490f", "metadata": {}, "source": [ "Installation of CUDA on Windows is a bit more complicated, because Stereopy is not supported on Windows now. Following the guide [CUDA on WSL User Guide](https://docs.nvidia.com/cuda/wsl-user-guide/index.html#getting-started-with-cuda-on-wsl), you can run Stereopy with GPU option on WSL." ] }, { "cell_type": "markdown", "id": "a2b6957b", "metadata": {}, "source": [ "### RAPIDS on Anaconda" ] }, { "cell_type": "markdown", "id": "6c54f9b2", "metadata": {}, "source": [ "Select the correct version on the homepage of [RAPIDS' official website](https://rapids.ai/start.html). Run following command to build up a specific environment:\n", "\n", " conda create -y -n stereopy-rapids -c rapidsai -c conda-forge -c nvidia python=3.8 rapids=23.04.01 cuda-version=11.8" ] }, { "cell_type": "markdown", "id": "5a8032a2", "metadata": {}, "source": [ "
\n", "\n", "**Note**\n", "\n", "My real experience installing CUDA on WSL successfully with *NVIDIA Studio Driver WHQL 522.30* according to this [bug](https://forums.developer.nvidia.com/t/cudaruntimeapierror-100-call-to-cudaruntimegetversion-results-in-cuda-error-no-device/234740) reporter’s advice. By the way, this is my personal PC environment with CUDA:\n", "* Intel Core i7-7700k\n", "* NVIDIA-GeForce-RTX-3060(NVIDIA-SMI 522.30; Driver Version: 522.30; CUDA Version: 11.8)\n", "* WSL2 on Windows10(21H2)\n", "\n", "
" ] }, { "cell_type": "markdown", "id": "edaba5a3", "metadata": {}, "source": [ "### Stereopy installation " ] }, { "cell_type": "markdown", "id": "4d7987c1", "metadata": {}, "source": [ "Installation through conda command fails in the environment with GPU acceleration, only [PyPI command](https://stereopy.readthedocs.io/en/latest/General/Installation.html#pypi) will succeed.\n", "\n", " pip install stereopy\n", "\n", "After installing stereopy, you will get some warnings about dependency conflicts, two of them must be reinstalled to correct version and others can be ignored.\n", "\n", " pip install dask==2023.3.2 distributed==2023.3.2.1" ] }, { "cell_type": "markdown", "id": "eb0bf7f6", "metadata": {}, "source": [ "## Clutsering with GPU" ] }, { "cell_type": "markdown", "id": "0b19591b", "metadata": {}, "source": [ "Start with common pipeline:" ] }, { "cell_type": "code", "execution_count": null, "id": "2c9603f5", "metadata": {}, "outputs": [], "source": [ "import stereo as st\n", "\n", "# reading data\n", "file_path = './stereopy/test/xujunhao/data/SS200000135TL_D1/SS200000135TL_D1.gef'\n", "bin_size = 50\n", "data = st.io.read_gef(file_path, bin_size=bin_size)\n", "\n", "# preprocessing\n", "data.tl.cal_qc()\n", "print(data.exp_matrix.shape)\n", "data.tl.normalize_total(target_sum=1e4)\n", "data.tl.log1p()\n", "\n", "# get highly variable genes\n", "data.tl.highly_variable_genes(min_mean=0.0125, max_mean=3, min_disp=0.5, res_key='highly_variable_genes', n_top_genes=None)\n", "\n", "# PCA\n", "data.tl.pca(use_highly_genes=True, hvg_res_key='highly_variable_genes', n_pcs=20, res_key='pca', svd_solver='arpack')" ] }, { "cell_type": "markdown", "id": "82046d45", "metadata": {}, "source": [ "Cluster demo by CPU:" ] }, { "cell_type": "code", "execution_count": null, "id": "41764d3e", "metadata": {}, "outputs": [], "source": [ "data.tl.neighbors(pca_res_key='pca', n_pcs=30, res_key='neighbors', n_jobs=8)\n", "data.tl.umap(pca_res_key='pca', neighbors_res_key='neighbors', res_key='umap', init_pos='spectral')\n", "data.tl.leiden(neighbors_res_key='neighbors', res_key='leiden')" ] }, { "cell_type": "markdown", "id": "9d07c5d3", "metadata": {}, "source": [ "Cluster demo by GPU:" ] }, { "cell_type": "code", "execution_count": null, "id": "52afa3c4", "metadata": {}, "outputs": [], "source": [ "# note that the parameter method is set to rapids\n", "\n", "data.tl.neighbors(pca_res_key='pca', n_pcs=30, res_key='neighbors', n_jobs=8, method='rapids')\n", "data.tl.umap(pca_res_key='pca', neighbors_res_key='neighbors', res_key='umap', init_pos='spectral', method='rapids')\n", "data.tl.leiden(neighbors_res_key='neighbors', res_key='leiden', method='rapids')" ] }, { "cell_type": "markdown", "id": "aa55e853", "metadata": {}, "source": [ "## GPU setting" ] }, { "cell_type": "markdown", "id": "562b7686", "metadata": {}, "source": [ "How to use GPU acceleration on Neighbors, UMAP and Leiden, is shown above, setting the parameter `method` to `rapids`. \n", "\n", "When cluster by Louvain using GPU acceleration, set `flavor` to `rapids`, as below:" ] }, { "cell_type": "code", "execution_count": null, "id": "486d0834", "metadata": {}, "outputs": [], "source": [ "data.tl.neighbors(pca_res_key='pca', n_pcs=30, res_key='neighbors', n_jobs=8, method='rapids')\n", "data.tl.louvain(neighbors_res_key='neighbors', res_key='louvain', flavor='rapids', use_weights=True)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.15" }, "vscode": { "interpreter": { "hash": "70dbeb2a90198859cd91b6ea0f3adc73d66939fe301617b631d99dfc954c0323" } } }, "nbformat": 4, "nbformat_minor": 5 }