{
"cells": [
{
"cell_type": "markdown",
"id": "bfcce7e2-f634-4ff7-9d58-ef54f0ab1d21",
"metadata": {
"tags": []
},
"source": [
"\n",
" \n",
""
]
},
{
"cell_type": "markdown",
"id": "83c79e29-152e-4b40-b093-5055175f0c54",
"metadata": {
"tags": []
},
"source": [
"Before you start, make sure to set your runtime type to GPU in colab."
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "8ee721de-3fd0-4cab-81bc-5298e4774bb9",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Install SOFA + dependencies\n",
"!pip install --quiet biosofa"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "869be535-67b2-42ef-999e-06f4dcbfa11d",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"--2024-11-06 09:37:23-- https://datasets.cellxgene.cziscience.com/484dbc33-c7dc-4e5e-9954-7f2a1cc849bc.h5ad\n",
"Resolving datasets.cellxgene.cziscience.com (datasets.cellxgene.cziscience.com)... 18.172.112.45, 18.172.112.61, 18.172.112.87, ...\n",
"Connecting to datasets.cellxgene.cziscience.com (datasets.cellxgene.cziscience.com)|18.172.112.45|:443... connected.\n",
"HTTP request sent, awaiting response... 200 OK\n",
"Length: 405964465 (387M) [binary/octet-stream]\n",
"Saving to: ‘484dbc33-c7dc-4e5e-9954-7f2a1cc849bc.h5ad’\n",
"\n",
"-c7dc-4e5e-9954-7f2 24%[===> ] 95.62M 22.3MB/s eta 17s ^C\n",
"--2024-11-06 09:37:30-- https://datasets.cellxgene.cziscience.com/fecbd715-66f5-48ae-8c39-51a76d7f1d3d.h5ad\n",
"Resolving datasets.cellxgene.cziscience.com (datasets.cellxgene.cziscience.com)... 18.172.112.45, 18.172.112.61, 18.172.112.87, ...\n",
"Connecting to datasets.cellxgene.cziscience.com (datasets.cellxgene.cziscience.com)|18.172.112.45|:443... connected.\n",
"HTTP request sent, awaiting response... 200 OK\n",
"Length: 906950703 (865M) [binary/octet-stream]\n",
"Saving to: ‘fecbd715-66f5-48ae-8c39-51a76d7f1d3d.h5ad’\n",
"\n",
" fecbd715-66f5-4 2%[ ] 20.59M 12.2MB/s ^C\n"
]
}
],
"source": [
"# download data \n",
"!wget https://datasets.cellxgene.cziscience.com/484dbc33-c7dc-4e5e-9954-7f2a1cc849bc.h5ad # RNA-Seq\n",
"!mv 484dbc33-c7dc-4e5e-9954-7f2a1cc849bc.h5ad rna.h5ad\n",
"!wget https://datasets.cellxgene.cziscience.com/fecbd715-66f5-48ae-8c39-51a76d7f1d3d.h5ad # ATAC-Seq\n",
"!mv fecbd715-66f5-48ae-8c39-51a76d7f1d3d.h5ad atac.h5ad"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "c35b434b-71b8-4f6e-bab9-432e675dc720",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import warnings\n",
"warnings.filterwarnings('ignore')\n",
"\n",
"import sofa \n",
"import torch\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"import seaborn as sns\n",
"import matplotlib\n",
"from sklearn.preprocessing import StandardScaler, OneHotEncoder\n",
"from muon import MuData\n",
"from sklearn.manifold import TSNE\n",
"import matplotlib.patches as mpatches\n",
"import scanpy as sc\n",
"import anndata as ad\n",
"from anndata import AnnData\n",
"import muon\n",
"from matplotlib import colors as mp_colors"
]
},
{
"cell_type": "markdown",
"id": "0b8c6411-bf16-4db0-a582-2a730f42c50f",
"metadata": {},
"source": [
"# Analysis of a single-cell multiome data set\n",
"\n",
"## Introduction\n",
"\n",
"In this notebook we will explore how `SOFA` can be used to analyze multi-omics data from the DepMap [[1,2,3,4,5]](#1,#2,#3,#4,#5). \n",
"Here we give a brief introduction what the SOFA model does and what it can be used for. For a more \n",
"detailed description please refer to our preprint: https://doi.org/10.1101/2024.10.10.617527 \n",
"\n",
"\n",
"### The SOFA model\n",
"Given a set of real-valued data\n",
"matrices containing multi-omic measurements from overlapping samples (also called views),\n",
"along with sample-level guiding variables that capture additional properties such as batches\n",
"or mutational profiles, SOFA extracts an interpretable lower-dimensional data representation,\n",
"consisting of a shared factor matrix and modality-specific loading matrices. The goal of these \n",
"factors is to explain the major axes of variation in the data. SOFA explicitly assigns a subset of factors \n",
"to explain both the multi-omics data and the guiding\n",
"variables (guided factors), while preserving another subset of factors exclusively\n",
"for explaining the multi-omics data (unguided factors). Importantly, this feature allows the\n",
"analyst to discern variation that is driven by known sources from novel, unexplained sources\n",
"of variability.\n",
"\n",
"#### Interpretation of the factors (Z)\n",
"Analogous to the interpretation of factors in PCA, SOFA factors ordinate samples along a\n",
"zero-centered axis, where samples with opposing signs exhibit contrasting phenotypes along\n",
"the inferred axis of variation, and the absolute value of the factor indicates the strength of the\n",
"phenotype. Importantly, SOFA partitions the factors of the low-rank decomposition into\n",
"guided and unguided factors: the guided factors are linked to specific guiding variables,\n",
"while the unguided factors capture global, yet unexplained, sources of variability in the data. \n",
"The factor values can be used in downstream analysis tasks related to the samples, such as clustering \n",
"or survival analysis. The factor values are called Z in SOFA.\n",
"\n",
"#### Interpretation of the loading weights (W)\n",
"SOFA’s loading weights indicate the importance of each feature for its respective factor,\n",
"thereby enabling the interpretation of SOFA factors. Loading weights close to zero indicate\n",
"that a feature has little to no importance for the respective factor, while large magnitudes\n",
"suggest strong relevance. The sign of the loading weight aligns with its corresponding factor,\n",
"meaning that positive loading weights indicate higher feature levels in samples with positive\n",
"factor values, and negative loading weights indicate higher feature levels in samples with\n",
"negative factor values. The top loading weights can be simply inspected or used in downstream analysis such as gene set \n",
"enrichment analysis. The factor values are called W in SOFA.\n",
"\n",
"#### Supported data\n",
"SOFA expects a set of matrices containing omics measurements with matching and aligned samples and different features. \n",
"Currently SOFA only supports Gaussian likelihoods, for the multi-omics data. \n",
"Data should therefore be appropriately normalized according to\n",
"its omics modality. Additionally, data should be centered and scaled.\n",
"\n",
"\n",
"For the guiding variables SOFA supports Gaussian, Bernoulli and Categorical likelihoods. Guiding variables\n",
"can therefore be continuous, binary or categorical. Guiding variables should be vectors with matching samples with \n",
"the multi-omics data.\n",
"\n",
"In SOFA the multi-omics data is denoted as X and the guiding variables as Y.\n",
"\n",
"\n",
"### Single-cell multiome data set of the human cortex\n",
"The data we analyze in this notebook was generated by [[1]](#1). The authors simultaneously profiled the transcriptome (RNA) and chromatin accessibility (ATAC) of 45549 single cells of the human cerebral cortex at 6 different developmental stages. The authors identified 13 different cell types in the data. \n",
"We will fit a SOFA model with 15 factors and guide the first 13 factors with a different cell type label. The 13 guided factors will explain the molecular differences between the cell types, while the 2 unguided factors are free to explain within cell type variation.\n",
"We will first load the data and do some basic preprocessing, then fit a SOFA model and perform various downstream analyses. \n",
"\n",
"\n",
"\n",
"[1] \n",
"Zhu, K. et al. Multi-omic profiling of the developing human cerebral cortex at the single-cell level. Sci Adv 9, eadg3754 (2023)."
]
},
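{
"cell_type": "markdown",
"id": "sofa-sketch-factor-model",
"metadata": {},
"source": [
"To make the roles of Z and W concrete, the toy sketch below simulates two views from one shared factor matrix. It assumes the usual linear factor-model form, in which each view is approximated by the product of Z and a view-specific W plus Gaussian noise. The sizes, view names and noise level are made up for illustration, and this is not SOFA's inference code.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"rng = np.random.default_rng(0)\n",
"n_samples, n_factors = 100, 3  # toy sizes (hypothetical)\n",
"n_features = {'RNA': 20, 'ATAC': 30}  # toy views (hypothetical)\n",
"\n",
"# shared factor matrix Z: samples x factors\n",
"Z = rng.normal(size=(n_samples, n_factors))\n",
"\n",
"# one loading matrix W per view: factors x features\n",
"W = {view: rng.normal(size=(n_factors, p)) for view, p in n_features.items()}\n",
"\n",
"# each view is modeled as Z @ W plus Gaussian noise\n",
"X = {view: Z @ W[view] + 0.1 * rng.normal(size=(n_samples, p))\n",
"     for view, p in n_features.items()}\n",
"\n",
"# rank-1 contribution of the first factor to the RNA view:\n",
"# a positive loading raises the feature in samples with positive factor values\n",
"contribution = np.outer(Z[:, 0], W['RNA'][0])\n",
"print(contribution.shape)  # (100, 20)\n",
"```\n",
"\n",
"In the model fitted in this notebook, Z has one row per cell and one column per factor, and each W has one row per factor and one column per feature."
]
},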
{
"cell_type": "markdown",
"id": "504b07e1-d3bb-4a86-a942-7b0b72a42ebe",
"metadata": {},
"source": [
"## Load and preprocess"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "bf6a223e-1d72-4ed0-a0b5-eca6297e25d2",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"adata_rna= sc.read_h5ad(\"rna.h5ad\") "
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "22a29adc-f1ad-4477-b344-c8b117871f74",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"AnnData object with n_obs × n_vars = 45549 × 30113\n",
" obs: 'author_cell_type', 'age_group', 'donor_id', 'nCount_RNA', 'nFeature_RNA', 'nCount_ATAC', 'nFeature_ATAC', 'TSS_percentile', 'nucleosome_signal', 'percent_mt', 'assay_ontology_term_id', 'cell_type_ontology_term_id', 'development_stage_ontology_term_id', 'disease_ontology_term_id', 'self_reported_ethnicity_ontology_term_id', 'organism_ontology_term_id', 'sex_ontology_term_id', 'tissue_ontology_term_id', 'suspension_type', 'is_primary_data', 'batch', 'tissue_type', 'cell_type', 'assay', 'disease', 'organism', 'sex', 'tissue', 'self_reported_ethnicity', 'development_stage', 'observation_joinid'\n",
" var: 'feature_is_filtered', 'feature_name', 'feature_reference', 'feature_biotype', 'feature_length'\n",
" uns: 'batch_condition', 'citation', 'schema_reference', 'schema_version', 'title'\n",
" obsm: 'X_joint_wnn_umap', 'X_umap'"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"adata_rna"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "70fac599-541a-45d8-a299-fff308f5882c",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# basic preprocessing\n",
"adata_rna.X = adata_rna.raw.X\n",
"# normalization to total library size\n",
"sc.pp.normalize_total(adata_rna)\n",
"# log transformation\n",
"sc.pp.log1p(adata_rna)\n"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "5db1aed8-811d-4267-a787-ea8670059cdd",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"adata_atac= sc.read_h5ad(\"atac.h5ad\")"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "72cd8d6e-94fe-44ca-bc1d-9b5670d55325",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# select highly variable genes\n",
"sc.pp.highly_variable_genes(\n",
" adata_rna,\n",
" n_top_genes=2000,\n",
" flavor=\"seurat\",\n",
" subset=True\n",
")\n",
"sc.pp.highly_variable_genes(\n",
" adata_atac,\n",
" n_top_genes=2000,\n",
" flavor=\"seurat\",\n",
" subset=True\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "3122a30f-9435-415f-adcd-9e211f80bfa6",
"metadata": {},
"outputs": [],
"source": [
"# scale the data\n",
"sc.pp.scale(adata_rna)\n",
"sc.pp.scale(adata_atac)"
]
},
{
"cell_type": "markdown",
"id": "5732b26c-cd47-44a4-977a-855e2605f602",
"metadata": {
"tags": []
},
"source": [
"### Set up Xmdata for SOFA\n"
]
},
{
"cell_type": "markdown",
"id": "2bfedf64-61ff-464d-9d96-422426af0458",
"metadata": {
"tags": []
},
"source": [
"#### Manually\n",
"\n",
"SOFA requires the following slots in uns:\n",
"* llh: \"gaussian\" \n",
"Currently only the Gaussian likelihood for the multi-omics data is supported.\n",
"\n",
"and in obsm:\n",
"* mask: boolean vector of length number of samples that masks samples with missing values"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "95776c24-65c4-4edc-9a51-8a1b4afdf4ec",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"metadata = adata_rna.obs\n",
"adata_rna.uns[\"llh\"] = \"gaussian\"\n",
"adata_rna.X = adata_rna.X\n",
"adata_rna.obsm[\"mask\"] = ~np.any(np.isnan(adata_rna.X), axis=1)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "fa175d27-541b-4e74-ba57-e757e275af7c",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"adata_atac.uns[\"llh\"] = \"gaussian\"\n",
"adata_atac.X = adata_atac.X\n",
"adata_atac.obsm[\"mask\"] = ~np.any(np.isnan(adata_atac.X), axis=1)\n",
"\n",
"adata_rna.var_names = adata_rna.var[\"feature_name\"]\n",
"adata_atac.var_names = adata_atac.var[\"feature_name\"]"
]
},
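{
"cell_type": "markdown",
"id": "sofa-sketch-sample-mask",
"metadata": {},
"source": [
"As a small illustration of what the mask expression above computes, the sketch below applies it to a made-up 3 x 3 array. It assumes the data matrix is a dense NumPy array, as it is here after `sc.pp.scale`; the toy values are hypothetical.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"# toy data: rows are samples, columns are features (hypothetical values)\n",
"X_toy = np.array([[0.5, np.nan, 1.0],\n",
"                  [0.1, 0.2, 0.3],\n",
"                  [np.nan, np.nan, 0.0]])\n",
"\n",
"# True where a sample has no missing value in any feature\n",
"mask = ~np.any(np.isnan(X_toy), axis=1)\n",
"print(mask.tolist())  # [False, True, False]\n",
"```\n",
"\n",
"Only the second sample is complete, so it is the only one marked `True`."
]
},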
{
"cell_type": "markdown",
"id": "c3b33445-5dc5-4eaf-ae86-085aa54a30e4",
"metadata": {
"tags": []
},
"source": [
"#### Using SOFA's sofa.tl.get_ad()\n",
"Alternatively we can use SOFA's inbuilt sofa.tl.get_ad() function to create the appropriate `AnnData` object"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "9403e080-3bce-4bf5-825e-965a77060c3f",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# First we convert the `AnnData` objects to dataframes\n",
"rna_df = adata_rna.to_df()\n",
"atac_df = adata_atac.to_df()"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "79760024-0528-40ab-8051-333b4d15951d",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Then use sofa.tl.get_ad() with default parameters\n",
"adata_atac = sofa.tl.get_ad(rna_df)\n",
"adata_rna = sofa.tl.get_ad(atac_df)"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "05364be1-9e03-436a-ab4c-8c3c7f51221b",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"
MuData object with n_obs × n_vars = 45549 × 4000\n", " 2 modalities\n", " RNA:\t45549 x 2000\n", " uns:\t'llh', 'scaling_factor'\n", " obsm:\t'mask'\n", " ATAC:\t45549 x 2000\n", " uns:\t'llh', 'scaling_factor'\n", " obsm:\t'mask'" ], "text/plain": [ "MuData object with n_obs × n_vars = 45549 × 4000\n", " 2 modalities\n", " RNA:\t45549 x 2000\n", " uns:\t'llh', 'scaling_factor'\n", " obsm:\t'mask'\n", " ATAC:\t45549 x 2000\n", " uns:\t'llh', 'scaling_factor'\n", " obsm:\t'mask'" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# wrap the individual `AnnData` objects in a `MuData`\n", "Xmdata = MuData({\"RNA\":adata_rna, \"ATAC\":adata_atac})\n", "Xmdata" ] }, { "cell_type": "markdown", "id": "c2abe170-5cf0-48c2-a349-d7adfdd306bd", "metadata": { "tags": [] }, "source": [ "### Set up Ymdata for SOFA\n", "\n", "As described in the introduction, we would like to guide the first 13 factors with the 13 cell type labels.\n", "To this end we first need to one hot encode the cell type labels. We make use of the OneHotEncoder of `scikit-learn`:" ] }, { "cell_type": "code", "execution_count": 15, "id": "4a9d7dbc-6e19-41c6-b471-1a0c28871675", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", " | astrocyte | \n", "caudal ganglionic eminence derived interneuron | \n", "endothelial cell | \n", "glutamatergic neuron | \n", "inhibitory interneuron | \n", "medial ganglionic eminence derived interneuron | \n", "microglial cell | \n", "neural progenitor cell | \n", "oligodendrocyte | \n", "oligodendrocyte precursor cell | \n", "pericyte | \n", "radial glial cell | \n", "vascular associated smooth muscle cell | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "1.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "
1 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "1.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "
2 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "1.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "
3 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "1.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "
4 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "1.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "
... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "
45544 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "1.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "
45545 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "1.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "
45546 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "1.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "
45547 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "1.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "
45548 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "1.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "
45549 rows × 13 columns
\n", "MuData object with n_obs × n_vars = 45549 × 13\n", " 13 modalities\n", " astrocyte:\t45549 x 1\n", " uns:\t'llh', 'scaling_factor'\n", " obsm:\t'mask'\n", " caudal ganglionic eminence derived interneuron:\t45549 x 1\n", " uns:\t'llh', 'scaling_factor'\n", " obsm:\t'mask'\n", " endothelial cell:\t45549 x 1\n", " uns:\t'llh', 'scaling_factor'\n", " obsm:\t'mask'\n", " glutamatergic neuron:\t45549 x 1\n", " uns:\t'llh', 'scaling_factor'\n", " obsm:\t'mask'\n", " inhibitory interneuron:\t45549 x 1\n", " uns:\t'llh', 'scaling_factor'\n", " obsm:\t'mask'\n", " medial ganglionic eminence derived interneuron:\t45549 x 1\n", " uns:\t'llh', 'scaling_factor'\n", " obsm:\t'mask'\n", " microglial cell:\t45549 x 1\n", " uns:\t'llh', 'scaling_factor'\n", " obsm:\t'mask'\n", " neural progenitor cell:\t45549 x 1\n", " uns:\t'llh', 'scaling_factor'\n", " obsm:\t'mask'\n", " oligodendrocyte:\t45549 x 1\n", " uns:\t'llh', 'scaling_factor'\n", " obsm:\t'mask'\n", " oligodendrocyte precursor cell:\t45549 x 1\n", " uns:\t'llh', 'scaling_factor'\n", " obsm:\t'mask'\n", " pericyte:\t45549 x 1\n", " uns:\t'llh', 'scaling_factor'\n", " obsm:\t'mask'\n", " radial glial cell:\t45549 x 1\n", " uns:\t'llh', 'scaling_factor'\n", " obsm:\t'mask'\n", " vascular associated smooth muscle cell:\t45549 x 1\n", " uns:\t'llh', 'scaling_factor'\n", " obsm:\t'mask'" ], "text/plain": [ "MuData object with n_obs × n_vars = 45549 × 13\n", " 13 modalities\n", " astrocyte:\t45549 x 1\n", " uns:\t'llh', 'scaling_factor'\n", " obsm:\t'mask'\n", " caudal ganglionic eminence derived interneuron:\t45549 x 1\n", " uns:\t'llh', 'scaling_factor'\n", " obsm:\t'mask'\n", " endothelial cell:\t45549 x 1\n", " uns:\t'llh', 'scaling_factor'\n", " obsm:\t'mask'\n", " glutamatergic neuron:\t45549 x 1\n", " uns:\t'llh', 'scaling_factor'\n", " obsm:\t'mask'\n", " inhibitory interneuron:\t45549 x 1\n", " uns:\t'llh', 'scaling_factor'\n", " obsm:\t'mask'\n", " medial ganglionic eminence derived interneuron:\t45549 x 1\n", " 
uns:\t'llh', 'scaling_factor'\n", " obsm:\t'mask'\n", " microglial cell:\t45549 x 1\n", " uns:\t'llh', 'scaling_factor'\n", " obsm:\t'mask'\n", " neural progenitor cell:\t45549 x 1\n", " uns:\t'llh', 'scaling_factor'\n", " obsm:\t'mask'\n", " oligodendrocyte:\t45549 x 1\n", " uns:\t'llh', 'scaling_factor'\n", " obsm:\t'mask'\n", " oligodendrocyte precursor cell:\t45549 x 1\n", " uns:\t'llh', 'scaling_factor'\n", " obsm:\t'mask'\n", " pericyte:\t45549 x 1\n", " uns:\t'llh', 'scaling_factor'\n", " obsm:\t'mask'\n", " radial glial cell:\t45549 x 1\n", " uns:\t'llh', 'scaling_factor'\n", " obsm:\t'mask'\n", " vascular associated smooth muscle cell:\t45549 x 1\n", " uns:\t'llh', 'scaling_factor'\n", " obsm:\t'mask'" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# wrap dictionary as `MuData`\n", "Ymdata = MuData(cell_type_dict)\n", "Ymdata" ] }, { "cell_type": "code", "execution_count": 18, "id": "dfa9a4e3-0dc1-4320-8c34-57f5796a9a8b", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "tensor([[1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n", " [0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n", " [0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n", " [0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n", " [0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n", " [0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n", " [0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],\n", " [0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],\n", " [0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0.],\n", " [0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],\n", " [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],\n", " [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0.],\n", " [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0.]],\n", " dtype=torch.float64)" ] }, 
"execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "num_factors = 15\n", "# In order to relate factors to guiding variables we need to provide a design matrix (guiding variables x number of factors) \n", "# indicating which factor is guided by which guiding variable.\n", "# Here we just indicate that the first 6 factors are each guided by a different guiding variable:\n", "design = np.zeros((len(Ymdata.mod), num_factors))\n", "for i in range(len(Ymdata.mod)):\n", " design[i,i] = 1\n", "# convert to torch tensor to make it usable by SOFA\n", "design = torch.tensor(design)\n", "design" ] }, { "cell_type": "markdown", "id": "147125d8-fde7-42ea-ad8b-3c76940cf1c6", "metadata": {}, "source": [ "## Fit the `SOFA` model" ] }, { "cell_type": "code", "execution_count": 93, "id": "1f3cd265-7f88-49c1-a3b1-027f52d30d71", "metadata": { "tags": [] }, "outputs": [], "source": [ "model = sofa.SOFA(Xmdata = Xmdata, # the input multi-omics data \n", " num_factors=num_factors, # number of factors to infer\n", " Ymdata = Ymdata, # the input guiding variables\n", " design = design, # design matrix relating factors to guiding variables\n", " device='cuda', # set device to \"cuda\" to enable computation on the GPU, if you don't have a GPU available set it to \"cpu\"\n", " subsample=2048, # for single-cell data it can be beneficial to subsample minibatches (here of size 2048) of the data for training, this speeds up the fitting process \n", " seed=42) # set seed to get the same results every time we run it" ] }, { "cell_type": "code", "execution_count": 94, "id": "4472ee3d-6993-4ade-81e0-811a8d939ec8", "metadata": { "tags": [] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Current Elbo 2.50E+08 | Delta: -1523794: 100%|██████████| 6000/6000 [14:14<00:00, 7.02it/s] \n" ] } ], "source": [ "model.fit(n_steps=6000, lr=0.01, predict = True)" ] }, { "cell_type": "code", "execution_count": 2, "id": 
"59b23b32-dc8c-4e60-bca4-4d14af700a78", "metadata": { "tags": [] }, "outputs": [], "source": [ "# if we would like to save the fitted model we can save it using:\n", "#sofa.tl.save_model(model,\"brain_example_model\")\n", "\n", "# to load the model use:\n", "#model = sofa.tl.load_model(\"brain_example_model\")" ] }, { "cell_type": "markdown", "id": "6c107ac0-1580-49e3-8f45-ccabe1174963", "metadata": {}, "source": [ "## Downstream analysis\n", "\n", "\n", "### Convergence\n", "\n", "We will first assess whether the ELBO loss of SOFA has converged by plotting it over training steps" ] }, { "cell_type": "code", "execution_count": 19, "id": "a4c953f8-6446-4cc5-8454-b7377ae8f8b9", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "Text(0, 0.5, 'ELBO')" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "", "text/plain": [ "
\n", " | Factor_1 (astrocyte) | \n", "Factor_2 (caudal ganglionic eminence derived interneuron) | \n", "Factor_3 (endothelial cell) | \n", "Factor_4 (glutamatergic neuron) | \n", "Factor_5 (inhibitory interneuron) | \n", "Factor_6 (medial ganglionic eminence derived interneuron) | \n", "Factor_7 (microglial cell) | \n", "Factor_8 (neural progenitor cell) | \n", "Factor_9 (oligodendrocyte) | \n", "Factor_10 (oligodendrocyte precursor cell) | \n", "Factor_11 (pericyte) | \n", "Factor_12 (radial glial cell) | \n", "Factor_13 (vascular associated smooth muscle cell) | \n", "Factor_14 | \n", "Factor_15 | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
4_AAACAGCCAACACTTG-1 | \n", "0.611896 | \n", "-0.506728 | \n", "0.283791 | \n", "1.593037 | \n", "0.168398 | \n", "0.399242 | \n", "0.422193 | \n", "0.024536 | \n", "-0.414130 | \n", "-0.239863 | \n", "-0.121763 | \n", "0.381870 | \n", "0.133415 | \n", "-0.268314 | \n", "-0.236205 | \n", "
4_AAACAGCCACCAAAGG-1 | \n", "0.369894 | \n", "-0.695359 | \n", "0.196381 | \n", "1.779255 | \n", "0.204215 | \n", "0.662641 | \n", "0.385849 | \n", "-0.214301 | \n", "-0.265921 | \n", "-0.233039 | \n", "0.073905 | \n", "0.207890 | \n", "-0.089338 | \n", "-0.538529 | \n", "-0.470559 | \n", "
4_AAACAGCCATAAGTTC-1 | \n", "0.525656 | \n", "-0.576934 | \n", "0.151626 | \n", "2.239841 | \n", "0.164921 | \n", "0.566175 | \n", "0.381240 | \n", "-0.492999 | \n", "-0.334595 | \n", "-0.270106 | \n", "-0.003286 | \n", "0.406507 | \n", "-0.017237 | \n", "0.213183 | \n", "-0.319585 | \n", "
4_AAACATGCATAGTCAT-1 | \n", "0.484234 | \n", "-0.433456 | \n", "0.094686 | \n", "1.731916 | \n", "0.098722 | \n", "0.500609 | \n", "0.417679 | \n", "-0.256506 | \n", "-0.528553 | \n", "-0.414691 | \n", "-0.151233 | \n", "0.122768 | \n", "-0.096484 | \n", "-0.026044 | \n", "-0.005257 | \n", "
4_AAACATGCATTGTCAG-1 | \n", "0.473560 | \n", "-0.469949 | \n", "0.194220 | \n", "2.000594 | \n", "0.305656 | \n", "0.425527 | \n", "0.379652 | \n", "-0.437305 | \n", "-0.143152 | \n", "-0.080377 | \n", "0.052171 | \n", "0.357325 | \n", "-0.041771 | \n", "-0.080069 | \n", "-0.247479 | \n", "
... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "
150666_TTTGTGAAGACAACAG-1 | \n", "0.338253 | \n", "0.274610 | \n", "0.021684 | \n", "-0.642276 | \n", "0.297069 | \n", "-0.107472 | \n", "0.752773 | \n", "-0.331871 | \n", "1.272615 | \n", "-0.831026 | \n", "-1.536213 | \n", "0.170395 | \n", "1.086244 | \n", "0.420269 | \n", "1.975964 | \n", "
150666_TTTGTGAAGGCTGTGC-1 | \n", "0.088728 | \n", "0.120536 | \n", "0.196474 | \n", "-0.235622 | \n", "0.401305 | \n", "0.021225 | \n", "0.439250 | \n", "-0.194579 | \n", "-1.017107 | \n", "1.676835 | \n", "-1.030105 | \n", "0.552844 | \n", "0.437842 | \n", "0.196519 | \n", "1.003077 | \n", "
150666_TTTGTGAAGTAAGAAC-1 | \n", "0.254422 | \n", "-0.270921 | \n", "0.105018 | \n", "-0.476804 | \n", "0.007705 | \n", "0.118286 | \n", "0.191205 | \n", "-0.072784 | \n", "1.845788 | \n", "-0.173360 | \n", "-0.375505 | \n", "0.115170 | \n", "0.340696 | \n", "-0.138671 | \n", "0.186792 | \n", "
150666_TTTGTGAAGTCTTGAA-1 | \n", "0.707852 | \n", "-0.214515 | \n", "0.089128 | \n", "-0.605852 | \n", "0.340981 | \n", "0.148991 | \n", "0.792436 | \n", "-0.148444 | \n", "2.024130 | \n", "-0.886393 | \n", "-1.443925 | \n", "0.235342 | \n", "0.654023 | \n", "0.699232 | \n", "1.188664 | \n", "
150666_TTTGTTGGTGATCAGC-1 | \n", "0.890338 | \n", "-0.290386 | \n", "0.247877 | \n", "-0.622976 | \n", "0.332508 | \n", "0.064684 | \n", "1.110440 | \n", "-0.020911 | \n", "2.552641 | \n", "-1.316348 | \n", "-1.953289 | \n", "0.298762 | \n", "1.031413 | \n", "1.323803 | \n", "1.993683 | \n", "
45549 rows × 15 columns
\n", "feature_name | \n", "SHOX | \n", "CSF2RA | \n", "P2RY8 | \n", "CD99 | \n", "XG | \n", "GYG2 | \n", "ARSF | \n", "MXRA5 | \n", "PRKX | \n", "STS | \n", "... | \n", "MX2 | \n", "TMPRSS2 | \n", "TMPRSS3 | \n", "UBASH3A | \n", "TRPM2 | \n", "TSPEAR | \n", "KRTAP12-3 | \n", "ITGB2 | \n", "COL18A1 | \n", "COL6A2 | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Factor_1 (astrocyte) | \n", "-0.018227 | \n", "0.016628 | \n", "0.008709 | \n", "-0.085218 | \n", "-0.032678 | \n", "-0.051249 | \n", "-0.128016 | \n", "-0.017553 | \n", "0.020427 | \n", "0.006717 | \n", "... | \n", "-0.013714 | \n", "-0.118789 | \n", "-0.051786 | \n", "-0.068351 | \n", "-0.008388 | \n", "-0.070681 | \n", "-0.040730 | \n", "-0.034352 | \n", "-0.009784 | \n", "-0.041832 | \n", "
Factor_2 (caudal ganglionic eminence derived interneuron) | \n", "0.061711 | \n", "0.025978 | \n", "0.041309 | \n", "0.008997 | \n", "0.103602 | \n", "0.066791 | \n", "0.096115 | \n", "0.026817 | \n", "0.018299 | \n", "0.073380 | \n", "... | \n", "0.063504 | \n", "0.054544 | \n", "0.064555 | \n", "0.087258 | \n", "0.127651 | \n", "0.140096 | \n", "0.034622 | \n", "0.046414 | \n", "0.055640 | \n", "0.089154 | \n", "
Factor_3 (endothelial cell) | \n", "0.002676 | \n", "0.046630 | \n", "-0.063564 | \n", "0.013813 | \n", "-0.022183 | \n", "0.019438 | \n", "0.019055 | \n", "-0.016783 | \n", "-0.012183 | \n", "-0.014433 | \n", "... | \n", "-0.028807 | \n", "0.011751 | \n", "-0.020785 | \n", "-0.045028 | \n", "0.020811 | \n", "-0.038803 | \n", "-0.009141 | \n", "0.003219 | \n", "-0.052285 | \n", "-0.009642 | \n", "
Factor_4 (glutamatergic neuron) | \n", "0.044140 | \n", "-0.032026 | \n", "-0.020472 | \n", "-0.004024 | \n", "0.047451 | \n", "0.072089 | \n", "0.020697 | \n", "0.033083 | \n", "-0.000998 | \n", "0.157253 | \n", "... | \n", "0.066016 | \n", "0.010713 | \n", "0.008618 | \n", "0.068955 | \n", "0.070585 | \n", "0.160543 | \n", "0.067635 | \n", "-0.021258 | \n", "-0.048109 | \n", "0.078915 | \n", "
Factor_5 (inhibitory interneuron) | \n", "0.021449 | \n", "0.076019 | \n", "0.071614 | \n", "0.111477 | \n", "0.054973 | \n", "0.011741 | \n", "-0.012790 | \n", "-0.002646 | \n", "-0.000420 | \n", "0.029649 | \n", "... | \n", "0.103795 | \n", "0.108892 | \n", "0.043622 | \n", "0.085356 | \n", "0.144796 | \n", "0.155512 | \n", "0.069746 | \n", "0.102761 | \n", "0.190089 | \n", "0.099793 | \n", "
Factor_6 (medial ganglionic eminence derived interneuron) | \n", "-0.002214 | \n", "0.011589 | \n", "0.009022 | \n", "0.005203 | \n", "-0.001285 | \n", "0.022863 | \n", "0.043369 | \n", "-0.009789 | \n", "-0.055886 | \n", "-0.051018 | \n", "... | \n", "-0.000926 | \n", "0.001854 | \n", "0.013125 | \n", "-0.046666 | \n", "-0.175756 | \n", "-0.010585 | \n", "0.021193 | \n", "0.029928 | \n", "0.040288 | \n", "-0.004638 | \n", "
Factor_7 (microglial cell) | 0.033687 | -0.371200 | -0.201232 | -0.180759 | 0.048073 | 0.049734 | 0.059747 | 0.044436 | -0.062388 | 0.013768 | ... | -0.146758 | 0.037564 | 0.033068 | 0.012605 | -0.099360 | 0.029247 | 0.020464 | -0.306571 | -0.021651 | 0.055954 |
Factor_8 (neural progenitor cell) | -0.040064 | 0.018456 | -0.022026 | 0.028713 | 0.042819 | -0.024957 | -0.036244 | -0.004213 | -0.068642 | -0.052252 | ... | 0.021161 | 0.001159 | 0.014907 | 0.055635 | 0.011421 | 0.031352 | 0.032900 | 0.015596 | -0.012561 | 0.020977 |
Factor_9 (oligodendrocyte) | -0.045553 | -0.086439 | -0.095924 | -0.091932 | -0.071380 | -0.057834 | -0.062374 | -0.035606 | -0.074925 | -0.039691 | ... | -0.082786 | -0.074244 | -0.033555 | -0.044907 | -0.114517 | -0.094449 | -0.031309 | -0.104703 | -0.005420 | -0.073570 |
Factor_10 (oligodendrocyte precursor cell) | 0.013328 | -0.076632 | -0.095766 | -0.061523 | 0.004489 | -0.009986 | -0.035331 | -0.014284 | -0.010431 | -0.086027 | ... | -0.005166 | 0.103091 | 0.000971 | -0.002350 | -0.054968 | -0.038637 | -0.016010 | -0.023825 | 0.012225 | -0.018629 |
Factor_11 (pericyte) | -0.068400 | -0.117113 | -0.107469 | -0.063415 | -0.119831 | -0.101984 | -0.116682 | -0.038292 | -0.105895 | -0.142456 | ... | -0.074856 | -0.127596 | -0.068352 | -0.096209 | -0.098300 | -0.039917 | 0.005252 | -0.071315 | 0.125981 | 0.006964 |
Factor_12 (radial glial cell) | 0.013457 | 0.035862 | 0.016619 | -0.008517 | -0.005948 | -0.048667 | -0.013628 | -0.006250 | 0.036617 | 0.091800 | ... | 0.037933 | 0.001341 | 0.018385 | 0.011399 | 0.078934 | 0.081725 | 0.034904 | 0.064347 | 0.076669 | 0.018390 |
Factor_13 (vascular associated smooth muscle cell) | -0.020864 | -0.006278 | 0.057538 | 0.021366 | 0.016715 | -0.011897 | 0.036587 | -0.026533 | 0.022503 | 0.039456 | ... | 0.015130 | -0.010799 | 0.014685 | 0.013097 | -0.001941 | 0.011595 | 0.008483 | 0.012239 | -0.133319 | -0.082095 |
Factor_14 | -0.030205 | -0.065338 | -0.098833 | -0.098750 | -0.020684 | 0.015680 | 0.004504 | 0.002585 | -0.049090 | 0.002668 | ... | -0.099198 | -0.061864 | -0.059691 | -0.070308 | -0.167516 | -0.050790 | -0.045526 | -0.186547 | 0.030531 | -0.045366 |
Factor_15 | 0.102394 | 0.111089 | 0.161223 | 0.145873 | 0.143792 | 0.096924 | 0.109016 | 0.141172 | 0.169392 | 0.152441 | ... | 0.147098 | 0.099834 | 0.066984 | 0.102654 | 0.077644 | 0.146752 | 0.049644 | 0.127989 | 0.083689 | 0.177638 |

15 rows × 2000 columns

feature_name | HES5 | PRDM16 | LINC01134 | SLC2A5 | PIK3CD | TNFRSF1B | AADACL4 | SLC25A34-AS1 | PADI2 | PADI1 | ... | LINC00279 | MT-ND1 | MT-ND2 | MT-CO1 | MT-CO2 | MT-ATP6 | MT-ND3 | MT-ND4L | MT-ND4 | MT-ND5 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Factor_1 (astrocyte) | -0.208182 | -0.429047 | -0.000131 | 0.045786 | 0.080833 | 0.054359 | -0.012543 | -0.025898 | -0.091752 | 0.003296 | ... | 0.036371 | -0.049887 | -0.039553 | 0.012043 | -0.031498 | -0.078227 | -0.136771 | -0.020041 | -0.042751 | -0.024138 |
Factor_2 (caudal ganglionic eminence derived interneuron) | -0.036767 | -0.069018 | 0.001598 | -0.011377 | -0.025387 | -0.038779 | 0.004101 | 0.004330 | -0.035379 | 0.003445 | ... | 0.001888 | 0.138144 | 0.123199 | 0.152745 | 0.134151 | 0.095955 | 0.093427 | 0.094417 | 0.135157 | 0.092551 |
Factor_3 (endothelial cell) | 0.026845 | 0.086176 | -0.006730 | 0.032492 | 0.092398 | -0.062162 | 0.009899 | 0.019996 | 0.012182 | 0.006935 | ... | 0.010711 | 0.007029 | -0.003770 | 0.000740 | 0.000596 | -0.013372 | -0.011830 | 0.009384 | -0.013850 | -0.012648 |
Factor_4 (glutamatergic neuron) | -0.119810 | -0.110515 | -0.008428 | -0.065243 | -0.076141 | -0.082045 | 0.001842 | -0.012405 | -0.123653 | -0.004981 | ... | -0.027927 | -0.064613 | -0.099515 | -0.053974 | -0.108447 | -0.121445 | -0.160079 | -0.058071 | -0.102936 | -0.070332 |
Factor_5 (inhibitory interneuron) | 0.074077 | 0.080980 | -0.000610 | 0.054926 | 0.059176 | 0.070569 | -0.001754 | -0.002609 | 0.111904 | 0.008717 | ... | 0.032175 | 0.144616 | 0.140038 | 0.144284 | 0.186204 | 0.150951 | 0.152748 | 0.052082 | 0.145854 | 0.084795 |
Factor_6 (medial ganglionic eminence derived interneuron) | 0.061534 | 0.110697 | -0.012850 | 0.014597 | -0.108322 | 0.006335 | -0.001688 | 0.008390 | 0.036388 | -0.003137 | ... | 0.001910 | -0.249092 | -0.228440 | -0.332192 | -0.354922 | -0.283303 | -0.202077 | -0.081255 | -0.252142 | -0.182548 |
Factor_7 (microglial cell) | 0.073273 | 0.076353 | -0.015753 | -0.372691 | -0.235887 | -0.319122 | -0.010101 | -0.001108 | -0.065391 | -0.022516 | ... | 0.011044 | 0.019545 | 0.015750 | 0.001131 | 0.000565 | -0.025114 | -0.020271 | 0.019034 | 0.008098 | 0.033560 |
Factor_8 (neural progenitor cell) | 0.094023 | -0.073044 | -0.020015 | -0.010783 | 0.006819 | -0.035352 | -0.022101 | 0.004570 | 0.003941 | 0.013296 | ... | 0.004650 | 0.018530 | 0.006067 | 0.097694 | 0.021061 | -0.002756 | -0.011136 | -0.008039 | 0.041842 | 0.008941 |
Factor_9 (oligodendrocyte) | -0.096187 | -0.090426 | 0.002811 | -0.057448 | -0.100601 | -0.064315 | 0.001830 | 0.006622 | 0.256400 | 0.001876 | ... | 0.140956 | -0.023779 | 0.035511 | 0.045417 | 0.096452 | 0.008418 | 0.016682 | -0.010833 | 0.055347 | -0.018010 |
Factor_10 (oligodendrocyte precursor cell) | 0.109106 | -0.139850 | -0.001256 | -0.085660 | -0.103560 | -0.073114 | -0.011707 | -0.001273 | -0.034249 | 0.001216 | ... | -0.023487 | -0.091286 | -0.058322 | -0.080787 | -0.033584 | -0.034352 | -0.039314 | -0.031082 | -0.040277 | -0.050600 |
Factor_11 (pericyte) | -0.096806 | -0.082905 | -0.013746 | -0.031378 | 0.182597 | 0.024180 | 0.004992 | 0.009840 | -0.045573 | 0.001019 | ... | -0.022335 | 0.069674 | 0.102980 | 0.051257 | 0.088180 | 0.073444 | 0.077907 | 0.039346 | 0.044237 | 0.039888 |
Factor_12 (radial glial cell) | -0.305815 | -0.370938 | 0.014988 | 0.027542 | 0.026361 | 0.024751 | -0.001839 | 0.037604 | 0.079338 | 0.000370 | ... | 0.007548 | 0.139923 | 0.137401 | 0.122713 | 0.170129 | 0.160537 | 0.122313 | 0.048193 | 0.146312 | 0.089201 |
Factor_13 (vascular associated smooth muscle cell) | 0.097635 | 0.021009 | -0.025192 | 0.014730 | 0.082244 | 0.057195 | -0.006985 | -0.018558 | 0.041190 | -0.010776 | ... | 0.011489 | 0.117726 | 0.129374 | 0.112941 | 0.148383 | 0.155056 | 0.177456 | 0.069603 | 0.138858 | 0.053408 |
Factor_14 | 0.064157 | -0.044716 | -0.024030 | 0.007258 | 0.054465 | -0.057635 | 0.013054 | -0.013529 | 0.020315 | 0.006386 | ... | 0.040675 | -0.076650 | -0.030133 | -0.004744 | -0.037376 | -0.099526 | -0.079796 | 0.008935 | -0.054390 | -0.010699 |
Factor_15 | 0.004490 | -0.062551 | 0.004475 | 0.026869 | 0.085769 | 0.083807 | 0.021577 | -0.018813 | -0.007668 | -0.005300 | ... | -0.029235 | 0.010413 | 0.031491 | 0.010445 | 0.000602 | 0.039254 | 0.070796 | -0.004739 | 0.025249 | -0.005580 |

15 rows × 2000 columns