owkin/miso_code

MISO: A deep learning-based multiscale integration of spatial omics with tumor morphology

Preprint: link to be announced.

How to install

This code requires Python 3.9. To install OpenSlide, run:

apt-get update -qq && apt-get install openslide-tools libgeos-dev -y 2>&1

Then install miso in a dedicated conda environment:

conda create --name miso_env python=3.9
conda activate miso_env
pip install -e .

Data collection and preprocessing

Downloading data

Public 10x Genomics Visium samples can be downloaded from the 10x Genomics website (filter by "Spatial Gene Expression"). As examples, we provide tools to process such samples, along with the processed outputs for the following samples:

  • Human Colorectal Cancer, 11 mm Capture Area (FFPE).
  • Human Lung Cancer, 11 mm Capture Area (FFPE).
  • Human Ovarian Cancer, 11 mm Capture Area (FFPE).

For a given sample, you will need to download:

  • The filtered_feature_bc_matrix.h5
  • The spatial folder
  • The H&E image tissue_image.tif (if you are using CytAssist samples, be careful to use the full-resolution image and not the CytAssist view)
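Before preprocessing, it can help to check that a sample folder holds all three inputs. The helper below is a minimal sketch and not part of miso: `check_visium_sample` is a hypothetical name, and it assumes the downloads keep the 10x Genomics naming, with files ending in `filtered_feature_bc_matrix.h5` and `tissue_image.tif` and the spatial archive extracted to a `spatial` directory.

```python
from pathlib import Path


def check_visium_sample(sample_dir):
    """Return the list of expected inputs missing from sample_dir.

    Hypothetical helper: matching on file-name endings is an assumption
    based on the sample names used in this README.
    """
    root = Path(sample_dir)
    missing = []
    # Count matrix, possibly prefixed by the sample name.
    if not list(root.glob("*filtered_feature_bc_matrix.h5")):
        missing.append("filtered_feature_bc_matrix.h5")
    # Extracted spatial folder.
    if not (root / "spatial").is_dir():
        missing.append("spatial/")
    # H&E image, possibly prefixed by the sample name.
    if not list(root.glob("*tissue_image.tif")):
        missing.append("tissue_image.tif")
    return missing
```

An empty list means all three inputs were found. The check only looks at names, so it cannot tell the full-resolution image from the CytAssist view.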

The HER2ST dataset (Andersson et al.) can be downloaded by following the instructions given in its repository.

Preprocessing of Visium data

  1. Re-write the H&E image in a pyramidal format compatible with the openslide library using the script miso/data/processing/rewrite_slide.py.

For instance, after downloading the Human Colorectal Cancer sample to PATH_TO_RAW_DATA, you can run:

python miso/data/processing/rewrite_slide.py --path_visium PATH_TO_RAW_DATA/CytAssist_11mm_FFPE_Human_Colorectal_Cancer/spatial --path_slide PATH_TO_RAW_DATA/CytAssist_11mm_FFPE_Human_Colorectal_Cancer/CytAssist_11mm_FFPE_Human_Colorectal_Cancer_tissue_image.tif --path_output_folder PATH_TO_PROCESSED_DATA/CytAssist_11mm_FFPE_Human_Colorectal_Cancer

This will save the new file CytAssist_11mm_FFPE_Human_Colorectal_Cancer_tissue_image_pyr.tif in PATH_TO_PROCESSED_DATA/CytAssist_11mm_FFPE_Human_Colorectal_Cancer.
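Because the same sample name appears several times in these paths, the command can also be assembled programmatically. The sketch below is illustrative only: `rewrite_slide_command` is a hypothetical helper, and the `_pyr` output name is inferred from the example above rather than read from the script.

```python
from pathlib import Path


def rewrite_slide_command(raw_dir, processed_dir, sample):
    """Build the rewrite_slide.py call and the pyramidal file it should create.

    Hypothetical helper; the "_pyr" output name follows the example above.
    """
    raw = Path(raw_dir) / sample
    out = Path(processed_dir) / sample
    slide = raw / f"{sample}_tissue_image.tif"
    command = [
        "python", "miso/data/processing/rewrite_slide.py",
        "--path_visium", str(raw / "spatial"),
        "--path_slide", str(slide),
        "--path_output_folder", str(out),
    ]
    # Expected output: same stem with "_pyr" appended, in the output folder.
    pyramidal = out / f"{slide.stem}_pyr{slide.suffix}"
    return command, pyramidal


command, pyramidal = rewrite_slide_command(
    "PATH_TO_RAW_DATA", "PATH_TO_PROCESSED_DATA",
    "CytAssist_11mm_FFPE_Human_Colorectal_Cancer",
)
print(" ".join(command))
print(pyramidal)
```

The returned list can be passed to `subprocess.run(command, check=True)`, and `pyramidal` is the path to give as `--path_slide` in the next step.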

  2. Run the pre-processing script miso/scripts/process_data.py:

python miso/scripts/process_data.py --path_visium PATH_TO_RAW_DATA/CytAssist_11mm_FFPE_Human_Colorectal_Cancer/ --path_slide PATH_TO_PROCESSED_DATA/CytAssist_11mm_FFPE_Human_Colorectal_Cancer/CytAssist_11mm_FFPE_Human_Colorectal_Cancer_tissue_image_pyr.tif --path_output_folder PATH_TO_PROCESSED_DATA/CytAssist_11mm_FFPE_Human_Colorectal_Cancer --level 1 --knn 37

This script will:

  • Select 224 x 224 pixel tiles centered on each spot that passed Space Ranger's QC.
  • Extract features from each tile with a pre-trained ViT-16 feature extractor, both at the tile level and for each 16 x 16 pixel patch. By default, the phikon model available on Hugging Face is used.
  • Compute a list of neighbors for each tile.
  • Rewrite the counts into NumPy files.
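The geometry behind these steps can be sketched in plain Python. This is an illustration, not the code in miso/scripts/process_data.py: the function names are hypothetical, and the neighbor search assumes Euclidean distance between spot centers, which this README does not state. The `--knn` argument presumably sets the number of neighbors.

```python
import math

TILE_SIZE = 224   # tile side in pixels, from the list above
PATCH_SIZE = 16   # patch side in pixels, from the list above


def tile_box(cx, cy, size=TILE_SIZE):
    """Return (left, top, right, bottom) of a size x size tile centered on (cx, cy)."""
    half = size // 2
    return (cx - half, cy - half, cx + half, cy + half)


def patch_boxes(box, patch=PATCH_SIZE):
    """Split a tile box into non-overlapping patch boxes, row by row."""
    left, top, right, bottom = box
    return [
        (x, y, x + patch, y + patch)
        for y in range(top, bottom, patch)
        for x in range(left, right, patch)
    ]


def nearest_neighbors(centers, k):
    """For each spot, return the indices of its k nearest other spots.

    Brute-force search with Euclidean distance on spot centers (assumed metric).
    """
    neighbors = []
    for i, (xi, yi) in enumerate(centers):
        others = [j for j in range(len(centers)) if j != i]
        others.sort(key=lambda j: math.hypot(centers[j][0] - xi, centers[j][1] - yi))
        neighbors.append(others[:k])
    return neighbors


# Made-up spot centers, in pixels of the slide level being read.
centers = [(1000, 1000), (1100, 1000), (1000, 1300), (1600, 1600)]
box = tile_box(*centers[0])
print(box)                                 # (888, 888, 1112, 1112)
print(len(patch_boxes(box)))               # 196, i.e. a 14 x 14 grid
print(nearest_neighbors(centers, k=2)[0])  # [1, 2]
```

Each 224 x 224 tile therefore yields 14 x 14 = 196 patch-level feature vectors in addition to one tile-level vector.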

Preprocessing of HER2ST data

Once downloaded, the data folder PATH_TO_HER2ST_DATA contains four subfolders: count-matrices, images, meta and spot-selection. To extract tile and subtile features, run:

python miso/scripts/process_her2st.py --path_dataset PATH_TO_HER2ST_DATA

This will create a fifth subfolder processed_data in PATH_TO_HER2ST_DATA.

Training

To train a model, run the script miso/train.py.

Select a config file from the confs folder with the command-line argument --config-name, e.g.

python miso/train.py --config-name train_her2st.yaml

Benchmark

The performance of models trained on HER2ST can be compared to the extensive benchmark carried out by Wang et al. 1, using the source data provided in the paper. The default config train_her2st.yaml uses the same split, saved in miso/assets/splits_benchmark_her2st.pkl.
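The layout of the split file is not documented here, so a cautious way to inspect it is to load it and look only at its top-level structure. The sketch below assumes nothing beyond the file being a standard pickle; `load_splits` and `summarize` are hypothetical helper names.

```python
import pickle


def load_splits(path="miso/assets/splits_benchmark_her2st.pkl"):
    """Load the pickled benchmark splits. Only unpickle files you trust."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def summarize(obj):
    """Describe the top-level structure of an object without assuming its layout."""
    if isinstance(obj, dict):
        return f"dict with keys {list(obj)[:5]}"
    if isinstance(obj, (list, tuple)):
        return f"{type(obj).__name__} of length {len(obj)}"
    return type(obj).__name__
```

From the repository root, `print(summarize(load_splits()))` reports what the file contains. If the pickle stores objects from third-party libraries, those libraries must be installed for loading to succeed.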

Distillation

Once a model is trained, you can use it to generate pseudolabels for distillation with miso/distillation/generate_distillation_labels.py.

For instance, to generate pseudolabels with a model trained with the config miso/confs/train.yaml and save them in the same folders as the raw counts, run:

python miso/distillation/generate_distillation_labels.py --config-name=train.yaml

It is then possible to train a weakly-supervised model for super-resolved prediction of gene expression by launching

python miso/train.py --config-name=distil.yaml
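The whole sequence (train, generate pseudolabels, train the distilled model) can be chained from one script. This is a minimal sketch using only the commands quoted in this README; `distillation_pipeline` and `run_pipeline` are hypothetical names, and the sketch assumes the teacher model is trained with train.yaml as in the example above.

```python
import subprocess


def distillation_pipeline(train_config="train.yaml", distil_config="distil.yaml"):
    """Return the three commands of the train -> pseudolabel -> distil sequence."""
    return [
        ["python", "miso/train.py", f"--config-name={train_config}"],
        ["python", "miso/distillation/generate_distillation_labels.py",
         f"--config-name={train_config}"],
        ["python", "miso/train.py", f"--config-name={distil_config}"],
    ]


def run_pipeline(commands, dry_run=True):
    """Print each command; execute it too when dry_run is False."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True stops the sequence as soon as one step fails.
            subprocess.run(command, check=True)


run_pipeline(distillation_pipeline())
```

With the default `dry_run=True` the script only prints the three commands. Pass `dry_run=False` from the repository root to execute them in order.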

Footnotes

  1. Wang, C., Chan, A. S., Fu, X., Ghazanfar, S., Kim, J., Patrick, E., & Yang, J. Y. (2025). Benchmarking the translational potential of spatial gene expression prediction from histology. Nature Communications, 16(1), 1544.
