TumorImagingBench is a framework for extracting foundation model embeddings from medical images and benchmarking them across radiomics datasets.
Overview
- Unified interface for multiple foundation model extractors.
- Dataset-specific feature extraction pipelines.
- Analysis workflows in notebooks for performance, robustness, and stability.
Repository Structure
    TumorImagingBench/
    ├── src/tumorimagingbench/   # Core package (models, evaluation)
    ├── scripts/                 # Utility scripts
    ├── tutorials/               # Tutorials and guides
    ├── notebooks/               # Analysis notebooks
    ├── data/                    # Datasets (ignored by git)
    ├── dist/                    # Large weights (ignored by git)
    ├── metrics/                 # Evaluation outputs
    └── plots/                   # Figures and plots

Installation
    uv sync
    uv run python -m pip install -e .

Python requirement: >=3.10,<3.12.
Quickstart
List available extractors:
    from tumorimagingbench.models import get_available_extractors

    print(get_available_extractors())

Load a model and initialize weights:
    from tumorimagingbench.models import get_extractor

    Model = get_extractor("VISTA3DExtractor")
    model = Model()
    model.load()

Feature Extraction
Example using the LUNA16 extractor:
    uv run python src/tumorimagingbench/evaluation/luna_feature_extractor.py \
        --output features/luna.pkl \
        --train-csv /path/to/train.csv \
        --val-csv /path/to/val.csv \
        --test-csv /path/to/test.csv

Notes:
- Dataset CSVs should include image_path, coordX, coordY, coordZ (and optional labels).
- Many extractors ship with absolute default paths; override them via flags.
- Feature extraction expects a CUDA-capable GPU.
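Since the extractors read the CSV columns listed in the notes above, it can help to check a file before starting a long GPU run. The following is a minimal sketch, not part of the package: `validate_dataset_csv` is a hypothetical helper, the `label` column name is an assumption (the notes only say labels are optional), and the sample paths and coordinates are made up for illustration.

```python
import csv
import io

# Columns named in the notes above; everything else in this sketch is an assumption.
REQUIRED_COLUMNS = ("image_path", "coordX", "coordY", "coordZ")


def validate_dataset_csv(handle):
    """Parse a dataset CSV and return its rows.

    Raises ValueError if a required column is missing or a coordinate
    cannot be parsed as a number.
    """
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"missing required columns: {missing}")

    rows = []
    # Data starts on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        for axis in ("coordX", "coordY", "coordZ"):
            try:
                row[axis] = float(row[axis])
            except (TypeError, ValueError):
                raise ValueError(f"line {line_no}: {axis}={row[axis]!r} is not numeric")
        rows.append(row)
    return rows


# Illustrative in-memory CSV; a real file would be opened with open(path, newline="").
sample = io.StringIO(
    "image_path,coordX,coordY,coordZ,label\n"
    "/path/to/scan_001.nii.gz,-24.0,192.1,-391.0,1\n"
    "/path/to/scan_002.nii.gz,10.5,-33.2,120.0,0\n"
)
rows = validate_dataset_csv(sample)
print(len(rows), rows[0]["coordX"])
```

The check only covers the header and coordinate parsing; whether each image_path exists, and which coordinate system the extractor expects, still depends on the dataset-specific pipeline.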
Supported Models
- CTClipVitExtractor
- CTFMExtractor
- FMCIBExtractor
- MedImageInsightExtractor
- MerlinExtractor
- ModelsGenExtractor
- PASTAExtractor
- SUPREMExtractor
- VISTA3DExtractor
- VocoExtractor
Supported Datasets
- LUNA16
- DLCS (Duke Lung Cancer Screening)
- NSCLC Radiomics
- NSCLC Radiogenomics
- C4C-KiTS
- Colorectal Liver Metastases
- LNDb
- RIDER (test-retest stability)
Tutorials
- See tutorials/README.md for guided notebooks and dataset/model integration walkthroughs.
Evaluation
- Example modelling workflow: notebooks/modelling/luna_modelling.ipynb (LUNA16 evaluation notebook).
- Loads extracted features from data/features/luna.pkl and evaluates per-model performance.
- Baselines: k-NN probing with AUC and 95% CI; linear probing (logistic regression); few-shot (1/5/10-shot).
- Visual outputs saved to plots/ (e.g., luna_auc.png, luna_knn_overlap.png, luna_few_shot.png, luna_evaluation_protocols.png).
- Ensemble methods: alignment-weighted k-NN and stacking meta-learner; weight and comparison plots (e.g., luna_alignment_weights.png, luna_stacking_weights.png, luna_ensemble_comparison.png, luna_ensemble_vs_individual.png).
- Aggregates results into overall_results.csv.
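To make the k-NN probing baseline with AUC and a 95% CI concrete, here is a rough, dependency-free sketch of the general technique. It is not the notebook's code: the choice of k=5, Euclidean distance, a percentile bootstrap for the interval, and the synthetic 8-dimensional "embeddings" are all assumptions made for illustration, and the function names are hypothetical.

```python
import math
import random


def knn_scores(train_x, train_y, test_x, k=5):
    """Score each test vector as the fraction of positives among its k nearest training vectors."""
    scores = []
    for query in test_x:
        neighbours = sorted((math.dist(query, x), y) for x, y in zip(train_x, train_y))
        scores.append(sum(y for _, y in neighbours[:k]) / k)
    return scores


def auc(labels, scores):
    """AUC as the probability that a positive case outranks a negative one (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, seed=0):
    """Percentile 95% CI for the AUC, resampling test cases with replacement."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    return stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot) - 1]


# Synthetic stand-in for extracted embeddings: two Gaussian clusters in 8 dimensions.
rng = random.Random(42)


def make_cluster(n, center):
    return [[rng.gauss(center, 1.0) for _ in range(8)] for _ in range(n)]


train_x = make_cluster(50, 0.0) + make_cluster(50, 1.0)
train_y = [0] * 50 + [1] * 50
test_x = make_cluster(30, 0.0) + make_cluster(30, 1.0)
test_y = [0] * 30 + [1] * 30

scores = knn_scores(train_x, train_y, test_x, k=5)
point = auc(test_y, scores)
low, high = bootstrap_ci(test_y, scores)
print(f"AUC = {point:.3f} (95% CI {low:.3f} to {high:.3f})")
```

In practice the feature vectors would come from the extracted features file rather than from synthetic clusters, and a library implementation of k-NN and AUC would be faster on large datasets.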
Contributing
- Follow the existing code style and update docs with changes.
- Add targeted tests for new functionality.
Citation
    @article{TumorImagingBench,
      title={Foundation model embeddings for quantitative tumor imaging biomarkers},
      author={},
      journal={},
      year={},
      volume={},
      pages={},
      publisher={}
    }

License
MIT. See LICENSE.