EarlyPDAC-MML: Early Detection of Pancreatic Cancer Using Multimodal Learning on Electronic Health Record

This repository implements early detection of pancreatic cancer using multimodal learning on electronic health record (EHR) data. The model combines neural controlled differential equations (NCDEs) for lab panels, BioGPT-encoded diagnosis trajectories, and cross-attention to predict risk up to 12 months before clinical diagnosis.
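To make the cross-attention step in the description more concrete, here is a minimal NumPy sketch of single-head scaled dot-product cross-attention between two modality embeddings. It is not the repository's implementation: the function names, the choice of lab embeddings as queries and diagnosis embeddings as keys/values, and all dimensions are assumptions made for illustration only.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries, keys_values, w_q, w_k, w_v):
    """Single-head scaled dot-product cross-attention (illustrative only).

    queries:     [batch, len_q, dim_q]   e.g. lab-panel embeddings (assumption)
    keys_values: [batch, len_kv, dim_kv] e.g. diagnosis embeddings (assumption)
    w_q, w_k, w_v: projection matrices mapping into a shared dimension d
    Returns the attended output [batch, len_q, d] and the attention
    weights [batch, len_q, len_kv].
    """
    q = queries @ w_q
    k = keys_values @ w_k
    v = keys_values @ w_v
    # Similarity of every query position to every key position.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights


# Hypothetical sizes, chosen only for the example.
rng = np.random.default_rng(0)
lab_emb = rng.normal(size=(2, 6, 16))    # [batch, lab timesteps, dim]
diag_emb = rng.normal(size=(2, 10, 32))  # [batch, diagnosis tokens, dim]
w_q = rng.normal(size=(16, 8))
w_k = rng.normal(size=(32, 8))
w_v = rng.normal(size=(32, 8))
fused, attn = cross_attention(lab_emb, diag_emb, w_q, w_k, w_v)
```

Each row of `attn` sums to 1, so every lab timestep receives a weighted mixture of the diagnosis tokens. The actual model may use multiple heads, a different query/key assignment, or a framework-provided attention layer.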

Installation

Clone the repository:

git clone https://github.com/MosbahAouad/EarlyPDAC-MML.git
cd EarlyPDAC-MML

Install dependencies (recommended: use conda):

conda env create -f environment.yml
conda activate pancreatic
# or
pip install -r requirements.txt

Data Format

Note: No data is included with this repository; you must provide your own. Expected data format:

  • Lab panels: numpy arrays or tensors, shape [num_samples, num_timesteps, num_features]
  • Diagnosis codes: padded integer sequences, shape [num_samples, seq_length]
  • Labels: binary or multiclass, shape [num_samples]

See utils/data_utils.py and comments in scripts for details.
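As a quick way to check inputs against the shapes listed above, the following sketch builds synthetic arrays and validates them. The helper names (`make_dummy_batch`, `check_shapes`), the padding value 0, right-padding, and the vocabulary size are assumptions for illustration; they are not taken from `utils/data_utils.py`, which remains the authoritative reference.

```python
import numpy as np


def make_dummy_batch(num_samples=8, num_timesteps=12, num_features=5,
                     seq_length=20, vocab_size=100, seed=0):
    """Create synthetic data with the documented shapes (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Lab panels: [num_samples, num_timesteps, num_features]
    labs = rng.normal(size=(num_samples, num_timesteps, num_features)).astype(np.float32)
    # Diagnosis codes: [num_samples, seq_length], right-padded with 0 (assumption)
    codes = np.zeros((num_samples, seq_length), dtype=np.int64)
    for i in range(num_samples):
        length = int(rng.integers(1, seq_length + 1))
        codes[i, :length] = rng.integers(1, vocab_size, size=length)
    # Labels: [num_samples], binary in this example
    labels = rng.integers(0, 2, size=num_samples).astype(np.int64)
    return labs, codes, labels


def check_shapes(labs, codes, labels):
    """Raise ValueError if the arrays do not match the documented layout."""
    if labs.ndim != 3:
        raise ValueError("lab panels must be [num_samples, num_timesteps, num_features]")
    if codes.ndim != 2:
        raise ValueError("diagnosis codes must be [num_samples, seq_length]")
    if labels.ndim != 1:
        raise ValueError("labels must be [num_samples]")
    if not (labs.shape[0] == codes.shape[0] == labels.shape[0]):
        raise ValueError("all inputs must share the same num_samples")


labs, codes, labels = make_dummy_batch()
check_shapes(labs, codes, labels)
print(labs.shape, codes.shape, labels.shape)
```

Running a check like this on real data before training can catch a transposed axis or mismatched sample count early.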

Usage

Train and evaluate the model:

python scripts/main_cross_validation.py --weights_dir ./weights --results_dir ./results --results_file results.csv --model_type combined

See args/arg_parser.py for all command-line options.

Citation

If you use this code, please cite:

@inproceedings{aouad2025early,
  title={Early Detection of Pancreatic Cancer Using Multimodal Learning on Electronic Health Record},
  author={Aouad, Mosbah and Choudhary, Anirudh and Farooq, Awais and Nevers, Steven and Demirkhanyan, Lusine and Harris, Bhrandon and Pappu, Suguna and Gondi, Christopher and Iyer, Ravishankar},
  booktitle={Proceedings of Machine Learning for Healthcare},
  volume={298},
  pages={1--22},
  year={2025}
}

License

This project is licensed under the MIT License.
