Implementation of Window-Based Comparison (WBC), a membership inference attack against fine-tuned large language models that scores samples via localized, window-based analysis.
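In rough terms, the attack contrasts target-model and reference-model per-token losses over short local windows rather than over whole sequences. The numpy sketch below is our illustrative reading of that idea, not the repository's exact statistic; the window lengths shown are a subset of the configured defaults.

```python
import numpy as np

def wbc_score(target_losses, ref_losses, window_lengths=(2, 3, 4, 6)):
    """Illustrative window-based comparison: for each window length,
    slide a window over the per-token loss difference and keep the most
    member-like (lowest) windowed mean, then average across lengths.
    Lower score = more likely a training member."""
    diff = np.asarray(target_losses, float) - np.asarray(ref_losses, float)
    best = []
    for w in window_lengths:
        if len(diff) < w:
            continue
        # Mean loss difference inside every length-w window
        means = np.convolve(diff, np.ones(w) / w, mode="valid")
        best.append(means.min())
    return float(np.mean(best))

# A memorized (member) sequence has locally low target loss
member_score = wbc_score([0.1] * 10, [1.0] * 10)
nonmember_score = wbc_score([1.0] * 10, [1.0] * 10)  # no gap vs. reference
```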
```bash
# Clone repository
git clone https://github.com/Stry233/WBC
cd wbc-attack

# Create environment
python -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt
```

Create balanced member/non-member splits from a HuggingFace dataset:
```bash
python dataset/prep.py \
    --dataset_name "HuggingFaceTB/cosmopedia" \
    --config "khanacademy" \
    --num_samples 20000 \
    --min_length 512 \
    --output_dir "cosmopedia-khanacademy-subset"
```

This creates `train.json` (members) and `test.json` (non-members) in the output directory.
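The split logic amounts to filtering by length, shuffling, and dividing in half. Below is a self-contained sketch of that idea as a stand-in for `dataset/prep.py`; the helper name and the character-based length filter (the real script filters by tokens) are our assumptions:

```python
import json
import random

def make_balanced_splits(texts, num_samples, min_length, seed=42):
    """Illustrative member/non-member split: drop short texts, shuffle,
    then divide the first `num_samples` texts half and half."""
    rng = random.Random(seed)
    # Stand-in length filter: character count instead of token count
    eligible = [t for t in texts if len(t) >= min_length]
    rng.shuffle(eligible)
    picked = eligible[:num_samples]
    half = len(picked) // 2
    return picked[:half], picked[half:]  # (members, non-members)

# Toy corpus standing in for the HuggingFace dataset
corpus = [f"document {i} " * 40 for i in range(100)]
members, non_members = make_balanced_splits(corpus, num_samples=50, min_length=100)
with open("train.json", "w") as f:
    json.dump(members, f)      # members: used to fine-tune the target
with open("test.json", "w") as f:
    json.dump(non_members, f)  # non-members: held out
```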
Fine-tune a model on the member data:
```bash
python trainer/get_target.py \
    --config_path configs/config_all.yaml \
    --base_path ./weights \
    --train_subset_size 10000 \
    --ref_subset_size 10000
```

To generate the YAML file for your setup, modify and run `trainer/configs/prep.py` following the instructions in that file.
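Fine-tuning minimizes, and the attacks later read off, the model's per-token cross-entropy loss. For reference, here is a minimal numpy version of that quantity, independent of the loss utilities in `attacks/misc/utils.py`:

```python
import numpy as np

def per_token_losses(logits, token_ids):
    """Negative log-likelihood of each observed token under the model's
    logits (shape [seq_len, vocab_size]); illustrative numpy version."""
    logits = np.asarray(logits, dtype=float)
    # Numerically stable log-softmax over the vocabulary axis
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    return -log_probs[np.arange(len(token_ids)), token_ids]

# A uniform model over a 4-token vocabulary costs ln(4) nats per token
losses = per_token_losses(np.zeros((3, 4)), [0, 2, 1])
```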
Execute WBC and baseline attacks:
```bash
python run.py \
    --config configs/config_all.yaml \
    --output results/ \
    --base-dir ./weights \
    --seed 42
```

An example `configs/config_all.yaml`:

```yaml
global:
  target_model: "./path/to/target"
  reference_model_path: "EleutherAI/pythia-2.8b"
  datasets:
    - json_train_path: "data/train.json"
      json_test_path: "data/test.json"
  batch_size: 1
  max_length: 512
  fpr_thresholds: [0.1, 0.01, 0.001]
  n_bootstrap_samples: 100

# WBC attack settings
Wbc:
  module: "wbc"
  reference_model_path: "EleutherAI/pythia-2.8b"
  context_window_lengths: [2, 3, 4, 6, 9, 13, 18, 25, 32, 40]
```

Enable or disable attacks by commenting them out in `configs/config_all.yaml`:
```yaml
# Reference-free attacks
loss:
  module: loss
zlib:
  module: zlib

# Reference-based attacks
ratio:
  module: ratio
  reference_model_path: "EleutherAI/pythia-2.8b"

# Our method
Wbc:
  module: "wbc"
  # ... configuration
```

To prepare a custom dataset, point `dataset/prep.py` at it:

```bash
python dataset/prep.py \
    --dataset_name "your_dataset" \
    --text_column "text" \
    --split "train" \
    --num_samples 20000 \
    --min_length 512 \
    --tokenizer_name "EleutherAI/pythia-2.8b"
```

To add a new attack, create a file in `attacks/`:
```python
from attacks import AbstractAttack

class YourAttack(AbstractAttack):
    def __init__(self, name, model, tokenizer, config, device):
        super().__init__(name, model, tokenizer, config, device)

    def _process_batch(self, batch):
        # Implement your attack logic
        scores = compute_membership_scores(batch)
        return {self.name: scores}
```

Then add it to `configs/config_all.yaml`:
```yaml
your_attack:
  module: your_attack
  # your parameters
```

The attack produces:
- **Metadata file** `metadata_[timestamp]_[config].pkl`, containing:
  - Attack scores for all methods
  - Ground truth labels
  - AUC and TPR metrics
  - Configuration details
- **Console output**: a results table with AUC and TPR@FPR metrics
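The TPR@FPR numbers can be recomputed from the saved scores and ground-truth labels. A small numpy sketch of the statistic (assuming higher score means "more likely a member"; the repository's exact thresholding and tie-breaking may differ):

```python
import numpy as np

def tpr_at_fpr(scores, labels, fpr_threshold):
    """TPR at a fixed FPR: pick the score threshold so that at most
    `fpr_threshold` of non-members are falsely flagged, then measure the
    fraction of members caught. Higher score = more member-like."""
    scores = np.asarray(scores, float)
    labels = np.asarray(labels, bool)
    non_member_scores = np.sort(scores[~labels])[::-1]
    # Highest cutoff admitting at most fpr_threshold false positives
    k = int(fpr_threshold * len(non_member_scores))
    cutoff = non_member_scores[k] if k < len(non_member_scores) else -np.inf
    return float(np.mean(scores[labels] > cutoff))

# Toy example: 5 members followed by 5 non-members
labels = [True] * 5 + [False] * 5
scores = [0.9, 0.8, 0.7, 0.6, 0.2, 0.5, 0.4, 0.3, 0.1, 0.0]
tpr = tpr_at_fpr(scores, labels, fpr_threshold=0.2)
```

The `n_bootstrap_samples` setting suggests confidence intervals are obtained by resampling (score, label) pairs and recomputing statistics like this one.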
Project layout:

```
├── attacks/              # Attack implementations
│   ├── wbc.py            # WBC attack implementation
│   └── misc/
│       └── utils.py      # Loss computation utilities
├── trainer/              # Model fine-tuning
│   ├── get_target.py     # Main training script
│   └── configs/          # Training configurations
├── configs/              # Attack configurations
├── dataset/              # Dataset preparation
├── scripts/              # Automation scripts
├── run.py                # Main attack runner
└── utils.py              # Shared utilities
```

If you find this work useful, please cite:

```bibtex
@misc{chen2026windowbasedmembershipinferenceattacks,
      title={Window-based Membership Inference Attacks Against Fine-tuned Large Language Models},
      author={Yuetian Chen and Yuntao Du and Kaiyuan Zhang and Ashish Kundu and Charles Fleming and Bruno Ribeiro and Ninghui Li},
      year={2026},
      eprint={2601.02751},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2601.02751},
}
```