NumCLIP

This repository contains the PyTorch implementation of "Teach CLIP to Develop a Number Sense for Ordinal Regression" (ECCV 2024).

Created by Du Yao, Zhai Qiang, Dai Weihang, Li Xiaomeng*

Overview of NumCLIP

The framework of NumCLIP, aiming to teach CLIP to develop a strong number sense for ordinal regression.

Quick Preview

1. Img2Lang Concept

NumCLIP mimics human numerical cognition: it first maps an image feature to a language concept, and then reasons about the number.


This paradigm can be conducted in a coarse-to-fine manner. It converts a dense regression task into a simple, coarse classification problem, which both mitigates the shortage of number-related captions and reuses the concept alignment that CLIP already learned during pre-training.
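As a rough illustration of the coarse-to-fine idea, the sketch below first picks a language concept for an image and then refines a number within that concept's range. It is not the repository's implementation: the bin phrases, the value ranges, the `coarse_to_fine` helper, and the expected-value refinement are all assumptions made for this example, and the scores stand in for image-text similarities that CLIP would produce.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical coarse bins: (language concept, numeric values it covers).
BINS = [
    ("a child", [0, 1, 2, 3]),
    ("an adult", [4, 5, 6, 7]),
    ("an elderly person", [8, 9, 10, 11]),
]

def coarse_to_fine(coarse_scores, fine_scores):
    """Pick the best-matching concept, then estimate a number inside it.

    coarse_scores: one similarity score per bin (assumed input).
    fine_scores:   per bin, one score per numeric value in that bin (assumed input).
    """
    # Coarse stage: a simple classification over language concepts.
    bin_idx = max(range(len(BINS)), key=lambda i: coarse_scores[i])
    concept, values = BINS[bin_idx]
    # Fine stage (an assumption): expected value under a softmax over the bin.
    probs = softmax(fine_scores[bin_idx])
    estimate = sum(p * v for p, v in zip(probs, values))
    return concept, estimate

concept, estimate = coarse_to_fine(
    coarse_scores=[0.1, 0.9, 0.2],
    fine_scores=[[0.0] * 4, [0.0] * 4, [0.0] * 4],
)
print(concept, estimate)
```

With uniform fine scores the estimate falls at the midpoint of the selected bin; sharper fine scores would pull it toward a specific value.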

2. Cross-modal Ranking-based Feature Regularization

The cross-modal negative samples are pushed away with ordinal label distance alignment.

    def compute_ce_dis_loss(self, logits, y, d):
        # Class indices 0..d-1 as a column vector of shape (d, 1).
        list_target = list(range(d))
        target = torch.Tensor(list_target).to('cuda:0')
        target = torch.unsqueeze(target, 1)
        ls_weight = []
        for i in range(len(y)):
            # Absolute distance from label y[i] to every class, shape (1, d).
            label_inv_ranks = (torch.abs(y[i] - target).transpose(0, 1))
            # Normalise the distances so that they sum to d - 1.
            label_inv_ranks_norm = (torch.abs(y[i] - target).transpose(0, 1)) / torch.sum(label_inv_ranks, dim=1) * (d - 1)
            label_inv_ranks_norm = torch.squeeze(label_inv_ranks_norm, 0)
            # Leave the ground-truth logit unscaled.
            label_inv_ranks_norm[y[i]] = 1.0
            ls_label_inv_ranks_norm = label_inv_ranks_norm.detach().cpu().numpy().tolist()
            ls_weight.append(ls_label_inv_ranks_norm)
        weight = torch.Tensor(ls_weight).to('cuda:0')
        # Scale each logit by its distance-based weight before cross-entropy.
        logits_weight = logits * weight
        loss = self.ce_loss_func(logits_weight, y)
        return loss
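To make the weighting concrete, here is a small pure-Python sketch of the per-sample weight vector that the loop above builds. The helper name `ordinal_weights` is hypothetical, and the sketch covers only the weights (not the cross-entropy step); it assumes `d >= 2` so that the distance sum is non-zero.

```python
def ordinal_weights(y, d):
    """Distance-based logit weights for a single label y among d classes.

    Mirrors the arithmetic in compute_ce_dis_loss: distances |y - k| are
    normalised to sum to d - 1, and the ground-truth entry is set to 1.
    """
    dist = [abs(y - k) for k in range(d)]
    total = sum(dist)  # non-zero whenever d >= 2
    weights = [x / total * (d - 1) for x in dist]
    weights[y] = 1.0   # the true class keeps its original logit
    return weights

print(ordinal_weights(1, 4))  # [0.75, 1.0, 0.75, 1.5]
```

For label 1 out of 4 classes, the two neighbouring classes get weight 0.75 and the class two steps away gets 1.5, so a positive logit for a more distant class is amplified more strongly before the cross-entropy is computed.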

Requirements

We utilize the code base of OrdinalCLIP. Please follow their instructions to prepare the environment and datasets.

Model Training

Before training the model, move regclipssr.py to ./ordinalclip/models/, and runner_ssr.py to ./ordinalclip/runner/ accordingly.

Add the line from . import regclip_ssr to ./ordinalclip/models/__init__.py.

Also change the Runner import in run.py to from ordinalclip.runner.runner_ssr import Runner, then start training:

sh scripts/run_regclipssr.sh

What's More

Check out these amazing works leveraging CLIP for number problems!

Citation

If you find this codebase helpful, please consider citing:

@inproceedings{du2024teach,
  title={Teach clip to develop a number sense for ordinal regression},
  author={Du, Yao and Zhai, Qiang and Dai, Weihang and Li, Xiaomeng},
  booktitle={European Conference on Computer Vision},
  pages={1--17},
  year={2024},
  organization={Springer}
}
