
Prior Knowledge Integration via LLM Encoding and Pseudo-Event Regulation for Video Moment Retrieval


Yiyang Jiang, Wengyu Zhang, Xulu Zhang, Xiao-Yong Wei, Chang Wen Chen, and Qing Li.


Official PyTorch implementation of 'Prior Knowledge Integration via LLM Encoding and Pseudo-Event Regulation for Video Moment Retrieval'

Installation | Dataset | Training | Evaluation | Model Zoo

📒 News

[2024.7.21] Our paper has been accepted by ACM Multimedia 2024 (Oral).

[2024.7.10] The code and datasets for the related tasks have been released.

[2024.5.10] The repository is now public.

[2024.4.10] The repository was created.

βš™οΈ Installation

  1. Clone the repository from GitHub.

git clone https://github.com/fletcherjiang/LLMEPET.git
cd LLMEPET

  2. Create and activate the conda environment.

conda create -n LLMEPET python=3.8
conda activate LLMEPET

  3. Install the required packages.

pip install -r requirements.txt
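
To verify the environment, a quick check like the following confirms that PyTorch (installed via requirements.txt) imports and can see the GPU; the exact version depends on what requirements.txt pins:

```bash
# Optional sanity check: confirm PyTorch imports and CUDA is visible.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```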

πŸ—‚οΈ Dataset

For all datasets, we provide the extracted features; download them and place them in features/.

The prepared dataset should have the following structure.

.
├── LLMEPET
│   ├── llm_epet
│   ├── data
│   ├── results
│   ├── run_on_video
│   ├── standalone_eval
│   └── utils
├── data
├── features
│   ├── qvhighlight
│   ├── charades
│   ├── tacos
│   ├── tvsum
│   └── youtube_uni
├── llama
│   ├── consolidated.00.pth
│   ├── tokenizer.model
│   └── params.json
├── README.md
└── ···
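
As a rough sketch of the placement step (the archive names below are placeholders; substitute the actual files from the download links above):

```bash
# Placeholder archive names; use the files you actually downloaded.
mkdir -p features
tar -xzf qvhighlight_features.tar.gz -C features/
tar -xzf charades_features.tar.gz -C features/
# ...repeat for tacos, tvsum, and youtube_uni
```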

πŸͺ LLaMA Checkpoint

If you want to try LLaMA-2 or LLaMA-3, download the corresponding checkpoints from the official LLaMA-2 or LLaMA-3 releases, then edit llm_epet/llama.py yourself to load them.
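
Before wiring a new checkpoint in, a minimal sanity check like the one below (paths follow the directory tree above) can confirm that the weights deserialize:

```bash
# Minimal sketch: verify the LLaMA checkpoint loads on CPU before
# editing llm_epet/llama.py to point at it.
python -c "import torch; ckpt = torch.load('llama/consolidated.00.pth', map_location='cpu'); print(len(ckpt), 'tensors loaded')"
```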

🚀 Training

QVHighlights

bash llm_epet/scripts/train.sh 

Charades-STA

bash llm_epet/scripts/charades_sta/train.sh 

TACoS

bash llm_epet/scripts/tacos/train.sh 

TVSum

bash llm_epet/scripts/tvsum/train_tvsum.sh 

YouTube-HL

bash llm_epet/scripts/youtube_uni/train.sh 

⭐ QVHighlights Evaluation and Submission

bash llm_epet/scripts/inference.sh results/{direc}/model_best.ckpt 'val'
bash llm_epet/scripts/inference.sh results/{direc}/model_best.ckpt 'test'

Pack the hl_{val,test}_submission.jsonl files and submit them to CodaLab.
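
For reference, one way to pack the two files (the zip name is arbitrary; adjust the path to wherever your run wrote the submission files):

```bash
# Zip the validation and test predictions for upload to CodaLab.
zip submission.zip hl_val_submission.jsonl hl_test_submission.jsonl
```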

📦 Model Zoo

| Dataset | Model file |
| --- | --- |
| QVHighlights (SlowFast + CLIP) | checkpoints |
| Charades (SlowFast + CLIP) | checkpoints |
| TACoS | checkpoints |
| TVSum | checkpoints |
| YouTube-HL | checkpoints |
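
A downloaded checkpoint can be evaluated with the same inference script shown above (the checkpoint path below is a placeholder for wherever you saved it):

```bash
bash llm_epet/scripts/inference.sh path/to/model_best.ckpt 'val'
```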

📖 Citation

If you find the repository or the paper useful, please use the following entry for citation.

@inproceedings{jiang2024prior,
  title     = {Prior Knowledge Integration via {LLM} Encoding and Pseudo Event Regulation for Video Moment Retrieval},
  author    = {Yiyang Jiang and Wengyu Zhang and Xulu Zhang and Xiaoyong Wei and Chang Wen Chen and Qing Li},
  booktitle = {ACM Multimedia 2024},
  year      = {2024},
  url       = {https://arxiv.org/abs/2407.15051}
}
