This repository is the PyTorch implementation of the paper *A Deep Multi-modal Explanation Model for Zero-shot Learning*, published in IEEE Transactions on Image Processing (TIP), 2020. It provides the data, source code, and Grad-CAM visualizations.
- PyTorch
- Python
Download the data from here.
Note that we extract new visual features with ResNet-101 instead of reusing the features from previous works. For each image, we extract a single visual feature without any data augmentation such as cropping or flipping, because augmentation would break the spatial alignment of the visual explanations generated later.
- Run `DME.py` to train the visual-semantic embedding module.
- Run `DME_joint.py` to train the textual explanation module.
- Run `.\Grad-CAM\gradcam_resnet101.py` to generate the visual explanation.
This repo is based on the codebase of f-CLSWGAN.
More instructions will be provided later.
