ruthmd/knowledge_distillation

Experiments related to knowledge distillation

Dataset: CIFAR-10

Accuracy by varying Temperature

| Temperature (T) | Teacher accuracy (%) | Student accuracy without teacher (%) | Student accuracy with XE + KD (%) | Student accuracy with XE + CosineLoss (%) | Student accuracy with XE + RegressorMSE (%) |
|---|---|---|---|---|---|
| 0.5 | 75.49 | 70.31 | 70.33 | 70.78 | 70.94 |
| 2.0 | 75.49 | 70.31 | 70.71 | 70.48 | 71.20 |
| 5.0 | 75.49 | 70.31 | 72.18 | 70.55 | 71.08 |
| 10.0 | 75.49 | 70.31 | 72.07 | 70.25 | 71.08 |
| 15.0 | 75.49 | 70.31 | 71.35 | 70.50 | 70.90 |
| 25.0 | 75.49 | 70.31 | 71.36 | 70.59 | 70.52 |
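
The README does not spell out how the "XE + KD" loss is computed, so the following is only a minimal sketch of the commonly used formulation from Hinton et al., not necessarily what this repository implements. It shows where the temperature T enters: both teacher and student logits are divided by T before the softmax, and the KL term is scaled by T². The function names and the mixing weight `alpha` are assumptions made for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a flatter distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, label, temperature=5.0, alpha=0.5):
    """XE + KD loss for a single example (hypothetical helper, not from the repo).

    alpha weights the distillation term and (1 - alpha) the cross-entropy
    term; this weighting scheme is an assumption.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) between the temperature-softened distributions
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Standard cross-entropy against the hard label, computed at T = 1
    xe = -math.log(softmax(student_logits)[label])
    # The T^2 factor keeps gradient magnitudes comparable across temperatures
    return alpha * temperature ** 2 * kl + (1 - alpha) * xe
```

For example, `kd_loss([0.5, 0.2, 0.1], [3.0, 1.0, 0.2], label=0, temperature=5.0)` evaluates the loss for one three-class example. In a sweep like the one in the table, only `temperature` would change between runs.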

Accuracy by varying Weight initialization methods

| Initialization method | KD | Cosine | MSE |
|---|---|---|---|
| Default | 71.82 | 69.84 | 69.74 |
| Xavier | 70.85 | 70.50 | 69.50 |
| Kaiming | 66.23 | 67.01 | 65.12 |
| Orthogonal | 70.54 | 70.21 | 69.78 |
| Teacher | 70.08 | 69.83 | 69.40 |
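
For readers unfamiliar with the schemes named in the table, the sketch below computes the weight scales used by two of them in their standard textbook form: the Xavier (Glorot) uniform bound and the Kaiming (He) normal standard deviation. The README does not say which variants (uniform or normal, fan-in or fan-out) were used, so the function names and defaults here are assumptions. Orthogonal initialization and the "Default" and "Teacher" rows are not covered.

```python
import math

def xavier_uniform_bound(fan_in, fan_out, gain=1.0):
    """Xavier/Glorot uniform: weights are drawn from U(-bound, bound)."""
    return gain * math.sqrt(6.0 / (fan_in + fan_out))

def kaiming_normal_std(fan_in, gain=math.sqrt(2.0)):
    """Kaiming/He normal (fan-in mode): weights are drawn from N(0, std^2).

    The default gain of sqrt(2) is the usual choice for ReLU activations.
    """
    return gain / math.sqrt(fan_in)

# Hypothetical 3x3 convolution with 64 input and 128 output channels:
# fan_in = 64 * 3 * 3, fan_out = 128 * 3 * 3
bound = xavier_uniform_bound(fan_in=64 * 9, fan_out=128 * 9)
std = kaiming_normal_std(fan_in=64 * 9)
```

The two schemes differ in which fan they normalize by and in the gain, so the same layer ends up with different initial weight scales.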
