
When I want to assign part of a pre-trained model's parameters to another module defined in a new PyTorch model, I get two different outputs using two different methods.

The network is defined as follows:

    import torch
    import torch.nn as nn

    class Net(nn.Module):
        def __init__(self):
            super(Net, self).__init__()
            # Load a pre-trained ResNet18 and drop its final FC layer
            self.resnet = torch.hub.load('pytorch/vision', 'resnet18', pretrained=True)
            self.resnet = nn.Sequential(*list(self.resnet.children())[:-1])
            self.freeze_model(self.resnet)
            self.classifier = nn.Sequential(
                nn.Dropout(),
                nn.Linear(512, 256),
                nn.ReLU(),
                nn.Linear(256, 3),
            )

        def freeze_model(self, model):
            # Disable gradient updates for the backbone parameters
            for param in model.parameters():
                param.requires_grad = False

        def forward(self, x):
            out = self.resnet(x)
            out = out.flatten(start_dim=1)
            out = self.classifier(out)
            return out

What I want is to assign the pre-trained parameters to the classifier in the net module. I used two different ways for this task:

    # First way
    net.load_state_dict(torch.load('model_CNN_pretrained.ptl'))

    # Second way
    params = torch.load('model_CNN_pretrained.ptl')
    net.classifier[1].weight = nn.Parameter(params['classifier.1.weight'], requires_grad=False)
    net.classifier[1].bias = nn.Parameter(params['classifier.1.bias'], requires_grad=False)
    net.classifier[3].weight = nn.Parameter(params['classifier.3.weight'], requires_grad=False)
    net.classifier[3].bias = nn.Parameter(params['classifier.3.bias'], requires_grad=False)
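For reference, the checkpoint holds an entry for every registered parameter and buffer, not just the classifier weights. A minimal way to inspect it (assuming the same checkpoint file as above):

    params = torch.load('model_CNN_pretrained.ptl')
    for name, tensor in params.items():
        print(name, tuple(tensor.shape))
    # Besides 'classifier.1.weight' etc., this also lists backbone entries,
    # including BatchNorm buffers such as 'resnet.1.running_mean'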

The parameters were assigned correctly, but I got two different outputs from the same input data: the first method works correctly, while the second doesn't. Could someone point out the difference between these two methods?

Comments:
  • net.classifier[4].bias - is it intentionally [4] instead of [3]? Or just a typo in your question? (Commented Dec 3, 2020 at 15:11)
  • Sorry, it's just a typo. (Commented Dec 4, 2020 at 1:07)

1 Answer

Finally, I found out where the problem is.

During the training that produced the checkpoint, the buffer parameters (running mean and variance) of the BatchNorm2d layers in the ResNet18 model were changed, even though we set requires_grad of the parameters to False. requires_grad only stops gradient updates; the buffers are recomputed from the input data whenever the model runs in model.train() mode, and they stay unchanged only after model.eval(). This explains the difference: the first method restores the updated buffers together with the weights via load_state_dict, while the second method copies only the classifier parameters and leaves the backbone's original buffers untouched, so the two models produce different outputs.
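A minimal sketch showing this behavior in isolation (nothing here depends on the model above, only on torch):

    import torch
    import torch.nn as nn

    bn = nn.BatchNorm2d(3)
    for p in bn.parameters():
        p.requires_grad = False       # freezes only weight/bias, not the buffers

    bn.train()
    _ = bn(torch.randn(8, 3, 4, 4))
    print(bn.running_mean)            # no longer all zeros: updated during train()

    bn.eval()
    before = bn.running_mean.clone()
    _ = bn(torch.randn(8, 3, 4, 4))
    print(torch.equal(before, bn.running_mean))  # True: buffers fixed in eval()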

Here is a link about how to freeze the BN layers:

How to freeze BN layers while training the rest of network (mean and var wont freeze)
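One common workaround from that thread is to switch the BN layers back to eval() mode after calling net.train(), so their running statistics stay fixed while the rest of the network trains. A sketch, assuming the Net definition above (set_bn_eval is a hypothetical helper, not part of PyTorch):

    def set_bn_eval(module):
        # Keep BatchNorm layers in eval mode so running_mean/var are not updated
        if isinstance(module, nn.BatchNorm2d):
            module.eval()

    net = Net()
    net.train()                    # training mode for the whole model...
    net.resnet.apply(set_bn_eval)  # ...then re-freeze the BN statistics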
