What is the difference between a PyTorch Parameter and a Tensor?
The existing answer covers the old PyTorch API, where Variables were still used.
The whole idea of the Parameter class is simple: since it is subclassed from Tensor, a Parameter is a Tensor.
But there is a trick. Parameters that live inside a module are added to that module's list of parameters. If m is your module, m.parameters() will hold your parameter.
Here is an example:

```python
import torch
from torch import nn

class M(nn.Module):
    def __init__(self):
        super().__init__()
        self.weights = nn.Parameter(torch.randn(2, 2))
        self.bias = nn.Parameter(torch.zeros(2))

    def forward(self, x):
        return x @ self.weights + self.bias

m = M()
print(list(m.parameters()))
```

Output:

```
[Parameter containing:
tensor([[ 0.5527,  0.7096],
        [-0.2345, -1.2346]], requires_grad=True), Parameter containing:
tensor([0., 0.], requires_grad=True)]
```

You can see that parameters() returns exactly what we defined. If we just added a plain tensor attribute to the class, like self.t = torch.randn(2), it would not show up in the parameters list, as sketched below. That is literally it, nothing fancy.
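If you want to check that behavior yourself, here is a minimal sketch (the class name `WithPlainTensor` and the `self.t` attribute are illustrative, not part of the original snippet):

```python
import torch
from torch import nn

class WithPlainTensor(nn.Module):
    def __init__(self):
        super().__init__()
        self.weights = nn.Parameter(torch.randn(2, 2))  # registered by the module
        self.t = torch.randn(2)                         # plain tensor attribute, NOT registered

    def forward(self, x):
        return x @ self.weights + self.t

m = WithPlainTensor()
print(len(list(m.parameters())))            # 1 -> only `weights` shows up
print(isinstance(m.weights, torch.Tensor))  # True -> Parameter is a Tensor subclass
```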
Adding to @prosti's answer, an nn.Module class doesn't always explicitly know which Tensor objects it should optimize. Going through this simple commented piece of code should clarify it further.
```python
import torch
from torch import nn

# Simple objective: learn a function that maps [1, 1] -> [0, 0]
x = torch.ones(2)   # input tensor
y = torch.zeros(2)  # expected output

# Model 1
class M1(nn.Module):
    def __init__(self):
        super().__init__()
        self.weights = nn.Parameter(torch.randn(2, 2))
        self.bias = nn.Parameter(torch.zeros(2))

    def forward(self, x):
        return x @ self.weights + self.bias

# Model 2
class M2(nn.Module):
    def __init__(self):
        super().__init__()
        # Though the Tensor objects below can undergo backprop and minimize some loss,
        # the model class doesn't know it should use these tensors during optimization.
        self.weights = torch.randn(2, 2).requires_grad_(True)
        self.bias = torch.zeros(2).requires_grad_(True)

    def forward(self, x):
        return x @ self.weights + self.bias

m1 = M1()
m2 = M2()

# A bunch of parameters get printed
print('Model 1 params : ')
print(list(m1.parameters()))

# This is empty, meaning there is no parameter for the model to optimize.
# In the forward pass, the model just knows to use these `weights` and `bias`
# tensors to do some operations over the input, but it doesn't know it should
# optimize over those `weights` and `bias` tensor objects.
print('Model 2 params : ')
print(list(m2.parameters()))

# Initialize the loss function
loss_fn = nn.MSELoss(reduction='mean')

## ===== Training ===== ##

# Trainer
def train_loop(model, loss_fn=loss_fn):
    # Simple optimizer
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    for i in range(5):
        # Compute prediction and loss
        pred = model(x)
        loss = loss_fn(pred, y)

        # Backpropagation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        print(f"loss > {loss.item()}")

# ====== Train Model 1 ====== #
# The loss keeps decreasing, as model 1 finds better weights.
train_loop(m1)

# ====== Try to train Model 2 ====== #
# This breaks at the line: optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
# because there are no parameters to optimize.
train_loop(m2)
```

For further clarification, check out this short blog implementing PyTorch's nn.Linear module.
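As a side note (this part is not from the original answer, just a sketch of the standard workaround): a plain tensor with requires_grad=True can still be trained if you hand it to the optimizer explicitly; wrapping it in nn.Parameter, as M1 does, simply lets the module register it for you so model.parameters() finds it.

```python
import torch
from torch import nn

x = torch.ones(2)
y = torch.zeros(2)
loss_fn = nn.MSELoss(reduction='mean')

# Plain tensors, just like in M2: they can carry gradients...
weights = torch.randn(2, 2, requires_grad=True)
bias = torch.zeros(2, requires_grad=True)

# ...but the optimizer only knows about them if we pass them explicitly.
optimizer = torch.optim.SGD([weights, bias], lr=0.1)
for _ in range(5):
    loss = loss_fn(x @ weights + bias, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(f"loss > {loss.item()}")

# The alternative is to wrap the tensors in nn.Parameter inside the module (as M1 does),
# so model.parameters() picks them up automatically.
```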