From the documentation:
Parameters are Tensor subclasses, that have a very special property when used with Modules - when they’re assigned as Module attributes they are automatically added to the list of its parameters, and will appear e.g. in parameters() iterator. Assigning a Tensor doesn’t have such effect. This is because one might want to cache some temporary state, like last hidden state of the RNN, in the model. If there was no such class as Parameter, these temporaries would get registered too.
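To make this concrete, here is a minimal sketch (the class and attribute names are only illustrative) of how assigning an nn.Parameter registers it with the module, while assigning a plain Tensor does not:

```python
import torch
import torch.nn as nn

class MyModule(nn.Module):
    def __init__(self):
        super().__init__()
        # Assigned as a Parameter: automatically registered,
        # shows up in parameters() and state_dict()
        self.weight = nn.Parameter(torch.randn(3, 3))
        # Assigned as a plain Tensor: treated as cached state,
        # ignored by parameters()
        self.last_hidden = torch.zeros(3)

m = MyModule()
print([name for name, _ in m.named_parameters()])  # ['weight']
```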
Consider, for example, what happens when you initialize an optimizer:
optim.SGD(model.parameters(), lr=1e-3)
The optimizer will update only the registered Parameters of the model.
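As a minimal illustration (nn.Linear is used here only because its weight and bias are already Parameters):

```python
import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(4, 2)                 # weight and bias are nn.Parameters
opt = optim.SGD(model.parameters(), lr=1e-3)

loss = model(torch.randn(8, 4)).sum()   # dummy scalar loss
loss.backward()
opt.step()                              # updates only the registered Parameters
```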
Variables still exist in PyTorch 0.4, but they are deprecated. From the docs:
The Variable API has been deprecated: Variables are no longer necessary to use autograd with tensors. Autograd automatically supports Tensors with requires_grad set to True.
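In other words, since 0.4 a plain Tensor created with requires_grad=True is enough:

```python
import torch

x = torch.ones(2, 2, requires_grad=True)  # no Variable wrapper needed
y = (x * 3).sum()
y.backward()
print(x.grad)  # tensor([[3., 3.], [3., 3.]])
```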
PyTorch pre-0.4
In PyTorch before version 0.4, one needed to wrap a Tensor in a torch.autograd.Variable in order to track the operations applied to it and perform differentiation. From the docs of Variable in 0.3:
Wraps a tensor and records the operations applied to it. Variable is a thin wrapper around a Tensor object, that also holds the gradient w.r.t. it, and a reference to a function that created it. This reference allows retracing the whole chain of operations that created the data. If the Variable has been created by the user, its grad_fn will be None and we call such objects leaf Variables. Since autograd only supports scalar valued function differentiation, grad size always matches the data size. Also, grad is normally only allocated for leaf variables, and will be always zero otherwise.
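For reference, pre-0.4 code looked roughly like the sketch below; on modern versions the Variable wrapper is a deprecated no-op, so this is shown only for historical context:

```python
# PyTorch <= 0.3 style (deprecated since 0.4)
import torch
from torch.autograd import Variable

x = Variable(torch.ones(2, 2), requires_grad=True)  # user-created leaf Variable
y = (x + 2).sum()
y.backward()
print(x.grad)     # gradient w.r.t. x, same shape as x.data
print(x.grad_fn)  # None, because x is a leaf Variable
```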
The difference with respect to Parameter was much the same. From the docs of Parameter in 0.3:
A kind of Variable that is to be considered a module parameter. Parameters are Variable subclasses, that have a very special property when used with Modules - when they’re assigned as Module attributes they are automatically added to the list of its parameters, and will appear e.g. in parameters() iterator. Assigning a Variable doesn’t have such effect. This is because one might want to cache some temporary state, like last hidden state of the RNN, in the model. If there was no such class as Parameter, these temporaries would get registered too.
Another difference is that parameters can’t be volatile and that they require gradient by default.
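This default is easy to check (the comparison below also holds for nn.Parameter in current versions):

```python
import torch
import torch.nn as nn

p = nn.Parameter(torch.zeros(3))
t = torch.zeros(3)
print(p.requires_grad)  # True  -- Parameters require gradients by default
print(t.requires_grad)  # False -- plain Tensors do not
```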