> Why not just call the 'module' a model, and call the layers 'layers'?
Recall how, in a data structures course, you define a binary tree, something like this:
```python
class Tree:
    def __init__(self, value, left, right):
        self.value = value
        self.left = left
        self.right = right
```
You can attach a subtree or a leaf to a tree to form a new tree, just like you can add a submodule to a module to form a new module. You don't want subtrees and trees to be two different data structures, and you don't want leaves and trees to be two different data structures, because after all they are all trees. In the same way, you want `Module` to represent both models and layers. Think of it as recursive; it is an API design choice to keep things clean.
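For example, here is a module that contains other modules, which themselves contain modules (the classes below are made up, just to show the nesting):

```python
import torch.nn as nn

class Block(nn.Module):              # a "layer"-ish module built from submodules
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(10, 10)
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(self.linear(x))

class Model(nn.Module):              # the "model" is just another module
    def __init__(self):
        super().__init__()
        self.block1 = Block()        # submodules nest like subtrees
        self.block2 = Block()

    def forward(self, x):
        return self.block2(self.block1(x))
```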
> What exactly is the definition of a 'Module' in PyTorch?
I like to think of a module as something that takes an input and produces an output, just like a function. That is what the `forward` method of the `Module` class does (it specifies what the function is), and you need to override the default `forward` method because otherwise PyTorch would not know what the function is:
```python
def forward(self, x):
    x = F.relu(self.conv1(x))
    x = F.max_pool2d(x, 2, 2)
    x = F.relu(self.conv2(x))
    x = F.max_pool2d(x, 2, 2)
    x = x.view(-1, 4*4*50)
    x = F.relu(self.fc1(x))
    x = self.fc2(x)
    return F.log_softmax(x, dim=1)
```
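For context, that `forward` comes from a small MNIST-style CNN; the matching `__init__` would look roughly like this (the layer sizes are my guess, chosen to be consistent with the `x.view(-1, 4*4*50)` line and 28x28 inputs):

```python
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 20, 5, 1)    # 28x28 -> 24x24, pooled to 12x12
        self.conv2 = nn.Conv2d(20, 50, 5, 1)   # 12x12 -> 8x8, pooled to 4x4
        self.fc1 = nn.Linear(4 * 4 * 50, 500)  # flatten 50 channels of 4x4
        self.fc2 = nn.Linear(500, 10)          # 10 classes
```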
Another example is `nn.Sequential`. It is also a module, but a special one: it takes a list of modules and chains their inputs and outputs together.
```python
nn.Sequential(a, b, c)  # a -> b -> c
```
That's why you do not need to specify a `forward` method for it: the forward is specified implicitly (just take the output of the previous module and feed it to the next module).
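For example, the following works without writing any `forward` at all (the layers here are ones I picked for illustration):

```python
import torch
import torch.nn as nn

# forward is implicit: each module's output feeds the next module
model = nn.Sequential(
    nn.Linear(10, 20),
    nn.ReLU(),
    nn.Linear(20, 5),
)

x = torch.randn(2, 10)
print(model(x).shape)   # torch.Size([2, 5])
```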
Another example is `nn.Conv2d`. It is also a module, and its `forward` method is already defined for you, so you don't need to specify it:
```python
class _ConvNd(Module):
    ...  # omitted

class Conv2d(_ConvNd):
    def __init__(self, in_channels, out_channels, kernel_size, stride=1,
                 padding=0, dilation=1, groups=1, bias=True,
                 padding_mode='zeros'):
        kernel_size = _pair(kernel_size)
        stride = _pair(stride)
        padding = _pair(padding)
        dilation = _pair(dilation)
        super(Conv2d, self).__init__(
            in_channels, out_channels, kernel_size, stride, padding,
            dilation, False, _pair(0), groups, bias, padding_mode)

    def conv2d_forward(self, input, weight):
        if self.padding_mode == 'circular':
            expanded_padding = ((self.padding[1] + 1) // 2, self.padding[1] // 2,
                                (self.padding[0] + 1) // 2, self.padding[0] // 2)
            return F.conv2d(F.pad(input, expanded_padding, mode='circular'),
                            weight, self.bias, self.stride,
                            _pair(0), self.dilation, self.groups)
        return F.conv2d(input, weight, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)

    def forward(self, input):
        return self.conv2d_forward(input, self.weight)
```
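Since `forward` is already there, you can call a `Conv2d` instance like a function right away (the sizes below are just for illustration):

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(1, 20, 5)               # a module whose forward is predefined
out = conv(torch.randn(1, 1, 28, 28))    # calling the module runs its forward
print(out.shape)                         # torch.Size([1, 20, 24, 24])
```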
Also, if anyone wonders how PyTorch builds a graph and does backpropagation, check this out. (Please don't take the code below too seriously, since I am not sure this is how PyTorch actually implements it, but take the idea with you; it may help you understand how PyTorch works.)
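Some silly code, then: a minimal sketch where every name (`Var`, `backward_fn`, ...) is made up for illustration, and PyTorch's real autograd lives in C++ and is far more sophisticated:

```python
class Var:
    def __init__(self, value, parents=(), backward_fn=None):
        self.value = value                # the forward result
        self.grad = 0.0                   # accumulated gradient
        self.parents = parents            # which Vars produced this one
        self.backward_fn = backward_fn    # how to push grad to the parents

    def __add__(self, other):
        out = Var(self.value + other.value, parents=(self, other))
        def backward_fn():
            # d(out)/d(self) = 1, d(out)/d(other) = 1
            self.grad += out.grad
            other.grad += out.grad
        out.backward_fn = backward_fn
        return out

    def __mul__(self, other):
        out = Var(self.value * other.value, parents=(self, other))
        def backward_fn():
            # d(out)/d(self) = other, d(out)/d(other) = self
            self.grad += other.value * out.grad
            other.grad += self.value * out.grad
        out.backward_fn = backward_fn
        return out

    def backward(self):
        # topologically order the recorded graph, then apply the chain rule
        order, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for p in v.parents:
                    visit(p)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            if v.backward_fn is not None:
                v.backward_fn()

x = Var(2.0)
w = Var(3.0)
y = x * w + x                       # the forward pass builds the graph
y.backward()                        # the backward pass walks it in reverse
print(y.value, x.grad, w.grad)      # 8.0 4.0 2.0
```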
Hope this helps :)
PS: I am new to deep learning and PyTorch, so this may well contain some mistakes; read carefully...