I'm getting a "can't optimize a non-leaf Tensor" error on this bit of code:

    self.W_ch1 = nn.Parameter(
        torch.rand(encoder_feature_dim, encoder_feature_dim), requires_grad=True
    ).to(self.device)
    self.W_ch1_optimizer = torch.optim.Adam([self.W_ch1], lr=encoder_lr)

I don't know why this is happening. self.W_ch1 should be a leaf tensor, because nothing feeds into it in the graph; it's just a torch.rand wrapped in an nn.Parameter. The error is thrown when self.W_ch1_optimizer is initialized.
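
Here is a minimal, self-contained version of the same construction, in case it helps to check the is_leaf flag on the tensor the optimizer receives (the dimension, learning rate, and device below are placeholders for the real values from my constructor):

    import torch
    import torch.nn as nn

    # Placeholders standing in for encoder_feature_dim, encoder_lr, and self.device
    encoder_feature_dim = 8
    encoder_lr = 1e-3
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Same construction as in my class
    W_ch1 = nn.Parameter(
        torch.rand(encoder_feature_dim, encoder_feature_dim), requires_grad=True
    ).to(device)

    # Inspect what the optimizer is actually given
    print(type(W_ch1), W_ch1.is_leaf, W_ch1.requires_grad)

    # This mirrors the line that raises the error in my actual code
    W_ch1_optimizer = torch.optim.Adam([W_ch1], lr=encoder_lr)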
