I am building a neural network and I have a problem with the gradient computation. In the forward pass I take the dot product of two tensors, u @ h, after normalizing them. It is important that no gradients are computed through h, so I call detach() on it. In addition, the normalization itself should not be taken into account when the gradients are computed (I do not know how to achieve this).
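Concretely, I think the effect I want could be expressed like the sketch below (this is only my assumption of how it might work, using a detached norm; it is not my actual forward pass and I am not sure it is the right approach):

import torch

v = torch.rand(5, requires_grad=True)

# normalize u, but treat the norm as a constant so the normalization
# itself does not contribute to the gradient
u = v / v.norm().detach()

# h must not receive any gradient at all
h = (v / v.norm()).detach()

(u @ h).backward()
print(v.grad)  # gradient flows only through u, scaled by the constant norm

My actual model and training loop are below.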
import torch
from torch import nn


class Nn(nn.Module):
    def __init__(self):
        super(Nn, self).__init__()
        self.ln = nn.Linear(5, 5)

    def forward(self, x):
        v = self.ln(x)
        u = v.clone()
        h = v.clone()
        u /= u.norm()      # in-place normalization of u
        h = h.detach()     # h must not receive gradients
        h /= h.norm()      # in-place normalization of h
        res = torch.stack([torch.stack([u @ h, u @ h])])
        return res


def patches_generator():
    while True:
        decoder = torch.rand((5,))
        target = torch.randint(2, (1,))
        yield decoder, target


net = Nn()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(net.parameters())
net.train()
torch.autograd.set_detect_anomaly(True)
for decoder, targets in patches_generator():
    optimizer.zero_grad()
    outputs = net(decoder)
    loss = criterion(outputs, targets)
    loss.backward()
    optimizer.step()

As a result, I get the following error:
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [9, 512, 1, 1]], which is output 0 of ReluBackward1, is at version 3; expected version 2 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
Where does ReluBackward1 come from? There is no ReLU layer in the code above.
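My assumption is that the name in the message refers to the operation that produced the tensor which was later modified in place, not to the operation whose backward failed. For example, a toy snippet like the one below (purely illustrative, not my real network; recent PyTorch versions print ReluBackward0 instead of ReluBackward1) produces a very similar message:

import torch

torch.autograd.set_detect_anomaly(True)

x = torch.rand(5, requires_grad=True)
y = torch.relu(x)      # ReLU saves its output y for the backward pass
y /= y.norm().item()   # in-place normalization bumps y's version counter
                       # (.item() just keeps this toy example simple)
y.sum().backward()     # ReLU's backward finds y modified -> RuntimeError

If that reading is correct, I still do not understand which ReLU output is being modified in place in my model.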