
I'm getting "can't optimize a non-leaf Tensor" on this bit of code:

self.W_ch1 = nn.Parameter(
    torch.rand(encoder_feature_dim, encoder_feature_dim), requires_grad=True
).to(self.device)
self.W_ch1_optimizer = torch.optim.Adam([self.W_ch1], lr=encoder_lr)

I don't know why this is happening; that should be a leaf tensor, because it has no children connected to it. It's just a torch.rand wrapped in an nn.Parameter. The error is thrown at the initialization of self.W_ch1_optimizer.

[screenshot of the traceback, raised at the optimizer initialization]

  • Can you show the code where you are performing inference and backpropagation? Commented Jun 19, 2022 at 22:55
  • @Ivan actually it throws that error at the optimizer initialization. Not sure why. Commented Jun 19, 2022 at 23:00
  • The below code works for me, so I don't think there is a problem with this snippet: ``` W_ch1 = nn.Parameter(torch.rand(10, 10), requires_grad=True) W_ch1_optimizer = torch.optim.Adam([W_ch1], lr=1e-3) ``` Commented Jun 19, 2022 at 23:43
  • @UmangGupta that's so weird. I added a screenshot to show that's exactly where it's breaking. Commented Jun 20, 2022 at 0:26
  • There may be something else in your code causing this to break. Can you run the two-line snippet from my previous comment and check whether you get the error? Also, which torch version are you using? Commented Jun 20, 2022 at 0:32

1 Answer


The reason it throws an error is that torch.Tensor.cuda (and likewise Tensor.to) returns a copy of the tensor and, since the parameter requires grad, registers that copy as a new node in the autograd graph. In other words, your parameter W_ch1 is no longer a leaf node, because you already have this "computation" tree:

nn.Parameter -> cuda:parameter = W_ch1 

You can compare the following two results:

>>> p = nn.Parameter(torch.rand(1))
>>> p.is_leaf
True

>>> p = nn.Parameter(torch.rand(1)).cuda()
>>> p.is_leaf
False
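The device transfer is what registers the extra node: the non-leaf copy carries a grad_fn (its exact name varies across PyTorch versions), which is what the optimizer refuses to accept.

>>> p = nn.Parameter(torch.rand(1)).cuda()
>>> p.grad_fn is not None  # a backward node was recorded by the transfer
True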

What you need to do is first instantiate your modules and define your optimizer(s); only then can you transfer them to the desired device, not before:

>>> p = nn.Parameter(torch.rand(1))
>>> optimizer = optim.Adam([p], lr=lr)

Then you can transfer the parameter's data in place (note that torch.optim optimizers have no .cuda() method; their state is created lazily on the parameter's device at the first step):

>>> p.data = p.data.cuda()
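Applied to the snippet from the question, a minimal sketch (reusing encoder_feature_dim, encoder_lr and self.device from the question) is to build the parameter directly on the target device, so it stays a leaf, and only then construct the optimizer:

# Create the parameter directly on the device so it remains a leaf tensor,
# then hand it to the optimizer (nn.Parameter already defaults to requires_grad=True).
self.W_ch1 = nn.Parameter(
    torch.rand(encoder_feature_dim, encoder_feature_dim, device=self.device)
)
self.W_ch1_optimizer = torch.optim.Adam([self.W_ch1], lr=encoder_lr)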