
I am currently making my first attempts with PyTorch. I am trying to solve a simple equation with a neural net. Solved analytically, the output of my neural net should look like this: $$ y = \frac{x_5}{x_2} - \frac{x_1-x_2}{2 x_3 x_4}\frac{x_2}{x_1} $$ while I randomly generate input data within the following bounds: $$500 \leq x_1 \leq 1000 \\ 1 \leq x_2 \leq x_1 \\ 1 \leq x_5 \leq 10000 \\ 10^{-6} \leq x_4 \leq 10^{-3} \\ 1000 \leq x_3 \leq 50000 $$ The neural network does not seem to be able to learn this function. I assume this is due to the large spread of the input ranges. I have already tried various network architectures and different activation functions. I have also tried a learning-rate scheduler and varied the learning rate from 1e-2 to 1e-8. I have also tried combining the input variables $x_3 x_4$ into a single variable.
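
For reference, this is the target evaluated in plain Python for one random sample within these bounds (a minimal sanity-check sketch; the helper name analytic_y is only illustrative):

import random

def analytic_y(x_1, x_2, x_3, x_4, x_5):
    # y = x5/x2 - (x1 - x2)/(2*x3*x4) * x2/x1, exactly the equation above
    return x_5 / x_2 - (x_1 - x_2) / (2 * x_3 * x_4) * x_2 / x_1

# draw one sample within the stated bounds
x_1 = random.uniform(500, 1000)
x_2 = random.uniform(1, x_1)
x_5 = random.uniform(1, 10000)
x_4 = random.uniform(1e-6, 1e-3)
x_3 = random.uniform(1000, 50000)
print(analytic_y(x_1, x_2, x_3, x_4, x_5))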

Since I would like to add further equations to this neural network later on, I implemented the following normalized loss function:

F.l1_loss((y_est+(x_1-x_2)/(2*x_3*x_4)*x_2/x_1 - x_5/x_2)/(x_5/x_2),torch.FloatTensor(np.zeros(batch_size))) 
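
Since the plan is to add more equations later, the same normalized residual can also be wrapped in a small helper, so each equation contributes its own residual term (a sketch only; equation_residual and equation_loss are hypothetical names):

import torch
import torch.nn.functional as F

def equation_residual(y_est, x_1, x_2, x_3, x_4, x_5):
    # relative residual of the equation; zero when y_est equals the analytic solution
    return (y_est + (x_1 - x_2) / (2 * x_3 * x_4) * x_2 / x_1 - x_5 / x_2) / (x_5 / x_2)

def equation_loss(y_est, x_1, x_2, x_3, x_4, x_5):
    # mean absolute relative residual over the batch
    res = equation_residual(y_est, x_1, x_2, x_3, x_4, x_5)
    return F.l1_loss(res, torch.zeros_like(res))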

I have mainly worked with networks of various sizes and depths here, but could not get good results (losses mostly above 0.5, occasionally bouncing up to around 4).

Thanks in advance.

Edit:

import torch.optim as optim
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import random
from torch.optim.lr_scheduler import ReduceLROnPlateau

def get_batch(batch_size=32):
    batch_x = []
    for i in range(batch_size):
        x_1 = random.uniform(500, 1000)
        x_2 = random.uniform(1, x_1 - 1)
        x_3 = random.uniform(1, 10000)
        x_4 = random.uniform(1e-6, 0.001)
        x_5 = np.random.randint(1000, 50000)
        batch_x.append([x_1, x_2, x_3, x_4, x_5])
    return torch.FloatTensor(batch_x).cuda().contiguous()

class CustomNet(nn.Module):
    def __init__(self, n_input, n_output, n_hidden_neurons, n_hidden_layers):
        super(CustomNet, self).__init__()
        self.sequential_layers = nn.Sequential(
            *((nn.Linear(n_input, n_hidden_neurons),)
              + tuple(nn.Linear(n_hidden_neurons, n_hidden_neurons) for i in range(n_hidden_layers)))
        )
        self.fc1 = nn.Linear(n_hidden_neurons, n_output)

    def forward(self, x):
        x = torch.log(x)
        output = self.sequential_layers(x)
        output = torch.exp(output)
        output = self.fc1(output)
        return output

batch_size = 1000
n_epochs = 50000
n_input, n_output, n_hidden_neurons, n_hidden_layers = 5, 1, 16, 1

# create model
nnet = CustomNet(n_input, n_output, n_hidden_neurons, n_hidden_layers)
nnet = nnet.to("cuda")

lr, counter = 0.01, 0
optimizer = optim.Adam(nnet.parameters(), lr=lr)
target = torch.FloatTensor(np.zeros(batch_size)).contiguous().cuda()
lr_scheduler = ReduceLROnPlateau(optimizer, mode='min', factor=0.1, patience=1000, verbose=True)
target = torch.FloatTensor(np.zeros(batch_size)).cuda().contiguous()

for i in range(n_epochs):
    # Reset gradients
    nnet.zero_grad()

    batch_x = get_batch(batch_size)
    x_1, x_2, x_3, x_4, x_5 = batch_x.T
    y_est = nnet(batch_x)
    y_est = y_est[:, 0]

    output = F.l1_loss((y_est + (x_1 - x_2) / (2 * x_3 * x_4) * x_2 / x_1 - x_5 / x_2) / (x_5 / x_2), target)

    # Backward pass
    loss = output.item()
    output.backward()
    optimizer.step()
    lr_scheduler.step(output)
  • Have you tried first overfitting on just some training samples? Additionally, it would help if you could provide a full code example (i.e. generating the training data as well as creating and training the model) as this would help us better diagnose any possible issues. Commented Nov 17, 2023 at 18:33
  • I've updated the question with a code example. Thanks for your help. Commented Nov 17, 2023 at 19:03

1 Answer


A couple of suggestions:

  • Add nonlinearity between the hidden layers and increase the number of hidden layers: currently self.sequential_layers is just a stack of linear transformations. Adding a ReLU between consecutive hidden layers should help the network learn faster (see the sketch after this list);
  • I am not sure torch.log and torch.exp are a good idea. You may remove the torch.log and replace the torch.exp with your choice of nonlinearity between the hidden layers;
  • Use smaller batch sizes, say 64 or 128.
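
For the first two points, a minimal sketch of how CustomNet could look with ReLU between the linear layers and without the torch.log/torch.exp wrapping (ReLU is just one possible choice of nonlinearity):

import torch.nn as nn

class CustomNet(nn.Module):
    def __init__(self, n_input, n_output, n_hidden_neurons, n_hidden_layers):
        super().__init__()
        layers = [nn.Linear(n_input, n_hidden_neurons), nn.ReLU()]
        for _ in range(n_hidden_layers):
            layers += [nn.Linear(n_hidden_neurons, n_hidden_neurons), nn.ReLU()]
        self.sequential_layers = nn.Sequential(*layers)
        self.fc1 = nn.Linear(n_hidden_neurons, n_output)

    def forward(self, x):
        # the nonlinearity now sits between the linear layers instead of torch.exp on the output
        return self.fc1(self.sequential_layers(x))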