I am currently making my first attempts with PyTorch. I am trying to solve a simple equation with a neural net. Solved analytically, the output of my neural net should look like this: $$ y = \frac{x_5}{x_2} - \frac{x_1-x_2}{2 x_3 x_4}\frac{x_2}{x_1} $$ while I am randomly generating the input data within the following bounds: $$500 \leq x_1 \leq 1000 \\ 1 \leq x_2 \leq x_1 \\ 1 \leq x_5 \leq 10000 \\ 10^{-6} \leq x_4 \leq 10^{-3} \\ 1000 \leq x_3 \leq 50000 $$

The neural network does not seem to be able to learn the function. I assume this is due to the large ranges spanned by the input variables. I have already tried various network architectures and different activation functions. I have also tried a learning-rate scheduler and varied the learning rate from 1e-2 to 1e-8, and I have tried combining $x_3 x_4$ into a single input variable.
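For illustration, the analytic target can be evaluated directly on inputs sampled from the ranges above (a plain NumPy sketch, separate from my actual training code below):

    import numpy as np

    # Illustration only: sample inputs from the ranges above and evaluate the
    # analytic target y = x5/x2 - (x1 - x2)/(2*x3*x4) * x2/x1
    rng = np.random.default_rng(0)
    n = 5
    x1 = rng.uniform(500, 1000, n)
    x2 = rng.uniform(1, x1)              # 1 <= x2 <= x1 (elementwise upper bound)
    x3 = rng.uniform(1000, 50000, n)
    x4 = rng.uniform(1e-6, 1e-3, n)
    x5 = rng.uniform(1, 10000, n)

    y = x5 / x2 - (x1 - x2) / (2 * x3 * x4) * x2 / x1
    print(y)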
Since I would like to add further equations for the network to solve later on, I implemented the following normalized loss function:
    F.l1_loss((y_est + (x_1-x_2)/(2*x_3*x_4)*x_2/x_1 - x_5/x_2)/(x_5/x_2), torch.FloatTensor(np.zeros(batch_size)))

I have mainly worked with networks of various widths and depths, but couldn't get good results (the loss mostly stays above 0.5 and occasionally bounces up to around 4).
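Written out, with $y$ the analytic solution from above and the default mean reduction of `F.l1_loss`, this loss is simply the batch mean of the network error relative to the first term $x_5/x_2$: $$ \text{loss} = \frac{1}{N}\sum_{i=1}^{N} \left| \frac{y_{\text{est},i} - y_i}{x_{5,i}/x_{2,i}} \right| $$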
Thanks in advance.
Edit: Here is the full code I am currently using:
    import torch.optim as optim
    import numpy as np
    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    import random
    from torch.optim.lr_scheduler import ReduceLROnPlateau

    def get_batch(batch_size=32):
        # randomly sample a batch of input vectors [x_1, ..., x_5]
        batch_x = []
        for i in range(batch_size):
            x_1 = random.uniform(500, 1000)
            x_2 = random.uniform(1, x_1 - 1)
            x_3 = random.uniform(1, 10000)
            x_4 = random.uniform(1e-6, 0.001)
            x_5 = np.random.randint(1000, 50000)
            batch_x.append([x_1, x_2, x_3, x_4, x_5])
        return torch.FloatTensor(batch_x).cuda().contiguous()

    class CustomNet(nn.Module):
        def __init__(self, n_input, n_output, n_hidden_neurons, n_hidden_layers):
            super(CustomNet, self).__init__()
            # stack of linear layers (no activations in between here)
            self.sequential_layers = nn.Sequential(
                *((nn.Linear(n_input, n_hidden_neurons),) +
                  tuple(nn.Linear(n_hidden_neurons, n_hidden_neurons)
                        for i in range(n_hidden_layers)))
            )
            self.fc1 = nn.Linear(n_hidden_neurons, n_output)

        def forward(self, x):
            # work in log space, map back with exp before the output layer
            x = torch.log(x)
            output = self.sequential_layers(x)
            output = torch.exp(output)
            output = self.fc1(output)
            return output

    batch_size = 1000
    n_epochs = 50000
    n_input, n_output, n_hidden_neurons, n_hidden_layers = 5, 1, 16, 1

    # create model
    nnet = CustomNet(n_input, n_output, n_hidden_neurons, n_hidden_layers)
    nnet = nnet.to("cuda")

    lr, counter = 0.01, 0
    optimizer = optim.Adam(nnet.parameters(), lr=lr)
    lr_scheduler = ReduceLROnPlateau(optimizer, mode='min', factor=0.1, patience=1000, verbose=True)
    target = torch.FloatTensor(np.zeros(batch_size)).cuda().contiguous()

    for i in range(n_epochs):
        # Reset gradients
        nnet.zero_grad()

        batch_x = get_batch(batch_size)
        x_1, x_2, x_3, x_4, x_5 = batch_x.T
        y_est = nnet(batch_x)
        y_est = y_est[:, 0]

        # normalized L1 loss against the analytic solution
        output = F.l1_loss((y_est + (x_1-x_2)/(2*x_3*x_4)*x_2/x_1 - x_5/x_2)/(x_5/x_2), target)

        # Backward pass
        loss = output.item()
        output.backward()
        optimizer.step()
        lr_scheduler.step(output)
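(Not part of the script above, just a rough sketch of how I would sanity-check the trained net against the analytic solution on a fresh batch, reusing `get_batch` and `nnet`:)

    # Rough sanity check (assumes the training loop above has already run):
    # compare the network output to the analytic y on a fresh batch.
    with torch.no_grad():
        batch_x = get_batch(batch_size)
        x_1, x_2, x_3, x_4, x_5 = batch_x.T
        y_true = x_5 / x_2 - (x_1 - x_2) / (2 * x_3 * x_4) * x_2 / x_1
        y_pred = nnet(batch_x)[:, 0]
        rel_err = torch.abs((y_pred - y_true) / (x_5 / x_2))
        print(rel_err.mean().item(), rel_err.max().item())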