I have the following setup:
- 2 input neurons (I1, I2)
- 2 output neurons (O1, O2)
- 1 hidden layer with 3 neurons (H1, H2, H3)
- loss function = mse
- optimizer = Adam
- the values of I1 range from 0 to 100
- the values of I2 range from 0 to 500
- batch size = 16
- learning rate = 0.1
The ANN should learn the following rules (regression problem):
- If I1 increases, O1 decreases
- If I1 increases, O2 increases
- If I2 increases, O1 stays constant
- If I2 increases, O2 decreases
I am using the following model:
```python
import torch.nn as nn
import torch.nn.functional as F

class DQN(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(in_features=2, out_features=3)
        self.out = nn.Linear(in_features=3, out_features=2)

    def forward(self, t):
        t = F.relu(self.fc1(t))
        t = self.out(t)
        return t
```

However, it does not learn. My question is: which part should I focus on? Is a fully connected network with one hidden layer perhaps not suitable for these rules (since the relationship is not plain linear regression)? Do I need more hidden layers or more neurons? Is the learning rate (0.1) a problem? Should the input data be normalized?
I tinkered around a lot but didn't see any improvement.
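On the normalization question: since I1 and I2 live on very different scales (0-100 vs. 0-500), one common first step is min-max scaling both inputs into [0, 1] before feeding them to the network. A minimal framework-free sketch (the `scale` helper and the sample values are illustrative, not from the original setup):

```python
def scale(value, low, high):
    """Min-max scale a raw input from [low, high] into [0, 1]."""
    return (value - low) / (high - low)

# Ranges taken from the setup above: I1 in [0, 100], I2 in [0, 500]
i1 = scale(50.0, 0.0, 100.0)   # -> 0.5
i2 = scale(50.0, 0.0, 500.0)   # -> 0.1
```

With both inputs on the same scale, a single learning rate is less likely to overshoot on one feature while barely moving on the other.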