
I am new to neural networks. I was trying to write a simple 4-0-2 MLP and learn the back-propagation algorithm in practice. But my back-propagation always diverges and the output is always [1,1]. I searched for the possible cause, but neither setting the learning rate to a quite small number (0.001) nor changing the sign of the delta weight solved the problem.

Code for back-propagation algorithm:

def backward(self, trainingSamples):
    for i in range(len(trainingSamples)):
        curr_sample = trainingSamples[i]
        self.input = curr_sample[0]
        self.forward()
        print("output is " + str(self.output))
        curr_des_out = curr_sample[1]
        for i in range(len(self.outputs)):
            error = curr_des_out[i] - self.outputs[i].output
            der_act = self.outputs[i].activate(deriv=True)
            local_gradient = der_act * error
            for j in range(len(self.input)):
                self.weights[j][i] -= self.learning_rate * local_gradient * self.input[j]

and trainingSamples is a tuple of (input, desired output) pairs of arrays: ( ([1,1,1,1], [1,0]), ([0,0,0,0], [0,1]), ([1,0,0,0], [0,1]), ([1,0,1,0], [1,0]) )
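
For reference, here is a minimal, self-contained sketch of the conventional delta rule for a single sigmoid output unit (plain Python, no classes; the function and variable names are illustrative and only mirror the snippets above, not the full implementation). With error defined as target minus output, plain gradient descent on the squared error adds the update term:

import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def delta_rule_step(weights, x, target, learning_rate=0.1):
    # weights: list of weights for one output unit, x: list of inputs,
    # target: desired output for this sample.
    net = sum(w_j * x_j for w_j, x_j in zip(weights, x))  # fresh net input each call
    out = sigmoid(net)
    error = target - out                        # desired minus actual
    local_gradient = error * out * (1.0 - out)  # sigmoid'(net) = out * (1 - out)
    for j in range(len(weights)):
        # add (not subtract) when error = target - out
        weights[j] += learning_rate * local_gradient * x[j]
    return out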

Here is the forward pass code:

def forward(self):
    for i in range(len(self.outputs)):
        for j in range(len(self.input)):
            self.outputs[i].input += self.input[j] * self.weights[j][i]
        self.outputs[i].activate()
        self.output[i] = self.outputs[i].output
    return self.output
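
For comparison, the same forward pass can be written as a plain function that computes the net input from scratch on every call (a sketch only; sigmoid and the weights[j][i] layout are assumptions that mirror the code above):

import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, weights):
    # x: list of 4 inputs; weights: 4x2 nested list, where weights[j][i]
    # connects input j to output unit i.
    outputs = []
    for i in range(len(weights[0])):
        net = sum(x[j] * weights[j][i] for j in range(len(x)))  # local sum, starts at 0 each call
        outputs.append(sigmoid(net))
    return outputs

# e.g. forward([1, 1, 1, 1], [[0.1, -0.2], [0.3, 0.05], [-0.1, 0.2], [0.07, -0.3]])
# (arbitrary example weights)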
  • Can you show how you calculate the output? Perhaps there is something wrong with the forward pass (as well)? (I don't want to offend you, but the more we can exclude, the better, I think.) Commented Jan 24, 2017 at 19:24
  • @david_l: perhaps you'd better edit it into your question. Commented Jan 24, 2017 at 19:46
  • @WillemVanOnsem, done. Commented Jan 24, 2017 at 19:48
  • Are you sure you "reset" (set to 0) your outputs[i].input before you call? Because here you keep adding up. Commented Jan 24, 2017 at 19:53
  • @WillemVanOnsem, outputs is an array of output neurons, and their input is equal to the network input, 'cos there is no hidden layer, so it is changing with every call. Commented Jan 24, 2017 at 19:58

1 Answer


Although I cannot see the full implementation of your code (things like .activate(), etc.), I think I have an idea of how you have implemented them. Assuming you have implemented them correctly, I see one problem with your code that will clearly cause divergence.

The problem - or at least one of the problems - seems to be that you do not reset the input (dendrite) of your neurons:

def forward(self):
    for i in range(len(self.outputs)):
        self.outputs[i].input = 0
        for j in range(len(self.input)):
            self.outputs[i].input += self.input[j] * self.weights[j][i]
        self.outputs[i].activate()
        self.output[i] = self.outputs[i].output
    return self.output

Because you keep incrementing the input, I suspect that indeed you eventually end up with output [1,1] since the sigmoid function goes to 1 as its input goes to infinity.
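
To see the effect in isolation, the sketch below (illustrative numbers only, not the asker's actual weights) compares a net input that is recomputed from scratch on every call with one that keeps accumulating across calls; the accumulated version quickly drives the sigmoid towards 1:

import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

x = [1, 0, 1, 0]
w = [0.2, -0.1, 0.3, 0.4]   # arbitrary example weights

accumulated_net = 0.0
for call in range(1, 6):
    fresh_net = sum(x_j * w_j for x_j, w_j in zip(x, w))  # recomputed (reset) every call
    accumulated_net += fresh_net                          # never reset, keeps growing
    print(call, round(sigmoid(fresh_net), 3), round(sigmoid(accumulated_net), 3))

# The first column stays at about 0.622, while the second climbs towards 1,
# which matches the [1,1] output reported in the question.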


3 Comments

I use sigmoid as the activation function.
@david_l: yes, but I assume the sigmoid does not clear the input of the session?
If w and x have different shapes (as in a conv layer or some other layer type), then dw and dx have different shapes as well, so how can we element-wise multiply the top layer's dx by the previous layer's dw to obtain the change in w for that previous layer?
