Skip to main content

You are not logged in. Your edit will be placed in a queue until it is peer reviewed.

We welcome edits that make the post easier to understand and more valuable for readers. Because community members review edits, please try to make the post substantially better than how you found it, for example, by fixing grammar or adding additional resources and hyperlinks.

Required fields*

3
  • Thanks for the answer. So how do I figure out the step size? Is there a rule of thumb which I can use? Commented Feb 9, 2020 at 4:48
  • There is a whole lot of literature on this and many techniques you can use, here are a few: onmyphd.com/?p=gradient.descent Commented Feb 9, 2020 at 13:36
  • A very simple one is to take the step size as an hyperparameters to tune and use a good range of them to train model. Then you validate which model learned the best using cross validation. Commented Feb 9, 2020 at 13:38