3
$\begingroup$

Neural networks are usually trained with first-order gradient methods and their variants, such as batch gradient descent, stochastic gradient descent, momentum-based gradient descent, and so on. However, those are, at least superficially, very simple first-order approximations, and I don't know how well they perform in very deep ANN architectures or more complex models. In those cases, are first-order gradient methods still used, or are there more efficient optimization methods for neural nets?
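
For concreteness, by first-order methods I mean updates of roughly this form (plain SGD on the left, classical momentum on the right; written from memory, so the exact conventions may differ between references):

$$\theta_{t+1} = \theta_t - \eta\, \nabla_\theta L(\theta_t),
\qquad
v_{t+1} = \mu\, v_t - \eta\, \nabla_\theta L(\theta_t),\;\;
\theta_{t+1} = \theta_t + v_{t+1}$$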

$\endgroup$
1
  • 5
    $\begingroup$ SGD and variants are still the most widely used $\endgroup$ Commented Aug 12, 2020 at 23:21

2 Answers

2
$\begingroup$

There are endless optimization algorithms being published. They all fundamentally do the same thing: they try to go beyond plain gradients, spending more compute on each step in exchange for fewer steps, and they claim to outperform the alternatives.

In reality, essentially all of these papers fail to show a genuine improvement. They tune the hyperparameters properly for their new method, use defaults for the competitors, and then show superior performance.

If you look at what people are actually using, you see overwhelmingly the same old Adam variants. So I believe that what is actually being used to train SOTA models is a better measurement than the various papers which supposedly compared methods in a controlled fashion.
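
For reference, this is roughly what those Adam variants boil down to. A minimal NumPy sketch of the standard Adam update (function and argument names are mine; real implementations such as `torch.optim.Adam` also handle weight decay, parameter groups, etc.):

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single parameter array (illustrative sketch)."""
    m = beta1 * m + (1 - beta1) * grad        # EMA of gradients (first moment)
    v = beta2 * v + (1 - beta2) * grad ** 2   # EMA of squared gradients (second moment)
    m_hat = m / (1 - beta1 ** t)              # bias correction for the warm-up phase
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v
```

Everything beyond plain SGD here is just the two running moment estimates and the bias correction, which is why the per-step overhead stays small.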

$\endgroup$
1
  • 2
    $\begingroup$ Most papers that use default hyper-parameter settings (without explicit justification) should be a desk-reject. I co-wrote a paper on this issue some time ago, but the reviewers didn't like it ( arxiv.org/pdf/1703.06777 ). I suspect this issue is worse for DNNs and large data settings because training the model (and therefore hyper-parameter selection and unbiased performance evaluation) becomes very expensive. $\endgroup$ Commented Nov 22 at 10:25
0
$\begingroup$

The state of the art still relies almost exclusively on Adam or RMSProp.

A significant recent SGD variant that has come up is Adafactor, which is similar to Adam but uses much less memory.
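
The memory saving comes from factoring the second-moment estimate: instead of Adam's full matrix of squared-gradient averages, Adafactor (Shazeer & Stern, 2018) keeps only per-row and per-column statistics. A rough NumPy sketch of just that idea (names are mine; the real algorithm also includes update clipping and relative step sizes):

```python
import numpy as np

def factored_second_moment(grad, row_ema, col_ema, beta2=0.999, eps=1e-30):
    """Adafactor-style factored second moment for an (n, m) weight matrix.

    Stores O(n + m) statistics instead of Adam's O(n * m) matrix.
    """
    sq = grad ** 2 + eps
    row_ema = beta2 * row_ema + (1 - beta2) * sq.mean(axis=1)  # shape (n,)
    col_ema = beta2 * col_ema + (1 - beta2) * sq.mean(axis=0)  # shape (m,)
    # Rank-1 reconstruction of the full second-moment matrix.
    v_hat = np.outer(row_ema, col_ema) / row_ema.mean()
    return v_hat, row_ema, col_ema
```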

$\endgroup$
