To achieve state of the art, or even merely good, results, you have to have to have set up all of the parts configured to work well together. Setting up a neural network configuration that actually learns is a lot like picking a lock: all of the pieces have to be lined up just right. Just as it is not sufficient to have a single tumbler in the right place, neither is it sufficient to have only the architecture, or only the optimizer, set up correctly.