I am reading Kingma and Ba's paper introducing the Adam optimizer, and I was looking over the derivation for the second moment estimate:
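In case the image does not load, here is my transcription of the derivation, as best I can reproduce it from the paper's section on initialization bias correction. Here $g_i$ is the gradient at step $i$, $\beta_2$ is the decay rate of the second moment estimate $v_t$, and $\zeta$ is the paper's error term:

$$
\begin{aligned}
\mathbb{E}[v_t] &= \mathbb{E}\left[(1-\beta_2)\sum_{i=1}^{t}\beta_2^{t-i}\, g_i^2\right] \\
&= \mathbb{E}[g_t^2]\,(1-\beta_2)\sum_{i=1}^{t}\beta_2^{t-i} + \zeta \\
&= \mathbb{E}[g_t^2]\,(1-\beta_2^{t}) + \zeta
\end{aligned}
$$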
I noticed that, in going from the second to the third equation in the image, they sum a finite geometric series. The formula for that sum is:
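Written out in its usual textbook form, for a finite geometric series with first term $a$, common ratio $r \neq 1$, and $n$ terms (these symbols are my own notation, not the paper's, and the original image may have used a slightly different but equivalent form):

$$
\sum_{k=0}^{n-1} a\,r^{k} = a\,\frac{1-r^{n}}{1-r}
$$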
But they don't seem to multiply by the first term of the series. What am I missing? If this is some kind of approximation, why is it allowed/favorable?

