Questions tagged [gradient-descent]
"Gradient descent is a first-order optimization algorithm. To find a local minimum of a function using gradient descent, one takes steps proportional to the negative of the gradient (or of the approximate gradient) of the function at the current point."
1,048 questions
0 votes
0 answers
59 views
SGD convergence when visit basin of attraction infinitely often
Consider a discrete stochastic system with components $(x_k, y_k)$ updated as follows. If all components are strictly positive, i.e. $x_k > 0$, $y_k > 0$, then \begin{aligned} x_{k+1} &= x_k ...
2 votes
0 answers
372 views
Length bound for a ray lift-off from a critical point of a polynomial
Question. Would anyone know of some mathematical tool(s) which can be used to tackle the following question? Given the $N$ distinct connected branches of $P(z)=\prod_{k=1}^N (z-\alpha_k)$ with $|\...
2 votes
1 answer
143 views
Deriving adjoint equation of Poisson equation using Lagrange multiplier
Suppose we consider the optimisation problem as follows: \begin{align} \min_{\sigma} &\quad J(\sigma)=\frac{1}{2}\sum_{k=1}^{m}(\phi_k(x_{r,k})-V_{k})^2\\ \text{subject to }&-\nabla\...
0 votes
0 answers
53 views
How to perform gradient descent when there is large variation in the magnitude of the gradient in different directions near the minimum?
Suppose we wish to minimize a function $f(\vec{x})$ via the gradient descent algorithm \begin{equation} \vec{x}_{n+1} = \vec{x}_n - \eta \vec{\nabla}f(\vec{x}_n) \end{equation} starting from some ...
0 votes
0 answers
37 views
Numeric derivative computation in a recursive one-step predictor way
I have been thinking about recursive system identification lately (I've developed some quasi-recursive identification algorithms myself in the past) and I have a question that has arised. Consider we ...
0 votes
0 answers
23 views
$q$-linear convergence of Preconditioning Gradient Descent
Let $f: \mathbb{R}^n \rightarrow \mathbb{R}$ be a $C^2$ function and assume that it’s $\mu$-strongly convex. Let $P$ be a preconditioning matrix. Let $x^{*}$ be the unique minimizer of $\min_{x} f(x)$ ...
0 votes
0 answers
70 views
Looking for a gradient in Batch Normalization
I'm studying Batch Normalization inside a neural network where the output is $$ y_i = \gamma \hat{x}_i + \beta, $$ with $$ \hat{x}_i = \frac{x_i - \mu}{\sqrt{\sigma}}, $$ and $$ \mu = \frac{1}{m} \...
0 votes
1 answer
60 views
Lipschitz Constant for Constrained Least Squares Sigmoid Regression
I am working on a constrained least squares sigmoid regression problem and would like to determine the Lipschitz constant of the objective function. The optimization problem is: \begin{equation} \...
0 votes
0 answers
27 views
Convergence Analysis of Preconditioned GD
Problem: Let $f: \mathbb{R}^n \rightarrow \mathbb{R}$ be a $C^2$ function. And suppose that for any $x,y \in \mathbb{R}^n$, $\|\nabla f(x) - \nabla f(y)\| \leq \|x-y\|$. We let $P$ be a ...
0 votes
0 answers
19 views
Implementing Temporal Differencing with Binary Cross Entropy
I'm working on a chess engine which I am building from scratch (from NumPy). I used deep learning with BCE and games generated by a basic "engine" which gives a score to each move (checks, ...
1 vote
1 answer
64 views
Steepest descent along a direction not normal to level curve?
I undestand using equations why the steepest descent is along the gradient vector. But I have this diagram of level surfaces I jotted down: So, in this diagram, the distance to the surface $y=c+1$ is ...
0 votes
1 answer
65 views
State of the art constrained optimization methods
Say I want to maximize a concave, non-differentiable function $f(\mathbf{x})$ subject to $\mathbf x \in C$ where $C$ is a convex set. We assume that the projecting to $C$ is easy to compute and the ...
1 vote
1 answer
63 views
Convergence of gradient under Armijo condition
Let $f: \mathbb{R}^n \rightarrow \mathbb{R}$ be a $C^2$ function. We consider gradient descent. We have $x_{k+1} = x_{k} + \alpha_{k}p_{k}$, where $p_{k}$ is the descent direction. We have the ...
4 votes
3 answers
343 views
What is the best way to traverse 3D space in discrete steps to find the absolute maximum of a function in a bounded region?
I'm an engineer and I've recently moved from a Hardware Design position to one in Software, which has meant some new pure Computer Science problems have landed on my desk, and a few that are purely ...
0 votes
0 answers
39 views
Minimizer and First Order Properties for Numerical Optimization
I need to establish the following properties of $f_t$. For reference, I am taking a Numerical Optimization class and this is one of the Exercises. There are three things we need to establish. Define ...