Questions tagged [gradient-descent]

Question 1

Consider a discrete stochastic system with components $(x_k, y_k)$ updated as follows. If all components are strictly positive, i.e. $x_k > 0$, $y_k > 0$, then \begin{aligned} x_{k+1} &= x_k ...

Question 2

Question. Would anyone know of some mathematical tool(s) which can be used to tackle the following question? Given the $N$ distinct connected branches of $P(z)=\prod_{k=1}^N (z-\alpha_k)$ with $|\...

Question 3

Suppose we consider the optimisation problem as follows: \begin{align} \min_{\sigma} &\quad J(\sigma)=\frac{1}{2}\sum_{k=1}^{m}(\phi_k(x_{r,k})-V_{k})^2\\ \text{subject to }&-\nabla\...

Question 4

Suppose we wish to minimize a function $f(\vec{x})$ via the gradient descent algorithm \begin{equation} \vec{x}_{n+1} = \vec{x}_n - \eta \vec{\nabla}f(\vec{x}_n) \end{equation} starting from some ...

Question 5

I have been thinking about recursive system identification lately (I've developed some quasi-recursive identification algorithms myself in the past) and I have a question that has arised. Consider we ...

Question 6

Let $f: \mathbb{R}^n \rightarrow \mathbb{R}$ be a $C^2$ function and assume that it’s $\mu$-strongly convex. Let $P$ be a preconditioning matrix. Let $x^{*}$ be the unique minimizer of $\min_{x} f(x)$ ...

Question 7

I'm studying Batch Normalization inside a neural network where the output is $$ y_i = \gamma \hat{x}_i + \beta, $$ with $$ \hat{x}_i = \frac{x_i - \mu}{\sqrt{\sigma}}, $$ and $$ \mu = \frac{1}{m} \...

Question 8

I am working on a constrained least squares sigmoid regression problem and would like to determine the Lipschitz constant of the objective function. The optimization problem is: \begin{equation} \...

Question 9

Problem: Let $f: \mathbb{R}^n \rightarrow \mathbb{R}$ be a $C^2$ function. And suppose that for any $x,y \in \mathbb{R}^n$, $\|\nabla f(x) - \nabla f(y)\| \leq \|x-y\|$. We let $P$ be a ...

Question 10

I'm working on a chess engine which I am building from scratch (from NumPy). I used deep learning with BCE and games generated by a basic "engine" which gives a score to each move (checks, ...

Question 11

I undestand using equations why the steepest descent is along the gradient vector. But I have this diagram of level surfaces I jotted down: So, in this diagram, the distance to the surface $y=c+1$ is ...

Question 12

Say I want to maximize a concave, non-differentiable function $f(\mathbf{x})$ subject to $\mathbf x \in C$ where $C$ is a convex set. We assume that the projecting to $C$ is easy to compute and the ...

Question 13

Let $f: \mathbb{R}^n \rightarrow \mathbb{R}$ be a $C^2$ function. We consider gradient descent. We have $x_{k+1} = x_{k} + \alpha_{k}p_{k}$, where $p_{k}$ is the descent direction. We have the ...

Question 14

I'm an engineer and I've recently moved from a Hardware Design position to one in Software, which has meant some new pure Computer Science problems have landed on my desk, and a few that are purely ...

Question 15

I need to establish the following properties of $f_t$. For reference, I am taking a Numerical Optimization class and this is one of the Exercises. There are three things we need to establish. Define ...

Stack Exchange Network

Questions tagged [gradient-descent]

SGD convergence when visit basin of attraction infinitely often

Length bound for a ray lift-off from a critical point of a polynomial

Deriving adjoint equation of Poisson equation using Lagrange multiplier

How to perform gradient descent when there is large variation in the magnitude of the gradient in different directions near the minimum?

Numeric derivative computation in a recursive one-step predictor way

$q$-linear convergence of Preconditioning Gradient Descent

Looking for a gradient in Batch Normalization

Lipschitz Constant for Constrained Least Squares Sigmoid Regression

Convergence Analysis of Preconditioned GD

Implementing Temporal Differencing with Binary Cross Entropy

Steepest descent along a direction not normal to level curve?

State of the art constrained optimization methods

Convergence of gradient under Armijo condition

What is the best way to traverse 3D space in discrete steps to find the absolute maximum of a function in a bounded region?

Minimizer and First Order Properties for Numerical Optimization

Hot Network Questions