Questions tagged [variational-inference]
Questions about variational inference: approximating an intractable posterior distribution with a simpler one chosen by optimization, typically by maximizing the evidence lower bound (ELBO).
103 questions
1 vote
1 answer
56 views
Bayes-by-backprop - meaning of partial derivative
The Google DeepMind paper "Weight Uncertainty in Neural Networks" features the following algorithm: Note that the $\frac{\partial f(w,\theta)}{\partial w}$ term of the gradients for the mean and standard ...
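As context for the question, a sketch of where that term comes from, assuming the paper's reparameterization $w = \mu + \log(1+e^{\rho})\,\varepsilon$ with $\varepsilon \sim \mathcal{N}(0,1)$: $f(w,\theta)$ depends on $\theta=(\mu,\rho)$ both directly and through $w$, so the chain rule gives
$$
\Delta_\mu = \frac{\partial f(w,\theta)}{\partial w} + \frac{\partial f(w,\theta)}{\partial \mu},
\qquad
\Delta_\rho = \frac{\partial f(w,\theta)}{\partial w}\,\frac{\varepsilon}{1+e^{-\rho}} + \frac{\partial f(w,\theta)}{\partial \rho},
$$
since $\partial w/\partial\mu = 1$ and $\partial w/\partial\rho = \varepsilon/(1+e^{-\rho})$.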
3 votes
0 answers
60 views
Normalizing observations in a nonlinear state space model
I am modelling the sequence $\{(a_t,y_t)\}_t$ as follows: $$ \begin{cases} Y_{t+1} &= g_\nu(X_{t+1}) + \alpha V_{t+1}\\ X_{t+1} &= X_t + \mu_\xi(a_t) + \sigma_\psi(a_t)Z_{t+1}\\ X_0 &= ...
3 votes
1 answer
129 views
Bayesian Clustering with a Finite Gaussian Mixture Model with Missing Data
I would like to perform clustering with a finite Gaussian Mixture model; however, I have missing data (some features are missing at random). I am using Variational Inference to fit my Bayesian GMM. Is ...
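Not an answer, but a minimal sketch of the kind of workflow the question is about, assuming scikit-learn and naive mean imputation (the variable names and the imputation step are illustrative; a principled treatment would handle the missing entries inside the variational updates rather than imputing first):

```python
# Minimal sketch, NOT the full missing-data treatment: impute first, then fit a
# variational Bayesian GMM with scikit-learn.
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
X[rng.random(X.shape) < 0.1] = np.nan          # features missing at random

X_filled = SimpleImputer(strategy="mean").fit_transform(X)
gmm = BayesianGaussianMixture(
    n_components=5, covariance_type="full",
    weight_concentration_prior_type="dirichlet_process",
    max_iter=500, random_state=0,
).fit(X_filled)                                # variational inference under the hood
labels = gmm.predict(X_filled)                 # cluster assignments
```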
0 votes
0 answers
28 views
Why Does the Posterior Estimation of Latent Variables in Binary PPCA Differ from the Ground Truth?
I’ve been working on implementing a binary variant of probabilistic PCA (PPCA) in Python (based on this paper), which uses variational EM for parameter estimation due to the non-conjugacy between the ...
0 votes
0 answers
128 views
Normalizing Flow with Highly Negative NLL Loss
I am following the Zuko "Train From Data" tutorial to train a Neural Spline Flow. My goal is to approximate a distribution over functions. Therefore, each of my function samples is actually ...
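One general point that may matter when interpreting a very negative NLL (a sketch with NumPy/SciPy, not specific to Zuko): for continuous distributions the density can exceed 1, so log-likelihoods can be large and positive and the NLL strongly negative, especially when the learned density is sharply peaked.

```python
# Why a continuous NLL can be very negative: densities can exceed 1.
import numpy as np
from scipy.stats import norm

sigma = 1e-3                       # a sharply peaked 1-D Gaussian
x = np.zeros(100)                  # samples near the mode
nll = -norm.logpdf(x, loc=0.0, scale=sigma).mean()
print(nll)                         # ~ -6.0, and it grows more negative as sigma shrinks
```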
1 vote
1 answer
85 views
Why is the target 𝑦 used as an input to the encoder in a semi-supervised VAE model?
As mentioned in the title, I understand the mathematical derivation of equations (6-7) in Kingma's original paper. \begin{equation} \log p_\theta(\mathbf{x}, y) \geq \mathbb{E}_{q_\phi(\mathbf{z} \mid ...
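For readers without the paper at hand, the labeled-data bound in question has (up to notation) the form below; note that $y$ enters the inference model $q_\phi(\mathbf{z}\mid\mathbf{x},y)$ because the bound is derived with an approximate posterior over $\mathbf{z}$ conditioned on everything observed, which for labeled data includes $y$:
$$
\log p_\theta(\mathbf{x}, y) \geq \mathbb{E}_{q_\phi(\mathbf{z} \mid \mathbf{x}, y)}\big[\log p_\theta(\mathbf{x} \mid y, \mathbf{z}) + \log p_\theta(y) + \log p(\mathbf{z}) - \log q_\phi(\mathbf{z} \mid \mathbf{x}, y)\big] = -\mathcal{L}(\mathbf{x}, y).
$$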
2 votes
1 answer
115 views
Posterior estimation using VAE
Using normalizing flows, we can model a model's posterior $p(\theta|D)$ by feeding Gaussian noise $z$ to the NF (parametrized by $\phi$), using the NF's output $\theta$ as model parameters, and ...
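A sketch of the objective usually paired with that construction (here $T_\phi$ is an assumed name for the flow map): the flow's pushforward defines a variational posterior $q_\phi(\theta)$ via the change of variables, and training maximizes the ELBO
$$
\mathrm{ELBO}(\phi) = \mathbb{E}_{q_\phi(\theta)}\big[\log p(D\mid\theta) + \log p(\theta) - \log q_\phi(\theta)\big],
\qquad
\log q_\phi(\theta) = \log p_Z(z) - \log\left|\det \frac{\partial T_\phi(z)}{\partial z}\right|,\quad \theta = T_\phi(z).
$$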
2 votes
2 answers
169 views
VAEs - Two questions regarding the posterior and prior distribution derivations
I'm struggling to understand the first step in the ELBO derivation in VAEs. When asking my questions, I'll also try to clearly state my assumptions, since perhaps some of them are wrong to begin with: ...
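For orientation, the identity usually invoked in that first step, written in the standard VAE notation (a sketch, not the poster's exact assumptions):
$$
\log p_\theta(x)
= \underbrace{\mathbb{E}_{q_\phi(z\mid x)}\!\left[\log \frac{p_\theta(x,z)}{q_\phi(z\mid x)}\right]}_{\text{ELBO}}
+ D_{\mathrm{KL}}\!\big(q_\phi(z\mid x)\,\|\,p_\theta(z\mid x)\big)
\;\geq\; \mathrm{ELBO},
$$
because the KL term is non-negative.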
1 vote
0 answers
92 views
How to speed up the following ELBO evaluation?
I have an estimation problem where I need to maximize the evidence lower bound: $$ \mathrm{ELBO} = -\frac{1}{2} \Bigg( \mathbb{E}_{q(\theta)} \left[ \mathrm{vec}(\mathbf{Z})^{\mathrm{H}} \mathbf{C}^{-...
1 vote
0 answers
71 views
Why do we need to marginalize when finding p(data) when latent variables are involved? (part of the ELBO derivation)
I'm so confused with the derivation of the ELBO. In part of the derivation, p(data) is intractable as it involves an integral over a high-dimensional latent variable. I can't understand why the latent ...
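For reference, the marginalization in question is just how a latent-variable model defines the likelihood of the observed data: the model specifies a joint $p_\theta(x,z) = p_\theta(x\mid z)\,p(z)$, so the likelihood of the data alone requires integrating the latent variable out,
$$
p_\theta(x) = \int p_\theta(x\mid z)\,p(z)\,dz,
$$
which is the high-dimensional integral that makes $p(\text{data})$ intractable.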
3 votes
1 answer
111 views
When deriving the ELBO to solve variational inference problems, why do we know p(z) and p(x,z) but not p(x) and p(z|x)?
I am a bit lost with the derivation of the ELBO because I don't understand why some distributions are known and some are unknown. I guess we know p(z) (the prior) because it was the last value of q(z) ...
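A short sketch of the asymmetry being asked about: $p(z)$ and $p(x\mid z)$ are chosen when the model is written down, so their product can be evaluated pointwise, while the other two quantities require an integral,
$$
p(x,z) = p(x\mid z)\,p(z),
\qquad
p(z\mid x) = \frac{p(x,z)}{p(x)},
\qquad
p(x) = \int p(x,z)\,dz,
$$
so $p(z\mid x)$ is unknown precisely because it needs the intractable $p(x)$ in its denominator (and the prior $p(z)$ is fixed by the modeller, not obtained from $q(z)$).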
1 vote
0 answers
161 views
ELBO & "backwards" KL divergence argument order
On Wikipedia it says: "A simple interpretation of the KL divergence of P from Q [i.e. D_KL(P||Q)] is the expected excess surprise from using Q as a model instead of P when the actual distribution ...
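For concreteness, the way the ELBO is usually related to the KL divergence, which fixes the argument order (a sketch in generic notation):
$$
\log p(x) - \mathrm{ELBO}(q) = D_{\mathrm{KL}}\big(q(z)\,\|\,p(z\mid x)\big) = \mathbb{E}_{q(z)}\!\left[\log\frac{q(z)}{p(z\mid x)}\right],
$$
so maximizing the ELBO minimizes the "reverse" KL, whose expectation is taken under $q$, the distribution we can actually sample from and evaluate.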
0 votes
1 answer
842 views
Exploring VAE latent space
I recently trained an AE and a VAE and used the latent variables of each for a clustering task. It seemed to work well, producing sensible clusters. The main reason for training the VAE was to gain more ...
2 votes
1 answer
224 views
Why is sampling from the posterior a good estimate for the likelihood, but sampling from the prior bad?
In Variational Autoencoders (VAE), we have: $$ \log p_\theta(x) = \log \left[ \int p_\theta(x \mid z)p(z) \, dz \right] $$ where $ p_\theta(x \mid z) = \mathcal{N}(x; \mu_\theta(z), I) $ and $ p(z) = \...
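A sketch of the importance-sampling identity that this question usually reduces to (in the notation of the excerpt, with $q_\phi(z\mid x)$ the encoder):
$$
p_\theta(x) = \mathbb{E}_{p(z)}\big[p_\theta(x\mid z)\big]
= \mathbb{E}_{q_\phi(z\mid x)}\!\left[\frac{p_\theta(x\mid z)\,p(z)}{q_\phi(z\mid x)}\right];
$$
both Monte Carlo estimators are unbiased, but samples from the prior rarely fall where $p_\theta(x\mid z)$ is non-negligible, so the prior-based estimator has enormous variance, whereas a $q$ close to the posterior concentrates samples where the integrand is large.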
2 votes
1 answer
150 views
Why is the forward process referred to as the "ground truth" in diffusion models?
I've seen many tutorials on diffusion models refer to the distribution of the latent variables induced by the forward process as "ground truth". I wonder why. What we can actually see is ...