$\begingroup$

I've recently been looking into energy-based models, which Yann LeCun has been strongly advocating. One problem he lists with probabilistic models is that when there are multiple possible outputs for a given input, the model returns the expected value (the mean) of those outputs. For example, if a model is given the task of completing a video of a soccer ball being kicked, the possible output videos could have the ball going left, straight, or right. All are possible outputs. However, many models will just return the expected value, which means the output will be a very noisy, blurred image. That makes sense.

My question is: how do energy-based models solve this problem? Is there an example architecture that solves it, and why does it work?

$\endgroup$

2 Answers

$\begingroup$

Think of a GAN: the discriminator is only asked whether a specific $x$ works for a specific $y$, so you can think of the discriminator as an energy model.

Indeed, this way you don't have to average over the possible "answers"; you only have to learn one possible completion.

$\endgroup$
$\begingroup$

Indeed, a well-known drawback of the usual probabilistic models such as VAEs is that samples from a model trained on images tend to be somewhat blurry. This is possibly due to the effect you mention, which is inherent in maximum likelihood training or, equivalently, in minimizing the KL divergence between the true data-generating distribution and the model distribution.
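One way to see where the averaging comes from is the simplest case, assumed here purely for illustration: a model that outputs a single point estimate $\hat{y}$ and is trained with squared error (which corresponds to maximum likelihood under a fixed-variance Gaussian). The optimal prediction is then the conditional mean:

$$\hat{y}^{*}(x) = \arg\min_{\hat{y}} \; \mathbb{E}\left[\lVert y - \hat{y} \rVert^{2} \mid x\right] = \mathbb{E}\left[y \mid x\right]$$

So if the ball goes left or right with equal probability, the best point estimate under this loss is the pixel-wise average of the two frames, which is a blur that matches neither outcome.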

Energy-based models (EBMs) take a different approach to situations where there are multiple possible outputs for a given input. Instead of directly modeling a probability distribution over outputs, an EBM learns a scalar-valued energy function that assigns low energy to plausible outputs and high energy to implausible ones. This energy function captures the compatibility between an input and a candidate output. Because inference means searching for an output with low energy, and several distinct outputs can all have low energy, the model can produce diverse outputs without averaging them into the usual noisy, blurred result.
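The difference can be illustrated with a toy sketch. Everything in it is an assumption made for the example: the target for one fixed input is a scalar that is either near -1 ("left") or near +1 ("right"), and the energy function is written by hand rather than learned, so this shows only the inference side of an EBM, not training.

```python
import numpy as np

# Hypothetical toy data: for one fixed input, the target is near -1 or near +1.
rng = np.random.default_rng(0)
modes = np.array([-1.0, 1.0])
y = rng.choice(modes, size=1000) + 0.05 * rng.standard_normal(1000)

# A point predictor trained with squared error converges to the sample mean,
# which lies between the two modes and matches neither of them.
mse_prediction = y.mean()

def energy(y_candidate):
    """Hand-written energy (not learned): low near either mode, high elsewhere."""
    y_candidate = np.asarray(y_candidate, dtype=float)
    return np.min((y_candidate[..., None] - modes) ** 2, axis=-1)

def descend(y0, steps=200, lr=0.1):
    """Inference by gradient descent on the energy from a starting guess y0."""
    y_cur = float(y0)
    for _ in range(steps):
        nearest = modes[np.argmin((y_cur - modes) ** 2)]
        y_cur -= lr * 2.0 * (y_cur - nearest)  # gradient of (y - nearest)^2
    return y_cur

# Different starting points settle into different low-energy outputs.
starts = rng.uniform(-2.0, 2.0, size=8)
solutions = np.array([descend(s) for s in starts])

print("squared-error prediction:", mse_prediction, "energy:", energy(mse_prediction))
print("energy-descent solutions:", np.round(solutions, 3))
```

The squared-error prediction sits near 0, where the energy is high, while each descent run ends at one of the two modes. In a real EBM the energy would be a trained network conditioned on the input, and the search over outputs would be correspondingly more expensive.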

One example architecture that can be read in EBM terms is the Generative Adversarial Network (GAN). The discriminator in a GAN can be viewed as an energy function, and the generator learns to produce outputs that have low energy (i.e., are plausible) according to the discriminator's judgment. The resulting samples are usually sharper and less blurry than those of VAEs.

$\endgroup$
