How do Energy Based Models solve Multiple possible outputs given one input

Question

I've been looking into Energy Based Models recently which Yann LeCun has been strongly advocating for. One problem that he lists with probabilistic based models is that in the case when there are multiple possible outputs for one given input, the probabilistic model will return the Expected Value of the possible outputs. An example is if a model is given the task of completing a video of soccer ball being kicked, the possible output videos could have the ball going left, straight, or right. All our possible outputs. However many models will just return the expected value which means the output will be a really noisy blurred image which makes sense.

My Question is how do Energy Based Models solve this problem. What example architecture is there that would solve this and why is this so.

Alberto · Accepted Answer · 2024-03-31 11:09:57Z

think of a GAN… the discriminator is just asked if a specific $x$ works for a specific $y$, and thus you can think of the discriminator as an energy model

indeed, this way, you don’t have to average over possible “answers”, but just to learn a possible completion

cinch · Accepted Answer · 2024-03-31 22:03:24Z

Indeed the usual probabilistic models such as VAE whose main drawback is that samples from the model trained on images tend to be somewhat blurry possibly due to your mentioned inherent effect of maximum likelihood or equivalently KL-Divergence between true input data generating and model distributions.

Energy-based models (EBMs) offer a different approach to handling situations where there are multiple possible outputs for a given input compared to probabilistic models. In EBMs, instead of directly modeling the probability distribution over outputs, the model learns a scalar-valued energy function that assigns low energy to plausible outputs and high energy to implausible ones. This energy function captures the compatibility between inputs and outputs and allows the model to generate diverse outputs without the usual noisy blurred effect.

One example architecture that uses EBMs to address this problem is the Generative Adversarial Network (GAN). The generator network in a GAN can be viewed as an energy-based model. It learns to generate outputs that have low energy (i.e., are plausible) according to the discriminator's judgment and usually with better output effect in terms of noisy blurriness compared to VAE's.

Stack Exchange Network

How do Energy Based Models solve Multiple possible outputs given one input

2 Answers 2

You must log in to answer this question.

Hot Network Questions

How do Energy Based Models solve Multiple possible outputs given one input

2 Answers 2

You must log in to answer this question.

Related

Hot Network Questions