How to compute this integral marginalizing out Gumbel noise?

Question

I am trying to derive the Gumbel-softmax trick from scratch, and I am stuck on the last step of this webpage, where it says the integral has a closed form that yields the solution.

To recap what is on that page...

The Gumbel distribution has the PDF $f(z) = e^{-(z - \mu + e^{-(z - \mu)})}$. It is easy to show that this is a proper density that integrates to one. Assuming $\mu = 0$, the antiderivative of $f$ is the CDF $F(z) = e^{-e^{-z}}$, so

$$ \begin{aligned} \int_{-\infty}^\infty f(z)\,dz &= \int_{-\infty}^\infty e^{-(z + e^{-z})}\,dz \\ &= \int_{-\infty}^\infty e^{-z} e^{-e^{-z}}\,dz \\ &= \left[ e^{-e^{-z}} \right]^\infty_{-\infty} \\ &= e^{-e^{-\infty}} - e^{-e^{\infty}} \\ &= 1 - 0 = 1 \end{aligned} $$

Therefore, if we sample Gumbel noise for each logit, the logit $x_k$ (the location) produces some outcome $z_k$. The probability that $z_k$ is the largest, given $z_k$ and all the locations $x_{k^\prime}$, is given by the following expression, which uses the CDF derived above shifted to location $x_{k^\prime}$ (the CDF gives the probability that an outcome is less than $z_k$).

$$ p(z_k \text{ is the largest } \mid z_k, \{x_{k^\prime}\}_{k^\prime = 1}^K) = \prod_{k^\prime \neq k} e^{-e^{-(z_k - x_{k^\prime})}} $$

We now have to do some integrating to get the final probability that k is the largest given the logits,

$$ \begin{aligned} p(z_k \text{ is the largest } \mid \{x_{k^\prime}\}_{k^\prime = 1}^K) &= \int p(z_k \text{ is the largest } \mid z_k, \{x_{k^\prime}\}_{k^\prime = 1}^K) \;\; p(z_k \mid x_k)\;\; dz_k \\ &= \int e^{-(z_k - x_k + e^{-(z_k - x_k)})} \prod_{k^\prime \neq k} e^{-e^{-(z_k - x_{k^\prime})}} \, dz_k \\ &= \int e^{-(z_k - x_k + e^{-(z_k - x_k)})} e^{-\sum_{k^\prime \neq k} e^{-(z_k - x_{k^\prime})}} \, dz_k \\ &= \int e^{(-z_k + x_k - e^{-z_k + x_k})-\sum_{k^\prime \neq k} e^{-z_k + x_{k^\prime}}} \, dz_k \\ &= \int e^{(-z_k + x_k) - e^{-z_k} \sum_{k^\prime} e^{x_{k^\prime}}} \, dz_k \\ &= \dots \\ &= \frac{e^{x_k}}{\sum_{k^\prime} e^{x_{k^\prime}}} \end{aligned} $$
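As a numerical sanity check on the target result (not a derivation), a minimal Python sketch can add independent standard Gumbel noise to each logit, record which index wins the argmax, and compare the empirical frequencies with the softmax. The function names `sample_gumbel`, `softmax`, and `gumbel_max_frequencies`, the example logits, and the sample count are illustrative assumptions and do not come from the page.

```python
import math
import random


def sample_gumbel(rng):
    """Draw standard Gumbel noise by inverting the CDF F(z) = exp(-exp(-z))."""
    u = rng.random()
    while u == 0.0:  # random() lies in [0, 1); avoid log(0)
        u = rng.random()
    return -math.log(-math.log(u))


def softmax(logits):
    """Softmax probabilities, shifted by the max logit for numerical stability."""
    m = max(logits)
    weights = [math.exp(x - m) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]


def gumbel_max_frequencies(logits, n_samples=100_000, seed=0):
    """Empirical frequency with which each index attains the largest noisy logit."""
    rng = random.Random(seed)
    counts = [0] * len(logits)
    for _ in range(n_samples):
        noisy = [x + sample_gumbel(rng) for x in logits]
        counts[noisy.index(max(noisy))] += 1
    return [c / n_samples for c in counts]


if __name__ == "__main__":
    example_logits = [1.0, -0.5, 2.0]  # hypothetical logits
    print("empirical:", gumbel_max_frequencies(example_logits))
    print("softmax:  ", softmax(example_logits))
```

With enough samples the two printed lists should agree to within Monte Carlo error, which is what the closed form at the end of the derivation predicts.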

The problem is that every source I can find online skips to the end and says the integral has a closed form without showing how to get there. I have tried to perform the integral above and failed. Is there any resource or integration trick which can show why this is true?

grand_chat · Accepted Answer · 2023-01-31 06:09:23Z

The integral you are trying to evaluate is with respect to $z_k$ and the limits run from $z_k=-\infty$ to $z_k=\infty$. Factoring out the constant $e^{x_k}$ and abbreviating $c:=\sum_{k'}e^{x_{k'}}$, the integral can be written $$ e^{x_k}\int_{z_k=-\infty}^\infty e^{-z_k} \exp\left(-ce^{-z_k}\right)dz_k. $$ Make a change of variables: $u:=e^{-z_k}$, $du=-e^{-z_k}dz_k$. Then the integral becomes $$ e^{x_k}\int_{u=\infty}^0 e^{-cu}(-du)=e^{x_k}\int_{u=0}^{\infty} e^{-cu}du= e^{x_k}\frac1c=\frac{e^{x_k}}{\sum_{k'}e^{x_{k'}}}. $$
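The substitution can also be checked numerically. The sketch below, written under stated assumptions, approximates $e^{x_k}\int e^{-z_k}\exp(-ce^{-z_k})\,dz_k$ with a plain trapezoidal rule and compares it with $e^{x_k}/c$. The function name `marginal_by_quadrature`, the integration window around $\log c$ (where the integrand peaks), the step size, and the example logits are all illustrative choices, not part of the derivation.

```python
import math


def marginal_by_quadrature(logits, k, left=20.0, right=40.0, step=0.01):
    """Trapezoidal estimate of e^{x_k} * integral of exp(-z) * exp(-c * exp(-z)) dz.

    The integrand peaks at z = log(c), so the (assumed) window
    [log(c) - left, log(c) + right] captures essentially all of the mass.
    """
    c = sum(math.exp(x) for x in logits)
    lo = math.log(c) - left
    n = int(round((left + right) / step))

    def integrand(z):
        # exp(x_k - z - c * exp(-z)); underflows harmlessly to 0.0 far to the left
        return math.exp(logits[k] - z - c * math.exp(-z))

    total = 0.5 * (integrand(lo) + integrand(lo + n * step))
    for i in range(1, n):
        total += integrand(lo + i * step)
    return total * step


if __name__ == "__main__":
    example_logits = [1.0, -0.5, 2.0]  # hypothetical logits
    c = sum(math.exp(x) for x in example_logits)
    for k, x_k in enumerate(example_logits):
        print(k, marginal_by_quadrature(example_logits, k), math.exp(x_k) / c)
```

Each row should show the quadrature value next to the closed-form $e^{x_k}/c$; the two agree closely because the integrand is smooth and decays rapidly on both sides of its peak.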

thanks. It always looks too easy when someone else does it, but it's hard to do alone
– Joff, Commented Jan 31, 2023 at 7:12
