$\begingroup$

I am trying to generate $M$ random numbers which are exponentially distributed and whose sum adds up to $N$ (for simplicity, $N=1$).

I found that the generated numbers are initially exponentially distributed. However, after re-scaling they become uniformly distributed. What is the reason for that? And is there a solution?

Here is the result:

[Figure: histograms of X(:,1) before re-scaling (left) and after re-scaling (right)]

Any suggestions would be greatly appreciated.

P.S. My code written in Matlab:

    subplot(121)
    samples = 10000;
    lambda = 1;
    X = -log(rand(samples,2))/lambda;
    hist(X(:,1),100)
    subplot(122)
    X = X./sum(X,2); % re-scaling
    hist(X(:,1),100)
$\endgroup$
  • 2
    $\begingroup$ Note that sum(X,2) computes the sum of rows. $\endgroup$ Commented May 25, 2019 at 15:36
  • 3
    $\begingroup$ There is no such thing as "random numbers which are exponentially distributed and whose sum adds up to N:" an exponential distribution assigns some probability to arbitrarily large numbers, whereas limiting the sum to $N$ eliminates that possibility. Could you clarify what you actually need to accomplish? $\endgroup$ Commented May 25, 2019 at 21:23

2 Answers

$\begingroup$

[Answer revised in view of helpful comment from @whuber.]

If you know the exponential rate $\theta,$ then dividing by $M/\theta$ (the expected value of the sum) will give you a total near $N=1.$ If you don't know $\theta$ and $M$ is sufficiently large that $\theta$ is well approximated by the reciprocal of the sample mean (a random variable), then you can still come close.

In what follows, I assume $\theta$ is unknown and $M$ is large. Then I adjust by dividing by the sum.

In R:

    set.seed(525)
    x = rexp(10000);  y = x/sum(x)
    sum(y)
    [1] 1
    hist(y, prob=T, ylim=c(0, 10000), col="skyblue2")
    curve(dexp(x, 1/mean(y)), add=T, col="red", lwd=2, n = 10001)

[Figure: density histogram of y with the exponential density curve overlaid in red]

Even with smaller $M = 100,$ the adjusted sample has sum $1$ and is nearly exponential.

    set.seed(1234)
    x = rexp(100);  y = x/sum(x)
    sum(y)
    [1] 1
    sum(x)
    [1] 97.64598
    ks.test(y, "pexp", 100)

            One-sample Kolmogorov-Smirnov test

    data:  y
    D = 0.084865, p-value = 0.4674
    alternative hypothesis: two-sided
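
As a rough illustration of why the rescaled sample looks nearly exponential only when $M$ is large, the sketch below compares two CDFs on a grid. It assumes that each rescaled value has a Beta$(1, M-1)$ marginal (the standard result for iid exponentials divided by their sum) and measures how far that is from an exponential with rate $M$. The function names and the grid size are illustrative choices, not part of the original answer.

```python
import math

def exact_cdf(t, m):
    """CDF of one rescaled value, assuming it is Beta(1, m - 1) on [0, 1]."""
    return 1.0 - (1.0 - t) ** (m - 1)

def exp_cdf(t, rate):
    """CDF of an exponential distribution with the given rate."""
    return 1.0 - math.exp(-rate * t)

def max_gap(m, grid_points=20000):
    """Largest |Beta(1, m-1) CDF - Exp(rate m) CDF| over a grid on [0, 1]."""
    return max(
        abs(exact_cdf(k / grid_points, m) - exp_cdf(k / grid_points, m))
        for k in range(grid_points + 1)
    )

# The gap shrinks as m grows, so the exponential fit is only approximate
# and is poor for very small m.
for m in (2, 3, 10, 100, 1000):
    print(m, max_gap(m))
```

The gap is a deterministic quantity, so no simulation noise is involved; it only shows that the approximation improves with $M$, not that the rescaled values are ever exactly exponential.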

The binning is slightly inconsistent between the two histograms below because $Y = X/97.646,$ not $X/100.$

[Figure: the two histograms for M = 100]

$\endgroup$
  • 3
    $\begingroup$ This answer is an illusion owing to the large sample size. Try it with, say, a sample of 3 rather than 10,000. You will have to repeat your experiment a few times, but it will quickly become apparent that the distribution you create is far from exponential. $\endgroup$ Commented May 25, 2019 at 21:22
  • 1
    $\begingroup$ Sorry and thanks. Somehow I was focused on approximate results and M large. Accordingly, I made changes to my Answer. $\endgroup$ Commented May 25, 2019 at 23:07
$\begingroup$

Note that what you ask for is impossible (as noted by user whuber in a comment). First, conditioning on the sum induces dependence among the values. Second, the exponential distribution is unbounded above, so it cannot equal a distribution that is bounded (as the values are once the sum is fixed); the match can at best be approximate (if the bound $N$ is large and $M$ is large).
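
Both points can be checked with a small simulation. The sketch below is a minimal illustration using only Python's standard library; the helper names, the seed, and the choice $M = 3$ with 20000 repetitions are assumptions made for the example. It rescales iid exponential draws to sum to 1, then reports the largest rescaled value (which can never exceed 1) and the sample correlation between the first two rescaled values (which should come out clearly negative).

```python
import random

def normalized_exponentials(m, rng, rate=1.0):
    """Draw m iid Exp(rate) values and rescale them so they sum to 1."""
    x = [rng.expovariate(rate) for _ in range(m)]
    total = sum(x)
    return [xi / total for xi in x]

def pearson(a, b):
    """Plain sample correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5

rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
M = 3
draws = [normalized_exponentials(M, rng) for _ in range(20000)]
y1 = [d[0] for d in draws]
y2 = [d[1] for d in draws]

print("largest rescaled value:", max(y1))   # bounded by 1
print("corr(y1, y2):", pearson(y1, y2))     # negative: the values are dependent
```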

If $X_1, X_2, \dotsc, X_M$ are iid exponential with rate $\lambda$ and their sum is $S$, then the conditional density of $(X_1, \dotsc, X_{M-1})$ given $S=s$ (with $x_M = s - \sum_{i=1}^{M-1} x_i$ determined by the others) is $$ f(x_1, \dotsc, x_{M-1} \mid s), $$ which, using your simplification $S=N=1$, becomes $$ \frac{\lambda e^{-\lambda x_1}\cdots \lambda e^{-\lambda x_{M-1}}\cdot \lambda e^{-\lambda \left(1-\sum_{i=1}^{M-1} x_i\right)}}{\frac{\lambda^M}{\Gamma(M)} s^{M-1} e^{-\lambda s}} $$ (with $s=1$), since the sum $S$ has a gamma distribution with shape $M$ and rate $\lambda$. The exponents in the numerator add up to $-\lambda$, so this simplifies to $$ \Gamma(M)\frac{\lambda^M e^{-\lambda}}{\lambda^M e^{-\lambda}} = \Gamma(M). $$ In the formula above we have omitted the restriction on the $x_i$, namely that $x_i \ge 0$ and $\sum_{i=1}^{M-1} x_i \le 1$. This region is a simplex with volume $1/\Gamma(M)$, showing that the density integrates to 1, as it should.

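So this is the uniform distribution on the simplex, a special case of the symmetric Dirichlet distribution (all parameters equal to 1). The marginal distributions are then $\text{Beta}(1, M-1)$. In particular, for $M=2$, as in your code, each marginal is $\text{Beta}(1,1)$, that is, uniform on $(0,1)$, which is what your second histogram shows.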
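
The beta marginal can be checked empirically. The following is a minimal sketch using only Python's standard library; the helper names, the seed, the grid, and the number of repetitions are illustrative assumptions. It compares the empirical CDF of the first rescaled value with the Beta$(1, M-1)$ CDF, $1-(1-t)^{M-1}$, for a few values of $M$.

```python
import random

def beta_1_m_cdf(t, m):
    """CDF of Beta(1, m - 1) on [0, 1]: 1 - (1 - t)^(m - 1)."""
    return 1.0 - (1.0 - t) ** (m - 1)

def first_rescaled_value(m, rng):
    """First coordinate of m iid Exp(1) draws after dividing by their sum."""
    x = [rng.expovariate(1.0) for _ in range(m)]
    return x[0] / sum(x)

def max_cdf_gap(m, n_rep, rng):
    """Largest gap between the empirical CDF and the Beta(1, m-1) CDF on a grid."""
    ys = [first_rescaled_value(m, rng) for _ in range(n_rep)]
    grid = [k / 20 for k in range(1, 20)]
    gaps = []
    for t in grid:
        empirical = sum(y <= t for y in ys) / n_rep
        gaps.append(abs(empirical - beta_1_m_cdf(t, m)))
    return max(gaps)

rng = random.Random(1)  # fixed seed, chosen arbitrarily for repeatability
for m in (2, 3, 10):
    # Small gaps are consistent with the Beta(1, m-1) marginal.
    print(m, round(max_cdf_gap(m, 20000, rng), 4))
```

For $M=2$ the reference CDF reduces to $t$, so the same check confirms the uniform histogram seen in the question.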

$\endgroup$
  • 2
    $\begingroup$ I do not follow your comment about "truncated distribution." Although a random variable might have unbounded support, any iid sample from it will be bounded. The remainder of this post recapitulates the thread at stats.stackexchange.com/questions/252692. $\endgroup$ Commented Jul 4 at 17:46
  • 1
    $\begingroup$ @whuber: I was searching, but did not find that post of yours $\endgroup$ Commented Jul 4 at 19:38
