$\begingroup$

I am doing probability density functions for Calculus 2 and came across a problem where I had to find the mean for a piecewise function. I looked up how to find the mean in this case and the equation was $$f(x) = \begin{cases} g(x)\:\:\:\:a\le x \le b\\ h(x)\:\:\:\:b \le x \le c \end{cases} $$ $$E(x) =\int_a^b xg(x)\,dx\ +\int_b^cxh(x)\,dx$$

I am confused about why you can simply add the means without having to divide by something like 2 to obtain an average of the means. My main reason for confusion is that if I obtained a mean of 2 for the first interval and 4 for the second, I would expect the mean for the whole piecewise function to lie somewhere between 2 and 4, not at 6.

$\endgroup$

4 Answers

5
$\begingroup$

If we accept that the expectation of a continuous-valued random variable $X$ is $$\operatorname{E}[X] = \int_{x =-\infty}^\infty x f_X(x) \, dx$$ where $f_X$ is its probability density, then the piecewise formula is a direct consequence of the additive property of integration $$\int_{x=a}^b f(x) \, dx + \int_{x=b}^c f(x) \, dx = \int_{x=a}^c f(x) \, dx,$$ for $a \le b \le c$.

Why is your intuition wrong? It is because $g$ and $h$ do not themselves integrate to $1$ over their respective supports; i.e., they do not, in themselves, constitute a probability density. Only $f$ is a density.

Consider the following example. If $$f(x) = \begin{cases} x^2, & 0 \le x \le 1 \\ 1, & 1 < x \le 5/3 \end{cases}$$ and $0$ otherwise, it is easy to confirm that $f$ is a density: $$\int_{x=0}^{5/3} f(x) \, dx = \int_{x=0}^1 x^2 \, dx + \int_{x=1}^{5/3} \, dx = \frac{1}{3} + \frac{5}{3} - 1 = 1.$$ This calculation also shows that neither $x^2$ on $[0,1]$ nor $1$ on $(1, 5/3]$ is a density by itself: the first integrates to $\frac13$ and the second to $\frac23$. So your "averaging" step after the calculation of the piecewise integrals is unnecessary, because loosely speaking, those integrals have already been "weighted" appropriately to take into account their separate contributions to the expectation, by virtue of the fact that their integrands do not integrate to $1$ on the intervals for which they were defined.
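As a rough numerical check of this example, here is a minimal Python sketch. The helper `simpson` and the variable names are my own choices for illustration, not anything from the question; it only assumes the density defined above.

```python
def simpson(func, lo, hi, n=1000):
    """Composite Simpson's rule with n (even) subintervals."""
    step = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * step)
    return total * step / 3

# The two pieces of the example density f.
g = lambda x: x**2   # used on [0, 1]
h = lambda x: 1.0    # used on (1, 5/3]

# Probability carried by each piece: neither is 1 on its own.
mass_g = simpson(g, 0, 1)
mass_h = simpson(h, 1, 5 / 3)

# Contribution of each piece to the expectation.
part_g = simpson(lambda x: x * g(x), 0, 1)
part_h = simpson(lambda x: x * h(x), 1, 5 / 3)

print(mass_g, mass_h, mass_g + mass_h)  # about 1/3, 2/3 and 1
print(part_g + part_h)                  # about 41/36
```

The sum of the two contributions, about $41/36 \approx 1.139$, lands inside $[0, 5/3]$ without any further division, which is what the argument above predicts.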

$\endgroup$
2
$\begingroup$

I'm probably going to confuse you with the first part of this answer, because you have some misconceptions about what you're doing. I want to show you the correct idea, but first I'm going to show what's wrong with the idea you have.

As you know, if you have five measurements such as $3,$ $4,$ $9,$ $4,$ and $7,$ the mean of those measurements isn't simply their sum. You have to divide by $5,$ like this: $$ \frac{3 + 4 + 9 + 4 + 7}{5} = 5.4. $$

So how can $\displaystyle\int_a^b xg(x)\,\mathrm dx$ be the mean of anything? Where's the denominator?

The fact is, this exercise is not finding the mean of any function. If it were, the answer would be between the minimum and maximum values of the function over some measurable subset of the real numbers. For example, the measure of the closed interval $[5,9]$ is $4.$ Therefore, the mean value of the function $u(x) = \frac18(x - 5)$ on the closed interval $[5,9]$ is $$ \frac{\displaystyle\int_5^9 \frac18(x - 5)\,\mathrm dx}{4} = \frac14. $$ This makes sense, because the minimum value of $u(x)$ on the interval $[5,9]$ is at $x=5,$ where $u(5)=0,$ and the maximum occurs at $x=9,$ where $u(9)=\frac12.$ And $\frac14$ is nicely between $0$ and $\frac12.$

But what you're supposed to be doing in your probability exercises is computing a "mean" that looks more like this: $$ \int_5^9 x u(x)\,\mathrm dx = \int_5^9 x \left(\frac18(x - 5)\right)\,\mathrm dx = \frac{23}{3} \approx 7.66667. $$ The number $7.66667$ isn't anywhere near to being between the minimum and maximum values of $u(x)$ in this integral. What $7.66667$ is between is the two ends of the interval on which you integrated the function, $5$ and $9.$

You're not finding a mean value of the function $u(x).$ You're finding a kind of mean value of all the numbers in the interval $[5,9].$

This particular kind of mean is called the expected value of the continuous probability distribution defined by a probability density function $u(x)$ that is equal to $\frac18(x - 5)$ for $x$ in the interval $[5,9]$ and is zero for every other $x.$
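To see the two different "means" side by side, here is a small Python sketch. The `simpson` helper and the variable names are illustrative assumptions; only $u(x) = \frac18(x-5)$ on $[5,9]$ comes from the text above.

```python
def simpson(func, lo, hi, n=1000):
    """Composite Simpson's rule with n (even) subintervals."""
    step = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * step)
    return total * step / 3

u = lambda x: (x - 5) / 8   # the density on [5, 9]

# Mean value of the function u: integrate u, divide by the interval length.
mean_value_of_u = simpson(u, 5, 9) / (9 - 5)

# Expected value of the distribution: integrate x * u(x), no division.
expected_value = simpson(lambda x: x * u(x), 5, 9)

print(mean_value_of_u)  # about 0.25, between u(5) = 0 and u(9) = 0.5
print(expected_value)   # about 7.6667, between 5 and 9
```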

There is still the question of where the denominator is. Let's consider it.

The expected value of a probability distribution -- the thing you're computing in this exercise -- is what we call a weighted mean. It's "weighted" because all of the numbers that are being averaged count, but some of them may count more than others.

In a weighted mean, you multiply each number by its weight before adding it to the sum. At the end, when all the weights-times-numbers have been added, you divide by the total weight. That is, to get the mean of the numbers $x_1, x_2, \ldots, x_n$ with weights $w_1, w_2, \ldots, w_n$ we compute this: $$ \frac{x_1 \cdot w_1 + x_2 \cdot w_2 + \cdots + x_n \cdot w_n} {w_1 + w_2 + \cdots + w_n}. $$

The way we took the mean of the $3,$ $4,$ $9,$ $4,$ and $7$ can be interpreted as a weighted mean where every number has weight $1.$ Since there are five numbers, the total weight is $5.$ Plugging all this into the formula, the mean is $$ \frac{3 \cdot 1 + 4 \cdot 1 + 9 \cdot 1 + 4 \cdot 1 + 7 \cdot 1}{1 + 1 + 1 + 1 + 1} = \frac{3 + 4 + 9 + 4 + 7}{5} = 5.4 $$ -- exactly the same as before, but more explicit about how much each number "counts" in the total.

A completely equivalent way to take the mean of these five numbers is to again give them equal weights, but the weight of each number is $\frac15.$ The mean is $$ \frac{3\cdot\frac15 + 4\cdot\frac15 + 9\cdot\frac15 + 4\cdot\frac15 + 7\cdot\frac15} {\frac15 + \frac15 + \frac15 + \frac15 + \frac15} = \frac{5.4}{1} = 5.4. $$ Now you might notice that since these particular weights added up to $1,$ we actually had the answer already as soon as we added up the weighted numbers in the numerator. Dividing by $1$ didn't change the answer. So when people know that the weights add up to $1,$ they often don't bother to write the denominator of the mean. They just write the sum $x_1 \cdot w_1 + x_2 \cdot w_2 + \cdots + x_n \cdot w_n.$
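The same arithmetic can be checked with a few lines of Python. This is only a sketch: `weighted_mean` is a name I made up for illustration, and `Fraction` is used so the results are exact.

```python
from fractions import Fraction

def weighted_mean(values, weights):
    """Sum of value * weight, divided by the total weight."""
    numerator = sum(v * w for v, w in zip(values, weights))
    return numerator / sum(weights)

values = [3, 4, 9, 4, 7]

# Every number has weight 1, so the total weight is 5.
unit_weights = [Fraction(1)] * 5
mean_unit = weighted_mean(values, unit_weights)

# Every number has weight 1/5, so the total weight is 1.
fifth_weights = [Fraction(1, 5)] * 5
mean_fifth = weighted_mean(values, fifth_weights)

# With weights that sum to 1, the numerator alone is already the mean.
numerator_only = sum(v * w for v, w in zip(values, fifth_weights))

print(mean_unit, mean_fifth, numerator_only)  # 27/5 27/5 27/5
```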

For the expected value of a continuous probability distribution, we no longer have just a finite list of numbers to be averaged. We're taking the weighted average of every number on the real line using the probability density function as the weight of each number.

So if $w(x)$ is a probability density function, again we multiply every number by its weight $w(x)$ and "add them up" by computing an integral in the numerator of the mean value, and we "add up" all the weights by integrating them in the denominator:

$$ \frac{\displaystyle\int_{-\infty}^\infty x \cdot w(x)\,\mathrm dx} {\displaystyle\int_{-\infty}^\infty w(x)\,\mathrm dx}. $$ But part of the definition of a continuous probability distribution is that the integral of the probability density function is $1$: $$ \int_{-\infty}^\infty w(x)\,\mathrm dx = 1. $$ So the denominator is always $1,$ and we already have the answer by the time we've evaluated the numerator. So people usually don't even bother to write the denominator.
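Here is a hedged sketch of why the denominator can be dropped only when the weights integrate to $1$. The unnormalized weight `w_raw(x) = x - 5` on $[5,9]$ is a hypothetical example of mine (it is $8$ times the density $u(x)$ used earlier), and `simpson` is an illustrative helper.

```python
def simpson(func, lo, hi, n=1000):
    """Composite Simpson's rule with n (even) subintervals."""
    step = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * step)
    return total * step / 3

# Hypothetical unnormalized weight on [5, 9]; it does NOT integrate to 1.
w_raw = lambda x: x - 5

numerator = simpson(lambda x: x * w_raw(x), 5, 9)
denominator = simpson(w_raw, 5, 9)

print(denominator)              # about 8, so the division matters here
print(numerator)                # about 61.33, not a mean of anything
print(numerator / denominator)  # about 7.6667, the weighted mean
```

Dividing by the total weight recovers the same value, $23/3$, that the normalized density $u(x)$ gives with no division at all.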

Also, if we know the probability density $w(x)$ is zero everywhere except for $x$ within some interval $[a,c],$ we only need to integrate from $a$ to $c,$ not from $-\infty$ to $\infty.$ It looks like your course may be glossing over that bit, because your definition of $f(x)$ didn't say what the value of $f(x)$ should be if $x < a$ or $x > c.$ I assume the implicit assumption was that $f(x)$ is defined by $$ f(x) = \begin{cases} 0 & x < a \\ g(x) & a \leq x < b \\ h(x) & b \leq x \leq c \\ 0 & x > c. \end{cases} $$ (Note that I also made sure that this definition makes sense even if $g(b) \neq h(b),$ because in this formula $f(b)$ is defined by $h(b)$ and not by $g(b).$)

So now all you have to do is compute $$ \int_{-\infty}^\infty x f(x)\,\mathrm dx. $$ To make the computation a little easier you use the fact that you can split the interval of a definite integral into two or more pieces, integrate each piece individually, and add up the pieces to get the correct total. This has nothing to do with "averaging" the pieces; it's just a technique for computing a complicated integral: $$ \int_{-\infty}^\infty x f(x)\,\mathrm dx = \int_{-\infty}^a x f(x)\,\mathrm dx + \int_a^b x f(x)\,\mathrm dx + \int_b^c x f(x)\,\mathrm dx + \int_c^\infty x f(x)\,\mathrm dx. $$

So let's do a specific example. Define $f(x)$ as follows: $$ f(x) = \begin{cases} 0 & x < 6 \\ \frac14(x - 6) & 6 \leq x < 8 \\ x - 8 & 8 \leq x \leq 9 \\ 0 & x > 9. \end{cases} $$

You should first confirm that the integral of $f(x)$ from $-\infty$ to $\infty$ is $1.$ Then $f(x)$ is the probability density function of a probability distribution whose expected value is $$ \int_{-\infty}^\infty x f(x)\,\mathrm dx. $$

We can integrate this in pieces as follows: \begin{align} \int_{-\infty}^6 x f(x)\,\mathrm dx &= \int_{-\infty}^6 0\,\mathrm dx = 0, \\ \int_6^8 x f(x)\,\mathrm dx &= \int_6^8 x\left(\frac14(x - 6)\right)\,\mathrm dx = \frac{11}{3} \approx 3.6667, \\ \int_8^9 x f(x)\,\mathrm dx &= \int_8^9 x(x - 8)\,\mathrm dx = \frac{13}{3} \approx 4.3333, \\ \int_9^\infty x f(x)\,\mathrm dx &= \int_9^\infty 0\,\mathrm dx = 0. \end{align}

Now take a close look at the result of $\displaystyle\int_6^8 x f(x)\,\mathrm dx.$ It's not a mean of the values of $f(x)$ for $6 < x < 8$ because it's greater than $\frac12,$ which is the maximum value of $f(x)$ for $x$ between those bounds. And it's not a mean of the values of $x$ between $6$ and $8,$ because it's less than $6.$ It's not a mean of anything we're interested in at all.

The reason this integral isn't a mean is that the sum of the weights used in the integral is $$ \int_6^8 f(x)\,\mathrm dx = \int_6^8 \frac14(x - 6)\,\mathrm dx = \frac12, $$ which is not equal to $1,$ and therefore the "we can ignore the denominator because it's $1$" trick doesn't work.

So when you wrote $$ E(x) = \int_a^b x g(x)\,\mathrm dx + \int_b^c x h(x)\,\mathrm dx $$ you were not adding two means. You were taking a single weighted mean of values of $x$ from $a$ to $c$ by separating the single "sum" (an integral) into two subtotals.
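The numbers in this example can be reproduced with a short Python sketch. The `simpson` helper and the variable names are assumptions for illustration; the two pieces of $f(x)$ are the ones defined above.

```python
def simpson(func, lo, hi, n=1000):
    """Composite Simpson's rule with n (even) subintervals."""
    step = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * step)
    return total * step / 3

f_left = lambda x: (x - 6) / 4   # f(x) for 6 <= x < 8
f_right = lambda x: x - 8        # f(x) for 8 <= x <= 9

# Total weight on each piece; together they must give 1.
weight_left = simpson(f_left, 6, 8)
weight_right = simpson(f_right, 8, 9)

# The two subtotals of the single integral of x * f(x).
subtotal_left = simpson(lambda x: x * f_left(x), 6, 8)
subtotal_right = simpson(lambda x: x * f_right(x), 8, 9)

print(weight_left, weight_right)        # about 0.5 and 0.5
print(subtotal_left, subtotal_right)    # about 3.6667 and 4.3333
print(subtotal_left + subtotal_right)   # about 8, the expected value
```

Neither subtotal lies in $[6,9]$, but their sum does, which fits the description of them as two parts of one weighted sum.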

$\endgroup$
0
$\begingroup$

As others have said, you are not adding two means or expectations. For example here $\int_a^b g(x)\, dx = \mathbb P(X\le b)$ rather than $1$, so $\int_a^b x\, g(x)\, dx$ is not a mean or expectation.

What you seem to be thinking about is taking the weighted average of two conditional expectations, in effect

$$\mathbb E[X] = \mathbb E[X \mid X \le b]\,\mathbb P(X \le b) + \mathbb E[X \mid X \gt b]\,\mathbb P(X \gt b)$$

where here

  • $\mathbb P(X \le b) =\int_{-\infty}^b f(x)\, dx = \int_a^b g(x)\, dx$
  • $\mathbb P(X \gt b) =\int_b^\infty f(x)\, dx = \int_b^c h(x)\, dx$
  • $\mathbb E[X \mid X \le b]=\dfrac{\int_{-\infty}^b x\,f(x)\, dx}{\int_{-\infty}^b f(x)\, dx}=\dfrac{\int_a^b x\,g(x)\, dx}{\int_a^b g(x)\, dx}$
  • $\mathbb E[X \mid X \gt b]=\dfrac{\int_{b}^\infty x\,f(x)\, dx}{\int_{b}^\infty f(x)\, dx}=\dfrac{\int_b^c x\,h(x)\, dx}{\int_b^c h(x)\, dx}$

and so

$$\begin{aligned}\mathbb E[X] &= \dfrac{\int_a^b x\,g(x)\, dx}{\int_a^b g(x)\, dx}\,\int_a^b g(x)\, dx + \dfrac{\int_b^c x\,h(x)\, dx}{\int_b^c h(x)\, dx}\,\int_b^c h(x)\, dx \\ &= \int_a^b x\,g(x)\, dx+ \int_b^c x\,h(x)\, dx.\end{aligned}$$
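To connect this with the "means of 2 and 4" intuition in the question, here is a sketch with a hypothetical density of my own choosing: $g(x) = \frac18$ on $[1,3]$ and $h(x) = \frac38$ on $[3,5]$. These values, the `simpson` helper, and the variable names are all assumptions made for illustration.

```python
def simpson(func, lo, hi, n=1000):
    """Composite Simpson's rule with n (even) subintervals."""
    step = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * step)
    return total * step / 3

# Hypothetical piecewise density with a = 1, b = 3, c = 5.
a, b, c = 1, 3, 5
g = lambda x: 1 / 8   # on [a, b]
h = lambda x: 3 / 8   # on [b, c]

# P(X <= b) and P(X > b).
p_left = simpson(g, a, b)
p_right = simpson(h, b, c)

# The plain integrals from the formula in the question.
piece_left = simpson(lambda x: x * g(x), a, b)
piece_right = simpson(lambda x: x * h(x), b, c)

# Conditional expectations: each piece divided by its probability.
cond_left = piece_left / p_left
cond_right = piece_right / p_right

print(cond_left, cond_right)                        # about 2 and 4
print(cond_left * p_left + cond_right * p_right)    # about 3.5
print(piece_left + piece_right)                     # about 3.5 as well
```

Here the conditional means are $2$ and $4$, and the overall expectation $3.5$ lies between them. The plain integrals are $0.5$ and $3$, not $2$ and $4$, so adding them never produces $6$.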

$\endgroup$
0
$\begingroup$

In the question $f$ is the probability density function. Hence you simply compute the integral from the definition.

$$E_X=\int_a^cxf(x)\,dx.$$

It turns out that $f$ is defined piecewise, so we can expand as

$$E_X=\int_a^bxg(x)\,dx+\int_b^cxh(x)\,dx.$$

The two separate integrals are not means.


By the way, $f$ being a density,

$$\int_a^bg(x)\,dx+\int_b^ch(x)\,dx=1$$ must hold.

$\endgroup$
