The radiance added along the ray (the source term) is given by:
$$ L_s = L_e + \sigma_s \int f_p(w, w') L_i(w') dw' $$
The second term, which accounts for in-scattered radiance, carries a factor of $\sigma_s$.
The first question is: how can the physical role of $\sigma_s$ be explained in a simple way? I think looking at the original differential form of the volumetric RTE would be clearer, since the differential RTE explicitly shows where radiance "increases" and "decreases".
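For reference, this is the differential form I have in mind, a sketch assuming the same convention as the source term $L_s$ above and writing $t$ for the distance travelled along the ray (the symbol $t$ is my own notation):

$$ \frac{\mathrm{d}L}{\mathrm{d}t} = -\sigma_t L + L_s $$

Here $-\sigma_t L$ is the "decrease" (absorption plus out-scattering) and $L_s$ is the "increase" (emission plus in-scattering).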
The second problem is this: I can use (almost) any sampling strategy I want for importance sampling. With the original exponential (mean-free-path) distance sampling, I can derive: $$ \frac{f(x)}{p(x)} = \frac{\sigma_s e^{-\sigma_t d}}{\sigma_t e^{-\sigma_t d}} = \text{scattering albedo } \alpha $$ Therefore, for a high-albedo medium, multiple scattering does not lose much energy, since the throughput is only scaled by $\alpha^{n-1}$ (where $n$ is the number of scattering events). However, when I use a sampling distribution other than the exponential, the estimator tends to yield small values, because $p(x)$ no longer cancels the exponential in $f(x)$, so I do not get $\alpha$ (which should be close to 1 for a highly scattering medium). In the extreme case, I use a deterministic method to place an mfp sample, for example:

It seems that without $p(x)$ in the denominator, placing a deterministic sample here increases the radiance by a large amount (if the distance is small enough and $\sigma_s$ is large). Why does this happen? And is it correct that whenever a scattering event occurs (whether sampled stochastically or deterministically), we should multiply by $\sigma_s$ somewhere in the computation?
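To make the comparison concrete, here is a minimal sketch of the two stochastic estimators as I understand them. It assumes a homogeneous medium, unit in-scattered radiance, and a ray segment of length `d_max`; the function names and parameters are hypothetical and only for illustration, not taken from any renderer.

```python
import math
import random


def exponential_weight(sigma_s, sigma_t, d_max, u):
    """Weight f/p when the distance is drawn from p(d) = sigma_t * exp(-sigma_t * d)."""
    d = -math.log(1.0 - u) / sigma_t  # inverse-CDF sampling, u in [0, 1)
    if d >= d_max:
        return 0.0  # sample left the segment: no scattering event on it
    f = sigma_s * math.exp(-sigma_t * d)
    p = sigma_t * math.exp(-sigma_t * d)
    return f / p  # exponentials cancel, leaving the albedo sigma_s / sigma_t


def uniform_weight(sigma_s, sigma_t, d_max, u):
    """Weight f/p when the distance is drawn uniformly on [0, d_max)."""
    d = u * d_max
    f = sigma_s * math.exp(-sigma_t * d)
    p = 1.0 / d_max
    return f / p  # the exponential does NOT cancel here


def estimate(weight_fn, sigma_s, sigma_t, d_max, n, seed=0):
    """Average n single-sample weights (plain Monte Carlo)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        total += weight_fn(sigma_s, sigma_t, d_max, rng.random())
    return total / n


def reference(sigma_s, sigma_t, d_max):
    """Closed form of the integral of sigma_s * exp(-sigma_t * d) over [0, d_max]."""
    return sigma_s / sigma_t * (1.0 - math.exp(-sigma_t * d_max))


if __name__ == "__main__":
    args = (0.9, 1.0, 5.0)  # sigma_s, sigma_t, d_max (assumed values)
    print("reference  :", reference(*args))
    print("exponential:", estimate(exponential_weight, *args, n=200_000))
    print("uniform    :", estimate(uniform_weight, *args, n=200_000))
```

Both estimators should agree with the closed form on average; only the per-sample weights differ. With the exponential pdf every weight is either $\alpha$ or 0, while with the uniform pdf the weights range from about $\sigma_s d_{max} e^{-\sigma_t d_{max}}$ up to $\sigma_s d_{max}$, so most individual samples look small and a few look large.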
My third confusion: when a sample is placed deterministically (for example, during distance sampling), the pdf is 1 (right?), yet sampling within an extremely small interval gives a huge pdf value (for example, a uniform distribution has pdf $1 / (\max - \min)$, which is very large when the interval is small). So what is going on here? Any help will be appreciated!