$\begingroup$

I've just worked my way through this OpenGL shadow mapping tutorial. While I understand the basic algorithm, one thing puzzles me: during the second render pass, all vertices are transformed into the clip space of the light source. This is done by multiplying them by the light's view-projection matrix in the vertex shader:

vs_out.FragPosLightSpace = lightSpaceMatrix * vec4(vs_out.FragPos, 1.0); 

However, for texture lookup into the shadow map a perspective division is needed. This is done in the fragment shader:

float ShadowCalculation(vec4 fragPosLightSpace)
{
    // perform perspective divide
    vec3 projCoords = fragPosLightSpace.xyz / fragPosLightSpace.w;
    // continue w. texture lookup
    [...]
}

So my question is - why can't I perform the perspective division in the vertex shader? I did try moving the division from fragment to vertex shader in my otherwise finished shadow mapping code, and ended up with some really weird artifacts. So I guess it has something to do with the interpolation performed by the rasterizer, but I would like a more detailed explanation if possible.

$\endgroup$
  • $\begingroup$ "This is done in the fragment shader" That's silly. OpenGL has dedicated Proj texture accessing functions that do the divide for you. I'm surprised that the tutorial doesn't bother to use them, since letting you know they exist is half the point of such a tutorial in the first place. $\endgroup$ Commented Nov 24, 2019 at 14:46
  • $\begingroup$ If you do the perspective divide and then interpolate, you will not get the same values as when you interpolate and then divide. The end points of the interpolation will be correct, but every point in between will have been divided by the w value of an end point, not by the interpolated w value. Doing the division in the vertex shader is the divide-then-interpolate case; doing the division in the fragment shader is the interpolate-then-divide case. $\endgroup$ Commented Sep 14, 2024 at 22:24

1 Answer

$\begingroup$

tl;dr: perspective-correct interpolation of NDCs doesn't work; they need to be linearly interpolated in screen-space instead.

This answer uses math notation from this paper, section 3: interpolating vertex attributes. (I highly recommend reading the entire paper to understand how perspective-correct interpolation of vertex attributes works; the following discussion is based on it.)


Why does the code in the tutorial work?

The hardware does perspective-correct interpolation of vertex clip-space coordinates to yield a pixel's clip-space coordinates. Then the perspective divide (by the interpolated w-coordinate) in the pixel shader transforms that into NDC.

But let's dive a bit deeper and see what the interpolation really does under the hood:

Let $I_1$ and $I_2$ be the clip-space coordinates of two vertices, and let $Z_1$ and $Z_2$ be their respective view-space z-coordinates (or clip-space w-coordinates). We want to find a pixel's NDCs, or mathematically: $I_t/Z_t$, where $I_t$ is the pixel's clip-space coordinates and $Z_t$ is its view-space z-coordinate.

Now recall that NDCs are coordinates of a 3D object after projecting it on the 2D projection window (screen-space). So to find a pixel's NDCs in screen-space, it suffices to linearly interpolate the vertices' NDCs in screen-space. Mathematically:

$$ \tag{*}\label{*} \frac{I_t}{Z_t}=\frac{I_1}{Z_1}+s\left(\frac{I_2}{Z_2}-\frac{I_1}{Z_1}\right) $$

($s$ is a screen-space barycentric coordinate of the triangle; see the paper linked above.)

To get this, we first let the hardware do perspective-correct interpolation of the vertex clip-space coordinates. This is equation (16) in the paper and it gives us $I_t$ - the pixel's clip-space coords: $$ I_t=\left[\frac{I_1}{Z_1}+s\left(\frac{I_2}{Z_2}-\frac{I_1}{Z_1}\right)\right]Z_t $$ Then in the pixel shader we divide by w which is simply the $Z_t$ of a pixel (remember that w gets interpolated as well), which gives us $I_t/Z_t$ (NDC) as desired.

So the hardware does the perspective divide and interpolation for us, which gives the pixel's NDCs, but it also multiplies by $Z_t$ which transforms from NDC back to clip-space. And what the perspective divide in the pixel shader really does is transform back to NDC by cancelling out the $Z_t$.
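
As a rough numerical check of the argument above, here is a small Python sketch for a single coordinate along one edge. The helper names (`lerp`, `perspective_correct`) and the sample values are made up for illustration. It assumes, as the notation above does, that the same $Z$ (clip-space w) drives both the hardware interpolation and the per-pixel divide; in an actual shadow pass the rasterizer's weights come from the camera's w while the divide uses the light-space w, and that case is not modelled here.

```python
def lerp(a, b, s):
    """Plain linear interpolation with screen-space parameter s."""
    return a + s * (b - a)


def perspective_correct(a1, a2, z1, z2, s):
    """Perspective-correct interpolation of an attribute (equation (16) style).

    a1, a2: attribute values at the two vertices
    z1, z2: their view-space z (clip-space w)
    """
    inv_z_t = lerp(1.0 / z1, 1.0 / z2, s)  # 1/Z_t is linear in screen space
    return lerp(a1 / z1, a2 / z2, s) / inv_z_t


# Hypothetical clip-space x and w of two vertices.
i1, z1 = 2.0, 4.0
i2, z2 = -3.0, 10.0
s = 0.3

# What the rasterizer hands to the pixel shader.
i_t = perspective_correct(i1, i2, z1, z2, s)
z_t = perspective_correct(z1, z2, z1, z2, s)

# Per-pixel perspective divide, as in the tutorial.
ndc_from_pixel_divide = i_t / z_t

# Equation (*): lerp of the vertex NDCs in screen space.
ndc_from_screen_lerp = lerp(i1 / z1, i2 / z2, s)

print(ndc_from_pixel_divide, ndc_from_screen_lerp)
```

Up to floating-point rounding, both printed values should agree, which is the cancellation of $Z_t$ described above.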


What if we do the division in the vertex shader instead?

The vertex NDCs are $I_1/Z_1$ and $I_2/Z_2$, and feeding them into the perspective-correct interpolation equation gives: $$ \left[\frac{I_1}{Z_1^2}+s\left(\frac{I_2}{Z_2^2}-\frac{I_1}{Z_1^2}\right)\right]Z_t $$ which is clearly not $I_t/Z_t$ as defined in $\eqref{*}$.

The equation fails here because its derivation requires the attribute to vary linearly across the triangle in 3D space (view space). Screen-space NDCs don't vary linearly in 3D space because of the perspective divide, so the premise is broken. (Clip-space coordinates do vary linearly in view space because they are the result of a linear transform.)

Perspective-correct interpolation of values that have already undergone the perspective divide therefore doesn't work. Such values are no longer in 3D space but in 2D projection space, and they require linear interpolation in screen space to give correct results. In practice this is accomplished by perspective-correct interpolation of the clip-space coordinates, plus a per-pixel perspective divide.
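
To make the mismatch concrete, the following Python sketch compares the two orderings along one edge. The function names and sample values are hypothetical, and it keeps the single-$Z$ simplification of the notation above rather than modelling a full camera-plus-light setup.

```python
def lerp(a, b, s):
    """Plain linear interpolation with screen-space parameter s."""
    return a + s * (b - a)


def perspective_correct(a1, a2, z1, z2, s):
    """Perspective-correct interpolation of an attribute between two vertices."""
    inv_z_t = lerp(1.0 / z1, 1.0 / z2, s)
    return lerp(a1 / z1, a2 / z2, s) / inv_z_t


# Hypothetical clip-space x and w of two vertices.
i1, z1 = 2.0, 4.0
i2, z2 = -3.0, 10.0


def ndc_divide_in_fragment(s):
    """Interpolate clip-space values, then divide per pixel (tutorial's way)."""
    i_t = perspective_correct(i1, i2, z1, z2, s)
    z_t = perspective_correct(z1, z2, z1, z2, s)
    return i_t / z_t


def ndc_divide_in_vertex(s):
    """Divide per vertex, then let the hardware interpolate the NDC values."""
    return perspective_correct(i1 / z1, i2 / z2, z1, z2, s)


for s in (0.0, 0.3, 0.5, 1.0):
    print(s, ndc_divide_in_fragment(s), ndc_divide_in_vertex(s))
```

With these sample values the two functions should agree at the vertices ($s=0$ and $s=1$) and drift apart in between, which would be consistent with artifacts that appear inside triangles rather than at their corners.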


Fun fact: $Z_{ndc}$ has the form $Z_{ndc}=A+\frac{B}{z}$ (where $z$ is in view space) and is also linearly interpolated in screen-space for z-buffering:

  1. Perspective-correct interpolation of clip-space z of the form $z'=Az+B$: $$ \left[\frac{Az_1+B}{z_1}+s\left(\frac{Az_2+B}{z_2}-\frac{Az_1+B}{z_1}\right)\right]z_t = \left\{A+\frac{B}{z_1}+s\left[A+\frac{B}{z_2}-\left(A+\frac{B}{z_1}\right)\right]\right\}z_t $$
  2. Perspective divide per-pixel (div by $z_t$): $$ A+\frac{B}{z_1}+s\left[A+\frac{B}{z_2}-\left(A+\frac{B}{z_1}\right)\right] $$

Thus both steps amount to a lerp of $Z_{ndc}$.
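
The two steps above can be checked numerically with a short Python sketch. The constants $A$ and $B$ below are an assumption: they correspond to an OpenGL-style $[-1, 1]$ depth range with $z$ taken as a positive view distance and hypothetical near/far planes of 1 and 100. Any other $A$, $B$ should behave the same way.

```python
def lerp(a, b, s):
    """Plain linear interpolation with screen-space parameter s."""
    return a + s * (b - a)


# Assumed projection constants (hypothetical near/far planes).
near, far = 1.0, 100.0
A = (far + near) / (far - near)
B = -2.0 * far * near / (far - near)


def z_clip(z):
    """Clip-space z of the form z' = A*z + B."""
    return A * z + B


def z_ndc(z):
    """NDC z of the form A + B/z."""
    return A + B / z


# View-space depths of two vertices and a screen-space parameter.
z1, z2, s = 2.0, 50.0, 0.5

# View-space depth at the pixel: 1/z is linear in screen space.
z_t = 1.0 / lerp(1.0 / z1, 1.0 / z2, s)

# Step 1: perspective-correct interpolation of clip-space z.
z_clip_t = lerp(z_clip(z1) / z1, z_clip(z2) / z2, s) * z_t

# Step 2: per-pixel perspective divide.
z_ndc_pixel = z_clip_t / z_t

# Direct screen-space lerp of the vertex NDC depths.
z_ndc_lerp = lerp(z_ndc(z1), z_ndc(z2), s)

print(z_ndc_pixel, z_ndc_lerp)
```

Both printed values should match up to rounding, and they also equal `z_ndc(z_t)`, the NDC depth of the actual surface point under that pixel.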

$\endgroup$
