We are looking at the function boxIntersection.
I'll reiterate the information from the comment:
// axis aligned box centered at the origin, with size boxSize
- We are working with a box.
- Centered at the origin.
- And it is axis aligned.
And let us read the parameter names:
ro is ray origin. rd is ray direction. boxSize is the size of the box. outNormal is the computed normal.
We will pretend for an instant that we don't have the code and see how we can approach this… By the way, I'll be writing pseudo-code.
As you probably know, the points along the ray can be described like this:
pt = ro + rd * t
However, let us step back a bit, and let us work on individual axis:
pt.x = ro.x + rd.x * t pt.y = ro.y + rd.y * t pt.z = ro.z + rd.z * t
And, of course, we want to compute the collision of that ray with the box. Let us start with just one of its planes... Since the box is axis aligned, and boxSize has its dimensions, there should be a plane of the box at boxSize.x * 0.5... Except that does not seem to match the shader. It appears there is a plane at boxSize.x instead. Which lead us to this equation:
ro.x + rd.x * t = boxSize.x
If we solve for t, we get this:
t = (boxSize.x - ro.x) / rd.x
If we do the same for all planes of the box we have:
ta = (-boxSize.x - ro.x) / rd.x tb = (-boxSize.y - ro.y) / rd.y tc = (-boxSize.z - ro.z) / rd.z td = (boxSize.x - ro.x) / rd.x te = (boxSize.y - ro.y) / rd.y tf = (boxSize.z - ro.z) / rd.z
Which we can do in two lines:
t1 = (-boxSize - ro) / rd t2 = (boxSize - ro) / rd
We can further approach the code we are trying to understand one step at a time.
First, in both lines we divide by rd, which is the same as multiplying by 1.0/rd, we can pre-compute that:
m = 1.0/rd t1 = (-boxSize - ro) * m t2 = (boxSize - ro) * m
We can distribute those products:
m = 1.0/rd t1 = m * -boxSize - m * ro t2 = m * boxSize - m * ro
And m * ro is something else we can also pre-compute:
m = 1.0/rd n = m * ro t1 = m * -boxSize - n t2 = m * boxSize - n
And we pre-compute m * boxSize:
m = 1.0/rd n = m * ro k = m * boxSize t1 = -k - n t2 = k - n
Ok, ok, let us write n first so it looks like the code we want:
m = 1.0/rd n = m * ro k = m * boxSize t1 = - n - k t2 = - n + k
We are missing an absolute value, we will get to that.
At this point it might help to have a geometric interpretation of these variables...
It might help to look at only one axis again, since these t1 and t2 should not be interpreted as position or vectors in space:
m.x = 1.0/rd.x n.x = m.x * ro.x k.x = m.x * boxSize.x t1.x = - n.x - k.x t2.x = - n.x + k.x
Remember that the t values (t1.x and t2.x for the x axis) are parameters to find the points along the ray. In this formula:
pt = ro + rd * t
One way of thinking of that is this: t tells you how many rds the ray needs to advance to reach the point.
Thus, we are working in rd units. Hence, it makes sense that we divide by rd.
As per the other variables - once we stop worrying about the division by rd - we can see that k is the separation between the planes of the box in rd units. And since the box is centered the origin, but the ray starts at ro, we also need to offset that by the ro... but also in rd units, i.e. n.
With that understanding we can understand that adding the absolute value to k, as shown below, would give us the same t values, it just changes the order in which we get them.
m = 1.0/rd n = m * ro k = abs(m) * boxSize t1 = - n - k t2 = - n + k
In what order do we get the t values? Well - assuming that boxSize is positive, which is not checked - k should be positive.
And thus t1 would have smaller distances than t2. We can think of t2 as the back face of the box, and t1 as the front face.
We would not have gotten the back and front faces separated like this without the absolute value.
Let us also reason about the sign of n. In order to do so, it might help to change from this:
m = 1.0/rd n = m * ro
To this:
n = ro / rd
And, again, let us look at only one axis:
n.x = ro.x / rd.x
So a components of n is positive if the component of ro and rd have the same sign. In other words, if the ray is going away from the origin.
Ergo, the components of - n are positive when the ray is looking towards the origin.
This makes sense given that when the box (which is centered at the origin) is behind the ray origin, the ray would have to travel backwards to intercept it. So when the ray is looking away from the origin (the center of the box) we get a negative - n. In that situation the ray might still collide with the box, if it is large enough. Which would be a case of t2 = - n + k (the back face), never t1 = - n - k (because - remember - k should be positive… unless the size of the box is inverted… And the intersection should be in front of the ray, and thus a positive t).
Back to the code:
m = 1.0/rd // rd unit conversion n = m * ro // origin in rd steps k = abs(m) * boxSize // box size in rd steps t1 = - n - k // front face ts t2 = - n + k // back face ts
Let us narrow our search for the correct t…
I want you to imagine the box, and you are looking at one of the cornets. The planes of the front faces extend to infinity.
If we get the minimum (the first plane that the ray intersect) we would see these planes that cover the whole picture partitioning it in three and meeting at the corner of the box. The corner of the box is the point further away from the camera that we see.
But behind the front faces of the box, the box does not have any other front faces. In fact if we get the maximum, we get the planes coming out of the corner of the box and escaping to infinity (they won't cover the whole picture, at least not in perspective projection). The corner of the box is the point closer to the camera that we see.
So we want… The maximum:
m = 1.0/rd // rd unit conversion n = m * ro // origin in rd steps k = abs(m) * boxSize // box size in rd steps t1 = - n - k // front face ts t2 = - n + k // back face ts front_t = max(t1.x, max(t1.y, t1.z))
And for the back faces we want the opposite:
m = 1.0/rd // rd unit conversion n = m * ro // origin in rd steps k = abs(m) * boxSize // box size in rd steps t1 = - n - k // front face ts t2 = - n + k // back face ts front_t = max(t1.x, t1.y, t1.z) back_t = min(t2.x, t2.y, t2.z)
Well, except the code does it a bit different:
m = 1.0/rd // rd unit conversion n = m * ro // origin in rd steps k = abs(m) * boxSize // box size in rd steps t1 = - n - k // front face ts t2 = - n + k // back face ts tN = max(max(t1.x, t1.y), t1.z) tF = min(min(t2.x, t2.y), t2.z)
Now we understand that the N in tN stands for NEAR, and the F in tF stands for FAR.
The next line I want to read from the code:
if( tN>tF || tF<0.0) return vec2(-1.0); // no intersection
This is checking two things:
If the near face of the box is further away from the ray origin than the far face of the box tN>tF, there is no intersection.
How can tN>tF happen? Well, there are two situations:
- Remember that we are considering these to be near and far (front and back) faces of the box based on an assumption of the sign of
k, which is based on an assumption on the sign of boxSize. So tN>tF happens if the box is inverted (inside out). - The near planes also extend away from the camera, so you will find intersections with them that are behind the far planes. Similarly, the far planes also extend towards the camera… Thus,
tN>tF happens out of the bounds of the box.
If the far face of the box is behind the ray origin tF<0.0 there is no intersection.
First version of the normal computation:
outNormal = (tN>0.0) ? step(vec3(tN),t1)) : // ro ouside the box step(t2,vec3(tF))); // ro inside the box
I don't think that is right. The parenthesis do not seem to match.
Anyway, let us start by making sense of the comments. When tN>0.0 it means that the near face is in front of the ray, and thus the ray origin is outside of the box.
Now, the step function will return 0.0 or 1.0 depending on the comparison of the arguments. In particular step(edge, x)
Let us expand step(vec3(tN),t1) to its components:
vec3( // 1.0 if tN < t1.x otherwise 0.0 step(tN,t1.x), // 1.0 if tN < t1.y otherwise 0.0 step(tN,t1.y), // 1.0 if tN < t1.z otherwise 0.0 step(tN,t1.z) )
Instead of writing 1.0 if blah otherwise 0.0, I'll write blah, and take that true is 1.0 and false is 0.0:
vec3( // tN < t1.x step(tN,t1.x), // tN < t1.y step(tN,t1.y), // tN < t1.z step(tN,t1.z) )
So we get a 1.0 for any component such that tN is smaller than it. In other words, we get a 1.0 for any component greater or equal to tN.
As we know, tN is the maximum of t1.x, t1.y, and t1.z. So we get a 1.0 for any component greater or equal to the maximum of the components.
This means you get (1.0, 1.0, 1.0) for the corners. For the sides of the cube you get a 1.0 along the axis normal to the face. And for the edges between two faces you get a 1.0 along the axis normal of each face that meet.
For the other case we do similarly with tF, except the order is flipped because tF have the minimum not the maximum.
Thus, this (parenthesis fixed by me):
outNormal = (tN>0.0) ? step(vec3(tN),t1): // ro ouside the box step(t2,vec3(tF)); // ro inside the box
Gives you a vector with a 1.0 along the axis normal to each face the ray hits.
And the next line:
outNormal *= -sign(rd);
Gives it the opposite sign to the direction of the ray.
Second version of the normal computation:
return -sign(rd)*step(t1.yzx,t1.xyz)*step(t1.zxy,t1.xyz);
Again, the factor -sign(rd) gives us the opposite sign to the direction of the ray.
The rest is swizzling trickery, let us break it down. We have:
return -sign(rd) * step(t1.yzx,t1.xyz) * step(t1.zxy,t1.xyz);
Which is the same as:
return -sign(rd) * vec3( // t1.y < t1.x step(t1.y,t1.x), // t1.z < t1.y step(t1.z,t1.y), // t1.x < t1.z step(t1.x,t1.z) ) * vec3( // t1.z < t1.x step(t1.z,t1.x), // t1.x < t1.y step(t1.x,t1.y), // t1.y < t1.z step(t1.y,t1.z) );
I remind you that when the expression I wrote is true we have a 1.0, otherwise it is a 0.0. Therefore, when we multiply them, it is equivalent to doing an AND on the expressions.
So, the above, is the same as:
return -sign(rd) * vec3( // t1.y < t1.x AND t1.z < t1.x step(t1.y,t1.x) * step(t1.z,t1.x), // t1.z < t1.y AND t1.x < t1.y step(t1.z,t1.y) * step(t1.x,t1.y), // t1.x < t1.z AND t1.y < t1.z step(t1.x,t1.z) * step(t1.y,t1.z) );
Which means this is what we are doing:
return -sign(rd) * vec3( // t1.x is the maximum step(t1.y,t1.x) * step(t1.z,t1.x), // t1.y is the maximum step(t1.z,t1.y) * step(t1.x,t1.y), // t1.z is the maximum step(t1.x,t1.z) * step(t1.y,t1.z) );
Notice that this code only consider the front faces. Thus it will work correctly as long as the origin of the ray is outside the box.
Although the opposite faces have the same normal, when the ray is inside the box, this code would be selecting the wrong one.
I found a third version of the code in Box - intersection. This version also comments on the checking both near and far faces:
#if 1 // this works as long as the ray origin is not inside the box vec4 res = vec4(tN, step(tN,t1) ); #else // use this instead if your rays origin can be inside the box vec4 res = (tN>0.0) ? vec4( tN, step(vec3(tN),t1)) : vec4( tF, step(t2,vec3(tF))); #endif
That version of the code also takes two matrices which represents transformation from and to the box space. Which are used to encode rotation and translation of the box. The code converts the ray into box space, and proceeds as usual, then converts the result back to ray space.