
Elaborating on matrix description.

edited Jun 11 at 13:01


Yes.

In games, we'll typically represent the coordinate transformation of an object (including the camera) as a 4x4 matrix - its "model matrix" or "object / local to world matrix". We can build this matrix out of columns representing the object's basis vectors and position in the world:

\$\vec r = \begin{bmatrix}r_x \\ r_y\\ r_z \end{bmatrix}\$ the object's "right" vector, in world space.
\$\vec u = \begin{bmatrix}u_x \\ u_y\\ u_z \end{bmatrix}\$ the object's "up" vector, in world space.
\$\vec f = \begin{bmatrix}f_x \\ f_y\\ f_z \end{bmatrix}\$ the object's "forward" vector, in world space.
\$\vec t = \begin{bmatrix}t_x \\ t_y\\ t_z \end{bmatrix}\$ the object's translation - the position of the object's origin / pivot, in world space.

We combine these with an extra row like so:

$$ M = \begin{bmatrix} \vec r & \vec u & \vec f & \vec t\\ 0 & 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} r_x & u_x & f_x & t_x\\ r_y & u_y & f_y & t_y\\ r_z & u_z & f_z & t_z\\ 0 & 0 & 0 & 1 \end{bmatrix} $$

(Here I'm showing one popular, left-handed coordinate convention, assuming a column vector will be multiplied on the right. You'll see other conventions in use too.)

With this matrix in hand, transforming a point from an object's local space into world space is just a matrix-vector multiplication: \$\vec v_\text{world} = M \vec v_\text{local}\$

Here \$\vec v = \begin{bmatrix}x\\y\\z\\1\end{bmatrix}\$ for a point (translation applies),

...or \$ \vec v = \begin{bmatrix}x\\y\\z\\0\end{bmatrix}\$ for a direction/displacement (no translation)
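To make the point / direction distinction concrete, here is a minimal sketch in plain Python. The helper names `model_matrix` and `transform`, and the example object (yawed 90 degrees so its forward vector points along world +x, positioned at (5, 0, 2)), are assumptions made up for illustration, not part of any particular engine's API.

```python
def model_matrix(r, u, f, t):
    """Build a 4x4 local-to-world matrix whose columns are r, u, f, t,
    with (0, 0, 0, 1) as the extra bottom row."""
    return [
        [r[0], u[0], f[0], t[0]],
        [r[1], u[1], f[1], t[1]],
        [r[2], u[2], f[2], t[2]],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform(m, v):
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]


# Hypothetical object: right, up, forward and translation in world space.
r = (0.0, 0.0, -1.0)
u = (0.0, 1.0, 0.0)
f = (1.0, 0.0, 0.0)
t = (5.0, 0.0, 2.0)
M = model_matrix(r, u, f, t)

# A point one unit in front of the object (w = 1, so translation applies).
world_point = transform(M, [0.0, 0.0, 1.0, 1.0])

# The same offset treated as a direction (w = 0, so translation is ignored).
world_dir = transform(M, [0.0, 0.0, 1.0, 0.0])

print(world_point)  # forward vector plus translation
print(world_dir)    # just the forward vector
```

The point lands at the object's position plus its forward vector, while the direction comes out as the forward vector alone, since the zero in its last component multiplies away the translation column.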

The view transform, mapping points from world space into view space (the camera's local coordinate system), is just the inverse of the camera's local to world matrix:

$$ V = \left(M_\text{camera}\right)^{-1} = \begin{bmatrix} r_x & r_y & r_z & -\vec r \cdot \vec t \\ u_x & u_y & u_z & -\vec u \cdot \vec t \\ f_x & f_y & f_z & -\vec f \cdot \vec t \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

This matrix \$V\$ is the view matrix. You can verify that multiplying this by a point \$\vec v_\text{world}\$ gives the view-space origin \$(0, 0, 0, 1)\$ if that point is at the camera's position (\$\vec t\$), that its x coordinate increases as the point moves parallel to the camera's right vector (\$\vec r\$), etc.
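Those checks can be run numerically. The following is a small sketch in plain Python; `view_matrix`, `dot`, and `transform` are hypothetical helper names, the camera pose is an arbitrary example, and the construction assumes the camera's right / up / forward vectors are orthonormal (no scale or shear).

```python
def dot(a, b):
    """Dot product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))


def view_matrix(r, u, f, t):
    """World-to-view matrix: rows are the camera's basis vectors, and the
    last column is minus each basis vector dotted with the camera position.
    Assumes r, u, f are orthonormal."""
    return [
        [r[0], r[1], r[2], -dot(r, t)],
        [u[0], u[1], u[2], -dot(u, t)],
        [f[0], f[1], f[2], -dot(f, t)],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform(m, v):
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]


# Hypothetical camera at (5, 0, 2), yawed so it looks along world +x.
r = (0.0, 0.0, -1.0)
u = (0.0, 1.0, 0.0)
f = (1.0, 0.0, 0.0)
t = (5.0, 0.0, 2.0)
V = view_matrix(r, u, f, t)

# A point at the camera's position maps to the view-space origin.
at_camera = transform(V, [5.0, 0.0, 2.0, 1.0])

# A point one unit along the camera's right vector gets view-space x = 1.
to_the_right = transform(V, [5.0, 0.0, 1.0, 1.0])

# A point three units along the camera's forward vector gets view-space z = 3.
in_front = transform(V, [8.0, 0.0, 2.0, 1.0])

print(at_camera, to_the_right, in_front)
```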

Here I'm assuming we haven't applied any scale or shear transformations to the camera, so \$\{\vec r, \vec u, \vec f\}\$ form a left-handed orthonormal basis (a pure rotation matrix, in this context), and we can just transpose the upper-left 3x3 block of the camera's local to world matrix to get its inverse.
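As a quick check of why the transpose trick works, write the camera's matrix in block form, using \$R\$ (my shorthand here, not a standard name) for the 3x3 block whose columns are \$\vec r, \vec u, \vec f\$. Orthonormality means \$R^T R = I\$, so:

$$ \begin{bmatrix} R^T & -R^T \vec t \\ 0 & 1 \end{bmatrix} \begin{bmatrix} R & \vec t \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} R^T R & R^T \vec t - R^T \vec t \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} I & \vec 0 \\ 0 & 1 \end{bmatrix} $$

The rows of \$R^T\$ are \$\vec r, \vec u, \vec f\$, so the last column works out to

$$ -R^T \vec t = \begin{bmatrix} -\vec r \cdot \vec t \\ -\vec u \cdot \vec t \\ -\vec f \cdot \vec t \end{bmatrix} $$

which matches the entries of \$V\$. If the camera did carry scale or shear, \$R^T R \neq I\$ and a general matrix inverse would be needed instead.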

See this answer for more on how transformation matrices in games tend to be structured.


answered Jun 11 at 12:49

DMGregory ♦
