DMGregory
  • 140.8k
  • 23
  • 257
  • 401

Yes.

In games, we'll typically represent the coordinate transformation of an object (including the camera) as a 4x4 matrix - its "model matrix" or "object / local to world matrix". We can build this matrix out of columns representing the object's basis vectors and position in the world:

  • \$\vec r = \begin{bmatrix}r_x \\ r_y\\ r_z \end{bmatrix}\$ the object's "right" vector, in world space.

  • \$\vec u = \begin{bmatrix}u_x \\ u_y\\ u_z \end{bmatrix}\$ the object's "up" vector, in world space.

  • \$\vec f = \begin{bmatrix}f_x \\ f_y\\ f_z \end{bmatrix}\$ the object's "forward" vector, in world space.

  • \$\vec t = \begin{bmatrix}t_x \\ t_y\\ t_z \end{bmatrix}\$ the object's translation - the position of the object's origin / pivot, in world space.

We combine these with an extra row like so:

$$ M = \begin{bmatrix} \vec r & \vec u & \vec f & \vec t\\ 0 & 0 & 0 & 1 \end{bmatrix} $$

(Here I'm showing one popular, left-handed coordinate convention, assuming a column vector will be multiplied on the right. You'll see other conventions in use too.)

With this matrix in hand, transforming a point from an object's local space into world space is just a matrix-vector multiplication: \$\vec v_\text{world} = M \vec v_\text{local}\$

Here \$\vec v = \begin{bmatrix}x\\y\\z\\1\end{bmatrix}\$ for a point (translation applies),

...or \$ \vec v = \begin{bmatrix}x\\y\\z\\0\end{bmatrix}\$ for a direction/displacement (no translation)
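As a rough illustration of the construction above, here is a minimal sketch in plain Python. The function names `model_matrix` and `transform` and the example object (yawed 90 degrees to its right, sitting at (5, 0, 2)) are made up for this example, and it assumes the same column-vector, left-handed convention (x right, y up, z forward).

```python
def model_matrix(r, u, f, t):
    """Build a 4x4 local-to-world matrix whose columns are r, u, f, t."""
    return [
        [r[0], u[0], f[0], t[0]],
        [r[1], u[1], f[1], t[1]],
        [r[2], u[2], f[2], t[2]],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform(m, v):
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]


# Hypothetical object: yawed 90 degrees to its right, so its forward
# axis points along world +x and its right axis along world -z.
right = [0.0, 0.0, -1.0]
up = [0.0, 1.0, 0.0]
forward = [1.0, 0.0, 0.0]
position = [5.0, 0.0, 2.0]

M = model_matrix(right, up, forward, position)

# A point one unit in front of the object (w = 1, translation applies).
point_world = transform(M, [0.0, 0.0, 1.0, 1.0])

# The object's forward direction (w = 0, translation is ignored).
direction_world = transform(M, [0.0, 0.0, 1.0, 0.0])

print(point_world)      # x, y, z = 6, 0, 2 with w = 1
print(direction_world)  # x, y, z = 1, 0, 0 with w = 0
```

The point lands at the object's position plus its forward vector, while the direction comes out as the forward vector alone, since the zero in the fourth component cancels the translation column.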

The view transform, mapping points from world space into view space (the camera's local coordinate system), is just the inverse of the camera's local to world matrix:

$$ V = \left(M_\text{camera}\right)^{-1} = \begin{bmatrix} r_x & r_y & r_z & -\vec r \cdot \vec t \\ u_x & u_y & u_z & -\vec u \cdot \vec t \\ f_x & f_y & f_z & -\vec f \cdot \vec t \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

This matrix \$V\$ is the view matrix.

Here I'm assuming we haven't applied any scale or shear transformations to the camera, so \$\{\vec r, \vec u, \vec f\}\$ form a left-handed orthonormal basis (a pure rotation matrix, in this context), and we can just transpose the upper-left 3x3 block of the camera's local to world matrix to get the inverse of its rotation part. Applying that transposed block to \$-\vec t\$ then gives the dot products in the last column.
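To make the view matrix concrete, here is a small sketch in plain Python under the same assumptions (orthonormal basis, no scale or shear, column vectors). The helper names `dot`, `view_matrix` and `transform` and the camera values are invented for illustration, not taken from any particular engine.

```python
def dot(a, b):
    """Dot product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))


def view_matrix(r, u, f, t):
    """Invert a rigid camera transform: the basis vectors become rows,
    and the last column holds the negated dot products with t."""
    return [
        [r[0], r[1], r[2], -dot(r, t)],
        [u[0], u[1], u[2], -dot(u, t)],
        [f[0], f[1], f[2], -dot(f, t)],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform(m, v):
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]


# Hypothetical camera at (5, 0, 2), looking along world +x.
cam_right = [0.0, 0.0, -1.0]
cam_up = [0.0, 1.0, 0.0]
cam_forward = [1.0, 0.0, 0.0]
cam_position = [5.0, 0.0, 2.0]

V = view_matrix(cam_right, cam_up, cam_forward, cam_position)

# A world-space point one unit ahead of the camera along its forward axis.
point_view = transform(V, [6.0, 0.0, 2.0, 1.0])

# The camera's own position should map to the view-space origin.
origin_view = transform(V, cam_position + [1.0])

print(point_view)   # x, y, z = 0, 0, 1: one unit along view-space forward
print(origin_view)  # x, y, z = 0, 0, 0
```

If the camera did carry scale or shear, this shortcut would no longer hold and a general 4x4 inverse would be needed instead.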

See this answer for more on how transformation matrices in games tend to be structured.
