3

I am learning OpenGL and the tutorials (1, 2) I'm reading teach me that to scale/rotate/translate an object you have to know matrix multiplication. Why? Instead of 3x3 matrix you can use 6 floats: scale_x, scale_y, sin_a, cos_a, mov_x and mov_y. They would result in...

  • Less GPU and CPU usage (a 3x3 matrix-vector multiplication takes 9 multiplications and 6 additions, while individual variables need 6 multiplications and 4 additions). Also, if you want to move an object, instead of multiplying a whole matrix you only change the values of two floats.
  • Less memory bandwidth (1/3 cut!)
  • Easier to manipulate (at least in C, no need for matrix manipulation functions)
  • Easier information extraction to get object rotation or offset or scale

The only disadvantages that I know of are:

  • You cannot change the order of transformations; most importantly, you cannot rotate around some point other than the origin of the coordinate plane. You can simulate it, though, by altering mov_x and mov_y accordingly.
  • More code. You could put the floats into an array, but then you would lose the ability to update them one by one: you could only update all of them or the beginning of the array, not the middle or the end. I could be wrong on this, though.
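
For reference, the pivot workaround from the first bullet can be sketched in C as below. This is only an illustration under assumptions: the `Transform2D` struct and both function names are hypothetical, and the transform order (scale, then rotate, then translate) is taken from the shader in this question. Rotating about a pivot p is R(v - p) + p = Rv + (p - Rp), so the pivot folds into the offset.

```c
/* Hypothetical container for the six floats from the question. */
typedef struct {
    float scale_x, scale_y;
    float sin_a, cos_a;
    float mov_x, mov_y;
} Transform2D;

/* Applies scale, then rotation, then translation. */
void transform_apply(const Transform2D *t, float x, float y,
                     float *out_x, float *out_y)
{
    float sx = x * t->scale_x;
    float sy = y * t->scale_y;
    *out_x = sx * t->cos_a - sy * t->sin_a + t->mov_x;
    *out_y = sx * t->sin_a + sy * t->cos_a + t->mov_y;
}

/* Builds a rotation about the pivot (px, py) instead of the origin.
   The offset is p - R*p, so no extra transform step is needed. */
Transform2D rotation_about(float sin_a, float cos_a, float px, float py)
{
    Transform2D t;
    t.scale_x = 1.0f;
    t.scale_y = 1.0f;
    t.sin_a = sin_a;
    t.cos_a = cos_a;
    t.mov_x = px - (px * cos_a - py * sin_a);
    t.mov_y = py - (px * sin_a + py * cos_a);
    return t;
}
```

The pivot itself stays fixed under the resulting transform, which is a quick way to check the offset formula.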

Could someone tell me what I am missing?

Edit: added my shader below.

    #version 330

    layout (location = 0) in vec2 pos;
    layout (location = 1) in vec3 clr_in;

    // Two transforms, each stored as scale, sin/cos of the angle, and offset.
    uniform float osclx, oscly, osina, ocosa, omovx, omovy;
    uniform float msclx, mscly, msina, mcosa, mmovx, mmovy;

    float oldx, oldy, x, y;

    out vec3 clr_out;

    void main()
    {
        x = pos.x;
        y = pos.y;

        // First transform (o* uniforms): scale, rotate, translate.
        x = x * osclx;
        y = y * oscly;
        oldx = x;
        oldy = y;
        x = oldx * ocosa - oldy * osina;
        y = oldx * osina + oldy * ocosa;
        x = x + omovx;
        y = y + omovy;

        // Second transform (m* uniforms): scale, rotate, translate.
        x = x * msclx;
        y = y * mscly;
        oldx = x;
        oldy = y;
        x = oldx * mcosa - oldy * msina;
        y = oldx * msina + oldy * mcosa;
        x = x + mmovx;
        y = y + mmovy;

        gl_Position = vec4(x, y, 0.0f, 1.0f);
        clr_out = clr_in;
    }

7
  • Are you talking about 3d or 2d transformations? Your expression using scale_x, sin_a, etc. only covers scaling, translation, and rotation in a 2-d plane. It's missing a 'z' coordinate. For scaling and rotation in 3-d you indeed need a 3x3 matrix (4x4 in homogeneous coordinates once translation is included). Commented Oct 17, 2015 at 17:20
  • @CharlesE.Grant I am talking about 2D transformations, but I'm guessing that the same points would be valid under 3D circumstances. Correct me if I'm wrong. Commented Oct 17, 2015 at 17:28
  • That may be what's confusing you. The tutorials you were looking at seem to be doing 3d transforms (note the Z coordinate). For 2d transforms your expression is correct, but 2d rotation and scaling are described by 2x2 matrices (plus an offset for translation), so you end up with about the same number of arithmetic operations as with your expression. The geometry is the geometry; matrix multiplication is just a concise way to write it. Commented Oct 17, 2015 at 17:35
  • You're incorrect in thinking you can do less with non-matrix math. Matrices are ultimately just a way of ordering the required information. If you use '6 floats' you'll just end up reinventing the operations provided by the matrix model. You're forgetting about the z axis. Commented Oct 18, 2015 at 5:30
  • See how your shader does 2 transformations? If you used the idea of matrices, you could combine those 2 transformations into 1 transformation and the shader would have to do half as much work! Commented May 17, 2022 at 11:18
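
To make the point in these comments concrete, here is a minimal C sketch showing that the six floats and a 3x3 matrix describe the same transform. The type and function names (`SixFloats`, `Mat3`, and so on) are assumptions made for illustration; the scale, rotate, translate order follows the shader in the question.

```c
/* Hypothetical names; the six floats are the ones listed in the question. */
typedef struct {
    float scale_x, scale_y, sin_a, cos_a, mov_x, mov_y;
} SixFloats;

/* Row-major 3x3 matrix acting on homogeneous 2D points (x, y, 1). */
typedef struct { float m[3][3]; } Mat3;

/* Direct formulation: scale, then rotate, then translate. */
void six_floats_apply(const SixFloats *t, float x, float y, float out[2])
{
    float sx = x * t->scale_x;
    float sy = y * t->scale_y;
    out[0] = sx * t->cos_a - sy * t->sin_a + t->mov_x;
    out[1] = sx * t->sin_a + sy * t->cos_a + t->mov_y;
}

/* The same transform as translate * rotate * scale, multiplied out once. */
Mat3 six_floats_to_mat3(const SixFloats *t)
{
    Mat3 r = {{
        { t->cos_a * t->scale_x, -t->sin_a * t->scale_y, t->mov_x },
        { t->sin_a * t->scale_x,  t->cos_a * t->scale_y, t->mov_y },
        { 0.0f,                   0.0f,                  1.0f     }
    }};
    return r;
}

/* Matrix times (x, y, 1). The bottom row is skipped: it always yields 1. */
void mat3_apply(const Mat3 *a, float x, float y, float out[2])
{
    out[0] = a->m[0][0] * x + a->m[0][1] * y + a->m[0][2];
    out[1] = a->m[1][0] * x + a->m[1][1] * y + a->m[1][2];
}
```

Because the scale is folded into the matrix entries ahead of time, `mat3_apply` needs 4 multiplications and 4 additions per point, compared with 6 multiplications and 4 additions in `six_floats_apply`.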

4 Answers

5

While I think the upsides you provided can be true, there is one particular upside for matrices that I think matters a lot: almost all transforms that one wants to do in graphics - scaling, rotation, translation, etc. - can be represented as matrix multiplication. It's nice to have a common format - an interface, if you will. Additionally, compositions of operations can also be represented as a single matrix. So if you want to do a scaling, a rotation, a scaling, a translation, and a rotation - it's still just one matrix. That's a pretty big win. That's not to say the individual transform types may not be preserved "on the side" in your hierarchy of objects being rendered, etc. - being able to adjust the "mov_x" is indeed valuable - but the graphics pipeline tends to depend on matrices internally.

1
  • Transformations expressed as matrix operations can be composed. So you multiply some matrices together to get an overall transform, then multiply all your hundreds ... thousands ... whatever of points/tris by that matrix. That takes far fewer operations than doing separate transforms on each point/tri, and it also structures the code better in terms of what you're doing. Commented May 17, 2022 at 0:10
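
A minimal C sketch of that composition, under the assumption that each 2D transform is stored as the top two rows of a 3x3 affine matrix. `Affine2D`, `affine_compose`, and `affine_apply_all` are hypothetical names chosen for this example.

```c
#include <stddef.h>

/* Top two rows of a row-major 2D affine matrix:
   | a  b  tx |
   | c  d  ty |                                   */
typedef struct { float a, b, tx, c, d, ty; } Affine2D;

/* Returns the single transform equal to applying `first`, then `second`
   (the matrix product second * first). Done once per object on the CPU. */
Affine2D affine_compose(const Affine2D *second, const Affine2D *first)
{
    Affine2D r;
    r.a  = second->a * first->a  + second->b * first->c;
    r.b  = second->a * first->b  + second->b * first->d;
    r.tx = second->a * first->tx + second->b * first->ty + second->tx;
    r.c  = second->c * first->a  + second->d * first->c;
    r.d  = second->c * first->b  + second->d * first->d;
    r.ty = second->c * first->tx + second->d * first->ty + second->ty;
    return r;
}

/* Transforms n points in place: 4 multiplications and 4 additions each. */
void affine_apply_all(const Affine2D *m, float (*pts)[2], size_t n)
{
    for (size_t i = 0; i < n; i++) {
        float x = pts[i][0];
        float y = pts[i][1];
        pts[i][0] = m->a * x + m->b * y + m->tx;
        pts[i][1] = m->c * x + m->d * y + m->ty;
    }
}
```

The composition costs 12 multiplications once. After that, every point is transformed with one matrix instead of two, so the per-point work is about half of what applying both transforms in sequence would take.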
5

There're several good reasons for representing transforms as matrices:

  1. Vectors are used a lot in 3D graphics;
  2. Matrices are efficient data structures to store multiple vectors;
  3. The modern graphics pipeline is shader-based and requires the client application to send data quickly (e.g. buffers);
  4. The client can use a single function call to send the transform matrix to a single uniform in the vertex shader, while your proposed approach would require multiple independent vertex uniforms and multiple function calls from the client before reaching something like glDrawArrays or equivalent, which keeps the GPU idle for longer and makes it less efficient;
  5. The math is simpler to perform and less error prone than your proposed approach (compare your complicated shader there vs something simpler like gl_Position = projection_matrix * view_matrix * model_matrix * position_vector;)
  6. With matrices, you can accumulate transforms into a single matrix and reduce the number of math operations significantly in many cases (e.g. gl_Position = pvm_matrix * position_vector;)
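
As a rough illustration of point 4, the sketch below packs a 2D scale/rotate/translate transform into the 16-float column-major layout that `glUniformMatrix4fv` reads when its transpose argument is `GL_FALSE`. The function name `build_model_matrix` is an assumption for this example, and the upload call is shown only in a comment because it needs a live OpenGL context and a valid uniform location.

```c
/* Fills `out` with a column-major 4x4 matrix equal to
   translate * rotate * scale for a 2D transform (z is left untouched). */
void build_model_matrix(float out[16],
                        float scale_x, float scale_y,
                        float sin_a, float cos_a,
                        float mov_x, float mov_y)
{
    /* Column 0: image of the x axis. */
    out[0] = cos_a * scale_x;
    out[1] = sin_a * scale_x;
    out[2] = 0.0f;
    out[3] = 0.0f;
    /* Column 1: image of the y axis. */
    out[4] = -sin_a * scale_y;
    out[5] = cos_a * scale_y;
    out[6] = 0.0f;
    out[7] = 0.0f;
    /* Column 2: z axis, unchanged. */
    out[8]  = 0.0f;
    out[9]  = 0.0f;
    out[10] = 1.0f;
    out[11] = 0.0f;
    /* Column 3: translation. */
    out[12] = mov_x;
    out[13] = mov_y;
    out[14] = 0.0f;
    out[15] = 1.0f;
}

/* With an OpenGL context and a uniform location from glGetUniformLocation,
   the whole transform is then uploaded with one call:
       glUniformMatrix4fv(location, 1, GL_FALSE, matrix);                  */
```

In the vertex shader the uniform would be declared as a `mat4` and applied with a single multiplication, as in the expressions in points 5 and 6.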

Ultimately, I think it's fallacious to think that you're being more efficient. The math operations to translate/rotate/scale will be the same regardless of how you look at it, and your code looks significantly more complex than it needs to be. Why should you (or worse, those maintaining your code) have to keep track of 16 uniforms for a 4x4 matrix (and make OpenGL calls to send data for each one) when you could keep track of just 1 uniform and make a single OpenGL call per matrix? I wouldn't want to work on that kind of code.

Also, a lot of the efficiency in 3D graphics comes from the GPU's parallelism and its ability to consume data quickly. The hardware is already optimized to handle certain common cases, and using matrices as a data structure to store transformations is so common that they have their own data types built into GLSL (e.g. mat4). It's like asking "Why are 3D models always built using triangles when I could use _____ geometric shape instead?". Because there're many advantages to using triangles[1] and the hardware is already optimized for them.

[1] For example, the three vertices of a triangle are always coplanar, which lets some graphics algorithms (e.g. clipping) be simpler and not worry about corner cases.

0
  • Simplicity. Lots of code is tedious and writing shaders requires flexibility. I have done the optimisation you have just shown in the past and all it did was hurt me and give me headaches.

  • Parallelism. I'm not saying the GPU does it, but it definitely gives an opportunity to optimise it. With all these instructions the GPU has no idea what you're doing and will never be able to do any optimisation on its own.

Keep in mind the GPU is a number-crunching beast designed for lazily-written AAA engines, and is rigged against such homemade optimisations. Changing that is not yet on the agenda of the manufacturers, or of anyone else, because the technology is still evolving and the standards are constantly rewritten.

Just dump in the matrix and let the machine cheat a bit for you.

0

You think matrix operations would take more instructions. Think again. All these matrix operations go through an optimising compiler (often through Clang at highest optimisation levels), and wherever matrix entries are known at compile time, that compiler will know exactly which ones are zero, one or minus one, or some small constant, and will produce optimal code with the minimum number of operations.

And the compiler will know the architecture of your graphics card inside and out and will produce optimal code for it, most likely better than anything you could write by hand.