I will also add my two cents to this, since a lot of the answers contain confusing or simply wrong information or do not go into detail on where the confusion is coming from.
Looking at memory
Let's start by ignoring row-major and column-major (and any mentions of rows/columns in the docs) and we will focus on how the GPU interprets a mat4 when it comes to multiplication by default in GLSL. We will use standard mathematical conventions.
First, in memory, the 16 entries of the matrix are laid out linearly:
[x0 x1 x2 x3 y0 y1 y2 y3 z0 z1 z2 z3 w0 w1 w2 w3]
Conceptually, this is an array of vec4, i.e.
[[x0 x1 x2 x3][y0 y1 y2 y3][z0 z1 z2 z3][w0 w1 w2 w3]]
We know that this is the case from the respective sections of the specification about memory layout, alignment, etc. This means that if we index the matrix, we get these vectors as a result, i.e.
m[0]=[x0 x1 x2 x3]
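To make the indexing concrete, here is a minimal sketch in the same style as the spec excerpt below (the component names refer to the layout above):

mat4 m;
vec4 c;
float f;
c = m[0];    // the first vec4 in memory: [x0 x1 x2 x3]
f = m[2][1]; // second component of the third stored vec4, i.e. z1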
The GLSL 4.60 specification states in section 5.10:
vec3 v, u;
mat3 m;
u = v * m;
will be equivalent to
u.x = dot(v, m[0]); // m[0] is the left column of m
u.y = dot(v, m[1]); // dot(a,b) is the inner (dot) product of a and b
u.z = dot(v, m[2]);
(v is multiplied from the left, i.e. treated as a row vector)
and
u = m * v;
is equivalent to
u.x = m[0].x * v.x + m[1].x * v.y + m[2].x * v.z;
u.y = m[0].y * v.x + m[1].y * v.y + m[2].y * v.z;
u.z = m[0].z * v.x + m[1].z * v.y + m[2].z * v.z;
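In other words, m * v is the linear combination of the stored vec4s (the columns), weighted by the components of v. The expansion above can be restated in one line of GLSL:

u = v.x * m[0] + v.y * m[1] + v.z * m[2];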
The only possible conclusion from this is that in typical mathematical notation, the matrix looks like this:
x0 y0 z0 w0
x1 y1 z1 w1
x2 y2 z2 w2
x3 y3 z3 w3
Here, standard matrix multiplication ("row times column") is assumed. Of course, the documentation also states throughout that the matrix is stored column by column and that m[i] accesses the i-th column of a matrix, so we can conclude that the OpenGL docs follow standard mathematical notation even if we initially mistrust them.
Thus, by default, GLSL treats every single matrix as column-major. As far as I am aware, HLSL uses row-major by default. Newer versions of GLSL support layout qualifiers that let you explicitly mark a matrix as row-major, which causes it to be interpreted accordingly.
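Note that row_major is a qualifier on uniform/buffer blocks or their members, not on plain uniforms. A minimal sketch (the block and member names are made up for illustration):

layout(std140, row_major) uniform Transforms {
    mat4 u_model; // the 16 floats in memory are now read as rows
};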
Concrete example: Translation matrix
Now to the example using a translation matrix. Mathematically it looks like this:
1 0 0 Tx
0 1 0 Ty
0 0 1 Tz
0 0 0 1
This means that if you want to perform a translation using a 4x4 matrix in a GLSL shader, the matrix has to have the linear layout
[1 0 0 0 0 1 0 0 0 0 1 0 Tx Ty Tz 1]
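As a minimal GLSL sketch (the helper name translation is made up): the mat4 constructor consumes its scalar arguments column by column, which reproduces exactly this linear layout:

mat4 translation(vec3 t) {
    return mat4(
        1.0, 0.0, 0.0, 0.0, // column 0
        0.0, 1.0, 0.0, 0.0, // column 1
        0.0, 0.0, 1.0, 0.0, // column 2
        t.x, t.y, t.z, 1.0  // column 3: the translation
    );
}
// translation(t) * vec4(p, 1.0) == vec4(p + t, 1.0)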
I think this is pretty obvious so far; we now understand how everything works in the column-major world.
In the column-major world we usually write M * v, whereas row-major conventions often use v * M.
In the v * M case, GLSL implicitly treats v as a row vector, and we established above that the result consists of dot products with the columns of M. This means that M has to be transposed for the calculation to be correct, which is probably obvious to anyone familiar with matrix multiplication. Note that "reading" a row-major matrix as column-major (and vice versa) IS a transposition, because what you treat as columns are actually the rows.
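In GLSL terms, the following two expressions therefore produce the same result for any matrix and vector (the identity v * M = transpose(M) * v):

vec4 v, a, b;
mat4 M;
a = v * M;
b = transpose(M) * v; // a == b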
Thus, as far as OpenGL is concerned, if you have row-major matrices client-side, you have the following options (a shader-side sketch follows the list):
Transpose them before upload and use M * v (e.g. via the transpose argument of glUniformMatrix)
Transpose them in shader as transpose(M) * v
Mark them using layout(row_major) and use M * v
Upload them as-is and use v * M (and reverse the multiplication order of matrices in the shader). They'll be read as column-major, which transposes them and makes this calculation correct.
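A minimal shader-side sketch of options 2 and 4 (option 1 happens client-side through the transpose parameter of glUniformMatrix4fv, and option 3 was sketched above; the names M and position are placeholders):

in vec3 position;
uniform mat4 M; // the uploaded bytes are row-major client-side

void main() {
    // Option 2: transpose in the shader, then left-multiply.
    vec4 p2 = transpose(M) * vec4(position, 1.0);
    // Option 4: multiply with the vector on the left; reading the
    // row-major bytes as column-major is already the transposition.
    vec4 p4 = vec4(position, 1.0) * M;
    gl_Position = p2; // identical to p4
}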
The case in the question
OP was using row-major matrices client-side and was confused about why the transpose argument had to be GL_FALSE for the transformations to work correctly. Assuming OP described the memory layout faithfully, my conclusion is that OP used v * M in the shader, which would make this correct. Alternatively, it is quite possible that OP was simply confused about the layout used by the library.
Nowadays, the docs for the transpose argument read
Specifies whether to transpose the matrix as the values are loaded into the uniform variable. Must be GL_FALSE.
That wording makes no statement about row-major or column-major. This is most likely due to the addition of layout(row_major), after which the old definition quoted in OP's post would no longer make sense.
Why everyone is confused
In the "row-major" world (i.e. HLSL), people have become used to the convention of using v * M instead of M * v. If the shader would read the row-major matrix as column-major, the fourth bullet point from above would apply and everything would be fine.
However, HLSL assumes row-major layout by default. If we want to calculate v * M with row-major matrices that are also read as row-major in the shader, this can only work if M is already transposed.
For this reason, you'll see a translation-matrix on the client-side defined as:
1,  0,  0,  0,
0,  1,  0,  0,
0,  0,  1,  0,
Tx, Ty, Tz, 1

Linear: [1 0 0 0 0 1 0 0 0 0 1 0 Tx Ty Tz 1]
Thus the matrix has the same linear layout as a column-major translation matrix, because in row-major terms this is a transposed translation matrix (the translation sits in the last row).
And that is why this topic is so confusing. For some reason, some people decided this would be a good convention to have: one that basically hides the fact that behind all of those variable names for your fancy transformation matrices, you actually have transposed matrices. That is of course also why, in this type of "framework", matrix multiplication has to happen in reverse order both in the shader AND client-side.
TL;DR
What matters is how the linear memory is interpreted. GLSL default is column-major, HLSL default is row-major. Modern shader languages let you specify it.
If the client-side layout does not match what the shader thinks the layout is, your matrix will behave as a transposed matrix.
If you have client-side row-major matrices in GLSL, your options are
Transpose them before upload and use M * v (e.g. via the transpose argument of glUniformMatrix)
Transpose them manually in shader as transpose(M) * v
Mark them using layout(row_major) and use M * v
Upload them as-is and use v * M (and reverse the multiplication order of matrices in the shader). They'll be read as column-major, which transposes them and makes this calculation correct.
In HLSL, people for some reason like to use v * M with row-major matrices on both the client side and the shader side, which means they implicitly create all their matrices in transposed form. They also have to reverse the multiplication order because of this. Another consequence is that these row-major matrices have the same memory layout as a non-transposed column-major matrix, which is why some people made that claim in their answers.
GL_FALSE?" How are you using the uniform?